Unexpected change of input_ids during generation of several samples with transformers #1048

RobinPicard commented Jul 18, 2024

Describe the issue as clearly as possible:

Encountered while working on PR #531.

When generating several samples for a prompt with the transformers model, input_ids gets modified during generation without affecting the final result. This causes problems in the FSMLogitsProcessor.process_logits method because the keys of self._fsm_states depend on the values of input_ids. The example below makes this easier to see.
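
To make the dependence concrete, here is a minimal sketch (hypothetical code, not the actual FSMLogitsProcessor) of state tracking keyed by the token sequence. If the framework later presents a prefix that was never processed, the lookup misses:

fsm_states = {}  # hash(tuple(token_ids)) -> FSM state

def lookup_state(token_ids):
    # The processor can only advance from a state it has already recorded
    # for this exact prefix of token ids
    return fsm_states.get(hash(tuple(token_ids)))

# The prefix actually sampled for the 2nd beam gets recorded...
fsm_states[hash((17, 18, 19, 320))] = 5

# ...but at a later step transformers presents a different prefix
print(lookup_state([17, 18, 19, 491]))  # None: no state for this sequence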

I suppose it's related to "Ensure FSMLogitsProcessor allows unstable sequence ordering (beam search in transformers and vLLM change the order of sequences)" mentioned in this commit by @lapp0, but here it's not just a change of order between the 1st and the 2nd sequence.

Steps/code to reproduce the bug:

import outlines.models as models
import outlines.generate as generate
import outlines.samplers as samplers

# Tiny random GPT-2 on CPU; beam search with 2 beams is enough to trigger the issue
model = models.transformers("hf-internal-testing/tiny-random-gpt2", device="cpu")
generator = generate.regex(model, r"([a-z]{3})@", sampler=samplers.beam_search(2))
output = generator(["123"], max_tokens=40)
print(output)

At the beginning of OutlinesLogitsProcessor.__call__, add:

# Print both beams' token ids and their decoded text at each step
print(input_ids[0])
print(input_ids[1])
print(self.tokenizer.decode(input_ids[0]))
print(self.tokenizer.decode(input_ids[1]))
print('')
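
To confirm programmatically that a sequence stops extending its own prefix, the same hook can carry a small check (a hypothetical helper, not part of outlines): every sequence at step n should extend some sequence from step n-1 by exactly one token.

previous_sequences = set()

def check_prefixes(input_ids):
    # Flag any sequence whose prefix was not among last step's sequences
    global previous_sequences
    current = {tuple(seq.tolist()) for seq in input_ids}
    for seq in current:
        if previous_sequences and seq[:-1] not in previous_sequences:
            print(f"unexpected prefix change: {seq}")
    previous_sequences = current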

Expected result:

Token 320 should not turn into 491 (and 'ers' into 'age') at the 4th step of the generation for the 2nd sample.

Error message:

tensor([17, 18, 19])
tensor([17, 18, 19])
['1', '2', '3']
['1', '2', '3']

tensor([ 17,  18,  19, 491])
tensor([ 17,  18,  19, 320])
['1', '2', '3', 'age']
['1', '2', '3', 'ers']

tensor([ 17,  18,  19, 491,  32])
tensor([ 17,  18,  19, 320,  32])
['1', '2', '3', 'age', '@']
['1', '2', '3', 'ers', '@']

tensor([ 17,  18,  19, 491,  32,   2])
tensor([ 17,  18,  19, 491,  32,   1])
['1', '2', '3', 'age', '@', '"']
['1', '2', '3', 'age', '@', '!']

[['age@', 'ers@']]

Outlines/Python version information:

Current main branch (latest commit 9ce0df3)

Context for the issue:

No response

lapp0 commented Jul 19, 2024

I've been looking into this issue.

I'm not sure why, but it appears that transformers' beam search calls the logits processor, for the last token only, with a sequence it didn't actually sample, then throws that sequence away without affecting generation. Is there something I'm missing?
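
If that is what transformers does, one possible mitigation (a sketch, assuming a guide with start state 0 and a get_next_state(state, token_id) transition, similar to outlines' RegexGuide) is to recompute the FSM state by replaying the generated tokens whenever a sequence's cached state is missing, which stays correct even if beam search reorders or swaps sequences:

def state_for_sequence(guide, generated_token_ids, cache):
    # Replay the generated tokens through the guide when no cached state
    # exists for this exact sequence
    key = hash(tuple(generated_token_ids))
    if key not in cache:
        state = 0  # assumed start state
        for token_id in generated_token_ids:
            state = guide.get_next_state(state, token_id)
        cache[key] = state
    return cache[key]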

Here's an example with a modified pattern requiring three @s:

['1', '2', '3']
['1', '2', '3']
['1', '2', '3', 'age']
['1', '2', '3', 'ers']
['1', '2', '3', 'age', '@']
['1', '2', '3', 'ers', '@']
['1', '2', '3', 'age', '@', '@']
['1', '2', '3', 'ers', '@', '@']
['1', '2', '3', 'age', '@', '@', '@']
['1', '2', '3', 'ers', '@', '@', '@']
['1', '2', '3', 'age', '@', '@', '@', '!']
['1', '2', '3', 'age', '@', '@', '@', '#']
age@@@
ers@@@

With 4 samples:

['1', '2', '3', 'age', '@', '@', '@']
['1', '2', '3', 'ag', 'n', '@', '@']
['1', '2', '3', 'ag', 'c', '@', '@']
['1', '2', '3', 'ag', 'a', '@', '@']
['1', '2', '3', 'ag', 'n', '@', '@', '@']
['1', '2', '3', 'ag', 'c', '@', '@', '@']
['1', '2', '3', 'ag', 'a', '@', '@', '@']
['1', '2', '3', 'age', '@', '@', '@', '#']
['1', '2', '3', 'ag', 'n', '@', '@', '@', "'"]
['1', '2', '3', 'ag', 'n', '@', '@', '@', '#']
['1', '2', '3', 'ag', 'n', '@', '@', '@', '!']
['1', '2', '3', 'ag', 'n', '@', '@', '@', '$']
age@@@
agn@@@
agc@@@
aga@@@
