First readme.md example fails #465

jloganolson · 2023-12-16T15:50:52Z

Describe the issue as clearly as possible:

If you change the one-line review to "This restaurant stinks", you still get a "Positive" answer back.

Steps/code to reproduce the bug:

import outlines
model_path = "../Mistral-7B-v0.1/"
model = outlines.models.transformers(model_path)

prompt = """You are a sentiment-labelling assistant.
Is the following review positive or negative?

Review: This restaurant is just awesome!
"""
answer = outlines.generate.choice(model, ["Positive", "Negative"])(prompt)
print(answer)  # returns "Positive" (correct)

prompt = """You are a sentiment-labelling assistant.
Is the following review positive or negative?

Review: This restaurant stinks.
"""
answer = outlines.generate.choice(model, ["Positive", "Negative"])(prompt)
print(answer)  # returns "Positive" (incorrect; expected "Negative")

Expected result:

The second call should print "Negative".

Error message:

No response

Outlines/Python version information:

outlines.__version__ fails, but pip reports the installed version as 0.0.16.

Context for the issue:

No response

rlouf · 2023-12-16T16:02:19Z

Maintainer

Doesn't seem to be an issue with Outlines. What's the output without guided generation?

jloganolson · 2023-12-17T14:32:41Z

jloganolson
Dec 17, 2023
Author

TL;DR of the long answer below: Outlines' choice doesn't always seem to match the logprobs?

The unguided output starts with “The”, but perhaps more relevant are the logprobs of “Positive” vs. “Negative” with the example prompt: they were inverted, with the positive-seeming review getting a higher probability for “Negative” as the next token. I fixed that with prompt engineering (I just had to include some few-shot examples), but given the mismatch between the logprobs and the Outlines output, I’m still wondering what’s going on…

My assumption was that Outlines’ choice should consistently follow the logprobs, but maybe there’s a temperature setting in Outlines that isn’t set to 0, or maybe my whole mental model is incorrect? Also, is there a way to print the logprobs in this situation for debugging purposes?
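A minimal, library-independent sketch of how temperature relates logits to sampling probabilities may help frame the question. The logit values and the function name softmax_with_temperature are hypothetical, and nothing here shows how Outlines is implemented; it only illustrates why a sampled answer can differ from the highest-logprob choice unless the temperature is close to 0.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token logits for the two allowed labels (not from a real model).
labels = ["Positive", "Negative"]
logits = [1.2, 1.6]

for t in (1.0, 0.1):
    probs = softmax_with_temperature(logits, t)
    print(t, dict(zip(labels, (round(p, 3) for p in probs))))
```

With these made-up numbers, the lower-scoring label keeps roughly 40% of the probability mass at temperature 1, so it would still be sampled often; at temperature 0.1 its share drops below 2%.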

As for the prompt itself, I imagine others will use the first example as a jumping-off point, as I did, so I would suggest updating it to something whose choices have a clear disparity in probability and align with a reasonable person’s expectations, e.g. “My dog is named [Fido, Steve]”.

jloganolson · 2023-12-17T14:33:15Z

jloganolson
Dec 17, 2023
Author

(Accidentally closed when responding above)

rlouf · 2023-12-17T18:03:37Z

rlouf
Dec 17, 2023
Maintainer

That's probably an artifact of multinomial sampling; if you took enough samples, "Negative" would be better represented. I assume greedy sampling would give the right answer in this case.
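The difference between the two strategies can be sketched with the standard library alone. The probabilities below are hypothetical (not measured from Mistral-7B), and the helper names are made up for illustration; this is a simulation of the idea, not Outlines code.

```python
import random
from collections import Counter

# Hypothetical probabilities for the two allowed choices.
choice_probs = {"Positive": 0.4, "Negative": 0.6}

def sample_multinomial(probs, rng):
    """Draw one choice at random, weighted by its probability."""
    choices = list(probs)
    weights = [probs[c] for c in choices]
    return rng.choices(choices, weights=weights, k=1)[0]

def sample_greedy(probs):
    """Always return the single most probable choice."""
    return max(probs, key=probs.get)

rng = random.Random(0)  # fixed seed so the run is repeatable
counts = Counter(sample_multinomial(choice_probs, rng) for _ in range(10_000))

print("multinomial, 10,000 draws:", dict(counts))
print("greedy:", sample_greedy(choice_probs))
```

Any single multinomial draw can return the less likely label, which is consistent with seeing "Positive" on one call; only the aggregate counts reflect the underlying probabilities, while the greedy pick is deterministic.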

3 replies

rlouf Dec 22, 2023
Maintainer

Try greedy sampling: add from outlines.generate.samplers import greedy, then pass sampler=greedy when initializing the generator. It will give you the answer with the highest probability.

jloganolson Dec 22, 2023
Author

Thank you for following up on this! I'll try it out when I'm back after the holidays.

rlouf Jan 4, 2024
Maintainer

This is related to #486. For choice we may want to return the probability distribution rather than sample a result.
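As a rough illustration of what returning a distribution over choices could look like, here is a minimal sketch. The function name choice_distribution, the per-token logprob values, and the decision to score a multi-token choice by summing its token logprobs are all assumptions made for this example; it is not necessarily what #486 proposes and it is not an existing Outlines API.

```python
import math

def choice_distribution(choice_logprobs):
    """Normalize per-choice log-probabilities into a probability distribution."""
    # Log-sum-exp with the maximum subtracted for numerical stability.
    peak = max(choice_logprobs.values())
    log_total = peak + math.log(
        sum(math.exp(lp - peak) for lp in choice_logprobs.values())
    )
    return {choice: math.exp(lp - log_total) for choice, lp in choice_logprobs.items()}

# Hypothetical per-token logprobs for each choice (not from a real model).
token_logprobs = {
    "Positive": [-1.1, -0.2],
    "Negative": [-0.7, -0.1],
}
# Assumption: a choice's score is the sum of its token logprobs.
choice_logprobs = {c: sum(lps) for c, lps in token_logprobs.items()}

distribution = choice_distribution(choice_logprobs)
print(distribution)
```

A caller could then inspect the full distribution, take its argmax, or sample from it, instead of receiving a single sampled label.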
