
Prerelease 0.2.0rc1 no longer returns log probs with llama.cpp #1064

Open

oj-sec opened this issue Oct 26, 2024 · 1 comment
Comments


oj-sec commented Oct 26, 2024

The bug
Updating from guidance==0.1.16 to the prerelease guidance==0.2.0rc1 causes model.log_prob() to return 0 rather than the true log prob of a generation when using the llama.cpp backend. I have tested GGUF quants of models based on Llama, Mistral, and Gemma, and the behaviour is model-agnostic.

To Reproduce
A reproduction Colab notebook is here; it involves uninstalling and reinstalling Guidance, but the change in output between installs is:

# With guidance==0.1.16
from guidance import models, gen, select
import math

# `model` holds the GGUF filename used elsewhere in the notebook
llm = models.LlamaCpp(f"./models/{model}", n_gpu_layers=40, n_ctx=2000, compute_log_probs=True)
output = llm + "You flip a coin. The result is: " + gen(name="coinflip", regex="(heads|tails)")
logprobs = output.log_prob("coinflip")
prob = round(math.exp(logprobs), 5)
print(f"Output:{output['coinflip']}\nLP: {logprobs}\nP: {prob}")

# Streamed generation: You flip a coin. The result is: heads
# Output:heads
# LP: -1.1534752799652015
# P: 0.31554
# With guidance==0.2.0rc1
from guidance import models, gen, select
import math

# `model` holds the GGUF filename used elsewhere in the notebook
llm = models.LlamaCpp(f"./models/{model}", n_gpu_layers=40, n_ctx=2000, compute_log_probs=True)
output = llm + "You flip a coin. The result is: " + gen(name="coinflip", regex="(heads|tails)")
logprobs = output.log_prob("coinflip")
prob = round(math.exp(logprobs), 5)
print(f"Output:{output['coinflip']}\nLP: {logprobs}\nP: {prob}")

# Streamed generation: You flip a coin. The result is: heads
# Output:heads
# LP: 0.0
# P: 1.0
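For reference, the 0.1.16 figure can be sanity-checked independently of guidance by scoring each candidate completion directly with llama-cpp-python. The sketch below is an illustration, not part of the original repro: the model path is a placeholder, it requires logits_all=True, and it assumes the prompt's tokens are a prefix of the full sequence's tokenization (true for this prompt, but not guaranteed in general).

import math

import numpy as np
from llama_cpp import Llama

llm = Llama("./models/model.gguf", n_ctx=2000, logits_all=True)  # placeholder path

prompt = "You flip a coin. The result is: "
for candidate in ("heads", "tails"):
    prompt_tokens = llm.tokenize(prompt.encode("utf-8"))
    full_tokens = llm.tokenize((prompt + candidate).encode("utf-8"))
    llm.reset()
    llm.eval(full_tokens)
    # Sum log p(token | prefix) over the candidate's tokens only.
    logprob = 0.0
    for pos in range(len(prompt_tokens), len(full_tokens)):
        # scores[pos - 1] holds the logits predicting the token at `pos`.
        row = np.asarray(llm.scores[pos - 1], dtype=np.float64)
        row -= row.max()  # numerically stable log-softmax
        logprob += row[full_tokens[pos]] - np.log(np.exp(row).sum())
    print(candidate, logprob, round(math.exp(logprob), 5))

A direct computation in this vein should land near the LP of -1.1534... that 0.1.16 reports, which is what makes the flat 0.0 (i.e. P = 1.0) from 0.2.0rc1 stand out as a placeholder rather than a real probability.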

System info (please complete the following information):

If I can provide any further info, please let me know. Huge thanks for this amazing library.

@hudson-ai (Collaborator)

Hi @oj-sec, thanks for bringing this up, and our apologies if it's impacting your workflow. The new parser we're using in 0.2.0rc1 is considerably faster than the one in previous versions, but it currently has a few limitations that we still need to work through (probability outputs are probably the primary one). So, it's on our radar.

Thank you for submitting the issue!
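In the meantime, one practical consequence worth flagging: the placeholder value is exactly 0.0, which math.exp() turns into an apparent probability of 1.0. A small guard around the issue's own repro (a sketch, not an official guidance API) can keep that from being mistaken for real certainty:

# Hedged guard: 0.2.0rc1 returns exactly 0.0 as a placeholder, which
# would otherwise read as P = 1.0 for the generated option.
lp = output.log_prob("coinflip")
if lp == 0.0:
    print("log_prob returned 0.0; this build may not compute probabilities")
else:
    print(f"P = {math.exp(lp):.5f}")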
