Expose raw logit weights to the API if request_logits=true #7

Open
wants to merge 1 commit into
base: main

@pathorn pathorn commented Nov 28, 2023

What does this PR do?

Adds a new boolean input parameter, return_logits. If true, each Generation will include, inside each Token's details, the logits as a base64-encoded binary float32 array.
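
For anyone writing a client against this, here is a minimal sketch of the float32 packing and unpacking step. It assumes little-endian byte order, which the description does not state, and it leaves out the base64 layer. The names `logits_to_bytes` and `bytes_to_logits` are hypothetical and not part of the PR.

```rust
/// Serialize logits as a packed float32 buffer.
/// Little-endian byte order is an assumption; the PR does not specify it.
fn logits_to_bytes(logits: &[f32]) -> Vec<u8> {
    logits.iter().flat_map(|x| x.to_le_bytes()).collect()
}

/// Parse a packed float32 buffer back into logits.
/// Returns None if the length is not a multiple of 4 bytes.
fn bytes_to_logits(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

A client would base64-decode the field first, then pass the resulting bytes to something like `bytes_to_logits`; index `i` of the result would be the logit for token id `i`, assuming the array covers the full vocabulary in order.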

I have tested performance and this extra output data does not add any measurable performance overhead.

It would be useful to have a test application on top of this API to demonstrate its usefulness.

@NikolaBorisov NikolaBorisov left a comment

I was thinking we could have options to return only the top 100 or top 1000 tokens, in addition to returning all of them. Right now, returning 32000 * 32 bits is about 1 Mbit (128 KB) per token.
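
To make the size trade-off concrete, here is a rough sketch of the arithmetic and of a top-k selection. It reads the figure above as 32000 float32 values of 32 bits each, which is an assumption. The function names are hypothetical, and returning `(token_id, logit)` pairs is only one possible shape for a top-k response.

```rust
/// Size in bytes of one token's full logit vector as packed float32.
fn raw_logits_bytes(vocab_size: usize) -> usize {
    vocab_size * std::mem::size_of::<f32>()
}

/// Length of padded base64 output for `n` input bytes: 4 * ceil(n / 3).
fn base64_len(n: usize) -> usize {
    (n + 2) / 3 * 4
}

/// Return the `k` largest logits as (token_id, logit), highest first.
fn top_k_logits(logits: &[f32], k: usize) -> Vec<(u32, f32)> {
    let mut indexed: Vec<(u32, f32)> = logits
        .iter()
        .enumerate()
        .map(|(i, &x)| (i as u32, x))
        .collect();
    // total_cmp is a total order over f32 (NaN included), so the sort is well defined.
    indexed.sort_by(|a, b| b.1.total_cmp(&a.1));
    indexed.truncate(k);
    indexed
}
```

Under these assumptions a 32000-entry vocabulary is 128,000 bytes per token raw and 170,668 characters once base64-encoded with padding, while a top-100 list of (u32, f32) pairs would be 800 bytes before encoding.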

Comment on lines +188 to +189
/// Logit tokens
optional string logit_tokens = 3;

Here you have string, but bytes above. I think you want bytes?
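
A small sketch of why the distinction matters, assuming the field is meant to carry raw float32 bytes (the diff excerpt does not confirm that): a protobuf `string` must be valid UTF-8, and packed floats generally are not. The helper names are hypothetical and little-endian packing is an assumption.

```rust
/// Pack float32 logits into raw bytes (little-endian is an assumption here).
fn pack(logits: &[f32]) -> Vec<u8> {
    logits.iter().flat_map(|x| x.to_le_bytes()).collect()
}

/// A protobuf `string` field must hold valid UTF-8.
/// This checks whether a payload would qualify.
fn fits_string_field(payload: &[u8]) -> bool {
    std::str::from_utf8(payload).is_ok()
}
```

For example, `1.0f32` packs to the bytes `00 00 80 3f`, and a lone `0x80` is not valid UTF-8, so `fits_string_field(&pack(&[1.0]))` is false. Text that is already base64-encoded is plain ASCII and would pass.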

@@ -159,6 +159,7 @@ async fn generate(
    if req.0.parameters.return_full_text.unwrap_or(false) {
        add_prompt = Some(req.0.inputs.clone());
    }
    // let return_logits = req.0.parameters.return_logits;

delete?

@@ -521,6 +525,7 @@ fn send_responses(
    text: generation.token_text,
    logprob: generation.token_logprob,
    special: generation.token_is_special,
    logits: if let Some(binfloats) = generation.logit_binary { Some(general_purpose::URL_SAFE.encode(binfloats)) } else { None },

maybe not inline this?
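
One possible way to pull this out of the struct literal is a small helper built on `Option::map`. This is a sketch only: `encode_logits` and `hex` are hypothetical names, and the encoder is passed in as a parameter so the example compiles without the base64 crate. In the PR the encoder would presumably be a closure wrapping `general_purpose::URL_SAFE.encode`.

```rust
/// Apply an encoder to optional binary logits.
/// `encode` stands in for the base64 call used in the diff.
fn encode_logits<F>(logit_binary: Option<Vec<u8>>, encode: F) -> Option<String>
where
    F: Fn(&[u8]) -> String,
{
    // Option::map is equivalent to
    // `if let Some(x) = opt { Some(f(x)) } else { None }`.
    logit_binary.map(|bin| encode(&bin))
}

/// Stand-in encoder (lowercase hex), used only to exercise the helper.
fn hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}
```

With a helper like this the field initializer would shrink to a single call, and the `None` case needs no explicit branch.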

2 participants