System Info
I used the 3.0.2 official Docker image to load a local Llama 3 Instruct model.
Information
Tasks
Reproduction
I used the 3.0.2 official Docker image to load a local Llama 3 Instruct model, and used InferenceClient to call it (see some interaction here):

output = client.text_generation("Today is a ", max_new_tokens=2, do_sample=True, temperature=1.0, details=True, decoder_input_details=True)

The output is below; prefill is empty:

TextGenerationOutput(generated_text='5-minute', details=TextGenerationOutputDetails(finish_reason='length', generated_tokens=2, prefill=[], tokens=[TextGenerationOutputToken(id=20, logprob=-2.1425781, special=False, text='5'), TextGenerationOutputToken(id=24401, logprob=-4.4609375, special=False, text='-minute')], best_of_sequences=None, seed=9305067545921572115, top_tokens=None))
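For completeness, a minimal self-contained script that reproduces the call is sketched below; the endpoint URL and the way the client is constructed are assumptions on my part, since they are not shown above.

from huggingface_hub import InferenceClient

# Assumption: TGI 3.0.2 is serving the local Llama 3 Instruct model on port 8080.
client = InferenceClient("http://localhost:8080")

output = client.text_generation(
    "Today is a ",
    max_new_tokens=2,
    do_sample=True,
    temperature=1.0,
    details=True,
    decoder_input_details=True,  # requests prompt-token details in details.prefill
)

print(output.details.prefill)  # observed: [] instead of the prompt tokens with logprobs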
Expected behavior
I expect prefill to include the prompt tokens along with their logprobs, as shown in the doc here.
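Put differently, with decoder_input_details=True I would expect details.prefill to come back as a non-empty list of prompt tokens, readable roughly like this (a sketch of the expected shape, using the attribute names from huggingface_hub's output types, not actual output):

# Expected shape only; values depend on the server and are not real output.
for tok in output.details.prefill:
    # each prompt token should carry its id, text, and logprob
    print(tok.id, repr(tok.text), tok.logprob)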