Prompt Logprobs via echo=True #82
-
I'm looking for an inference server that can generate the logprobs for the input prompt. For older OpenAI models, such as davinci, this was possible by querying the server with logprobs set and echo=True. It has since been deprecated, but I believe this is an important capability of LLMs. This feature is available via vLLM but not via llama.cpp; however, I'm looking for something that runs on macOS. Does this feature exist in optillm?
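For context, here is a minimal sketch of the legacy call being described. The model name (`davinci-002`) and whether it still accepts `echo=True` together with `logprobs` are assumptions; as noted above, OpenAI has deprecated this combination:

```python
from openai import OpenAI

client = OpenAI()

# Score the prompt itself: max_tokens=0 generates nothing, echo=True
# returns the prompt tokens, and logprobs attaches per-token logprobs.
response = client.completions.create(
    model="davinci-002",  # assumed stand-in for the older davinci models
    prompt="The capital of France is Paris.",
    max_tokens=0,
    echo=True,
    logprobs=1,
)

# First entry is None since the first token has no preceding context.
print(response.choices[0].logprobs.token_logprobs)
```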
-
@ciaran-regan-ie logprobs are still available for newer OpenAI models as well (gpt-4o-mini and gpt-4o); just set `logprobs=True` (and optionally `top_logprobs`) in the request.
Are you looking for a solution like vLLM that can give you these for, say, any model loaded from Hugging Face? It is actually supported in llama.cpp as well. For macOS, it is supported in mlx-server (see https://github.com/ml-explore/mlx-examples/blob/8fe9539af76075405b2c3071ba9657aa921d749d/llms/mlx_lm/SERVER.md#request-fields).
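To illustrate the macOS route, here is a hedged sketch of querying the mlx_lm server; the port, model name, and exact `logprobs` semantics are taken from the linked SERVER.md plus assumptions about a local setup:

```python
import requests

# Assumes the mlx_lm server is running locally, e.g.:
#   python -m mlx_lm.server --model mlx-community/Llama-3.2-1B-Instruct-4bit
# It listens on port 8080 by default.
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 16,
        "temperature": 0.2,
        # Per the linked SERVER.md, logprobs is an integer giving the
        # number of top tokens to return log probabilities for.
        "logprobs": 3,
    },
)
print(resp.json())
```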
-
Logprobs are now directly supported in optillm with the new v0.0.10 release (#90).

```python
from openai import OpenAI

# The base_url and api_key are placeholders; point the client at your
# running optillm proxy.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="optillm")

messages = [{"role": "user", "content": "What is the capital of France?"}]

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-1B-Instruct",
    messages=messages,
    temperature=0.2,
    logprobs=True,
    top_logprobs=3,
)
```
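As a follow-up, here is a sketch of reading the values back, assuming optillm mirrors the OpenAI response shape (which its use of the OpenAI client suggests):

```python
# Each entry covers one generated token; top_logprobs holds the three
# most likely alternatives requested via top_logprobs=3 above.
for token_info in response.choices[0].logprobs.content:
    print(token_info.token, token_info.logprob)
    for alt in token_info.top_logprobs:
        print("  alt:", alt.token, alt.logprob)
```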
@codelion I believe this solves it! Thank you!