Can I use onnxruntime-genai with a grammar for output like llama.cpp? #564
han-minhee started this conversation in Ideas
Replies: 2 comments 1 reply

- llama.cpp can force the model to generate output that satisfies a given grammar (which, in my understanding, amounts to picking the highest-logit token among the tokens that satisfy the grammar at each step). Is something like this possible with the current C or C# API? I couldn't find a corresponding API in the documentation.

- Hi @han-minhee. This is something we plan to support in the future, but we do not have support for it right now. We are currently planning which new features to support in the 0.4.0 release, and I will add this to the pool of candidates so it can be considered for 0.4.0.

- I'll move this to discussions now.
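onnxruntime-genai does not expose grammar-constrained generation (per the maintainer's reply above), but the mechanism the question describes — masking the logits so that only grammar-valid tokens can be chosen, then taking the argmax among those — can be sketched in a self-contained toy form. Everything below (the vocabulary, the "grammar", and the fake scoring function) is invented purely for illustration; none of it is the onnxruntime-genai or llama.cpp API.

```python
# Toy sketch of grammar-constrained greedy decoding.
# At each step, the grammar yields the set of tokens allowed next,
# and decoding picks the highest-scoring token from that set only.

VOCAB = ["{", "}", '"key"', ":", '"value"', "hello"]

def allowed_next(output):
    """A tiny stand-in 'grammar' that accepts exactly {"key":"value"}."""
    expect = ["{", '"key"', ":", '"value"', "}"]
    if len(output) < len(expect):
        return {expect[len(output)]}
    return set()  # grammar complete: nothing more may be generated

def fake_logits(output):
    """Stand-in for a model: 'hello' always scores highest."""
    return {tok: (2.0 if tok == "hello" else 1.0) for tok in VOCAB}

def constrained_greedy_decode():
    output = []
    while True:
        allowed = allowed_next(output)
        if not allowed:
            break
        logits = fake_logits(output)
        # Unconstrained argmax would pick "hello" every step;
        # restricting to the allowed set forces grammar-valid output.
        output.append(max(allowed, key=lambda t: logits[t]))
    return "".join(output)

print(constrained_greedy_decode())  # {"key":"value"}
```

In a real implementation the grammar would be a proper state machine (llama.cpp uses GBNF grammars compiled to a pushdown automaton) and the mask would be applied to the model's logit tensor before sampling, but the core idea is the same as in this sketch.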