
[Feature]: Support JSON Schema from llama.cpp #798

Open · m0nsky opened this issue Jun 19, 2024 · 9 comments

Comments

@m0nsky (Contributor) commented Jun 19, 2024

Background & Description

I was working on a C# class -> GBNF converter, and came across this:

https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md#json-schemas--gbnf

Here's an example, thanks to this reddit post.

Example JSON:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "reply": {"type": "string"},
    "mood": {"type": "string"},
    "intent": {"type": "string"}
  },
  "required": ["reply", "mood", "intent"]
}

Example output:

 User: I gave you a cookie.

Please respond in JSON format with your reply, mood, and intent:
{"reply": "Thank you for the cookie! It's delicious!", "mood": "Happy", "intent": "Express gratitude" }

API & Usage

No response

How to implement

No response

@martindevans (Member)

Related: #309 ?

@m0nsky (Contributor, Author) commented Jun 19, 2024

Yes, somewhat; that PR sparked my initial interest in developing the C# class -> GBNF converter. However, the JSON Schema functionality already seems to be implemented for main & server on the llama.cpp side by passing the -j parameter, so it wouldn't make sense to reinvent the wheel.
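
For reference, on the llama.cpp side the usage looks roughly like this (the model path, schema, and prompt are placeholders, and the binary may be ./main or ./llama-cli depending on the build):

./llama-cli -m model.gguf \
  -j '{"type": "object", "properties": {"reply": {"type": "string"}}, "required": ["reply"]}' \
  -p "Please respond in JSON format."

The -j / --json-schema flag converts the schema to a GBNF grammar internally, so the sampler can only emit conforming output.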

@RobertRappe

As someone slowly working on my own C# agentic system who has been exploring using JSON to format communication with the LLM, I'd find access to the llama.cpp JSON and grammar tools through LLamaSharp really useful. 👍

@phil-scott-78 (Contributor)

I was working on something with GBNF grammars and C#, but the perf just isn't there. This PR that's soon to be merged is very interesting. Hopefully it'll sneak into the next release, and if so I'll rework my generator to support Lark too:

ggerganov/llama.cpp#10224

@mmoskal commented Jan 30, 2025

@phil-scott-78 for perf with LLGuidance, make sure to read up on the lexer; when you have something working, ping me and we can link your client lib from the llguidance repo!

@phil-scott-78 (Contributor)

I will certainly reach out. I appreciate that. Looking forward to seeing it merged in and being able to kick the tires

@phil-scott-78 (Contributor)

@mmoskal - I will certainly reach out. I appreciate that. Looking forward to seeing it merged in and being able to kick the tires.

Let me ask you a theoretical: the problem we're trying to solve here is JSON formatting, but that's just the data format we're asking the LLM to generate before deserializing into a .NET object.

So what if the JSON format wasn't a given? Is there a format that would make better sense, one where not only would llguidance thrive but the models could also adhere to it? Is that a route also worth experimenting with?

@mmoskal commented Jan 31, 2025

@phil-scott-78 TLDR: in theory yes, in practice stick to JSON.

Constrained decoding works best if the model, most of the time, wants to output something compliant with the constraint. If you constrain the output to be an identifier but the model wants to write a haiku, you will get something like silentWavesCrestingUnderMoonlitWinterSky, or else sure_I_can_help_you_here_is_your_haiku. Current models have seen lots of JSON (probably the most of any structured data format, though in some cases Python or JavaScript code might be a close second). They were also often fine-tuned to follow a JSON Schema if given one in the prompt. Thus, unless you're really into training models, I would stick to JSON :)
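
To make the mechanics concrete: constrained decoding masks out tokens the grammar forbids before each sampling step, and the model's own preferences pick among whatever survives, which is why quality depends on the model already "wanting" something close to the constraint. A minimal C# sketch of the idea (the helper is hypothetical, not llguidance's or LLamaSharp's API):

using System;
using System.Collections.Generic;
using System.Linq;

// Grammar-constrained greedy sampling, reduced to its core idea.
// allowedTokenIds would come from the grammar engine for the current parse state.
static int SampleConstrained(float[] logits, IReadOnlyCollection<int> allowedTokenIds)
{
    // Only grammar-legal tokens compete; the model's logits still
    // decide the winner among them (greedy here for simplicity).
    return allowedTokenIds.MaxBy(t => logits[t]);
}

var logits = new float[] { 0.1f, 2.5f, 1.7f, -0.3f };
Console.WriteLine(SampleConstrained(logits, new[] { 0, 2, 3 })); // token 1 is masked, so token 2 wins

If the model's top choices all fall outside the allowed set, the constraint still holds, but you get degenerate output like the haiku-as-identifier examples above.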

@phil-scott-78 (Contributor)

@mmoskal that's where my head was at: the models themselves are gonna want to drive, and the grammar just keeps them between the lines. You mention giving them JSON Schemas; I've had a bunch of luck giving even the smallest of models (like 0.5B at Q4) a sample JSON with templates plus a grammar to guide them, which is what I was trying to solve with my project.

Anyways, I saw the llama.cpp merge, so I gave it a go updating my project to output, hopefully, compatible syntax. It at least matches the output from your gbnf-to-lark scripts. I was a bit disappointed to see the release bits didn't have the feature enabled, and I'm not gonna lie, I'm afraid I'm not clever enough to build it on my Windows box. So I'll hold tight until it makes it into llama.cpp and then makes its way here, hopefully.

But progress!
