
Tool use example #174

Draft · wants to merge 2 commits into main
Conversation

@DePasqualeOrg (Contributor) commented Jan 1, 2025

This demonstrates tool use (function calling), which is now supported in my PR to Swift Jinja. You should make sure that you have the latest tokenizer_config.json file for each model, since in some cases function calling was added in a recent update.

The following are some examples of responses to the prompt "What's the weather in Paris today?" in LLMEval. A get_current_weather function is provided to the model in the prompt constructed with the chat template.

Llama 3.1 8B

<|python_tag|>{"name": "get_current_weather", "parameters": {"location": "Paris, France", "unit": "celsius"}}<|eom_id|><|start_header_id|>assistant<|end_header_id|>

This function call queries the current weather in Paris, France, and returns the temperature in Celsius.

Llama 3.2 3B

<|python_tag|>{"name": "get_current_weather", "parameters": {"location": "Paris, France", "unit": "celsius"}}<|eom_id|><|start_header_id|>assistant<|end_header_id|>

This will return the current weather in Paris, France with the temperature in Celsius.

Qwen 2.5 7B

<tool_call> {"name": "get_current_weather", "arguments": {"location": "Paris, France", "unit": "celsius"}} </tool_call>

Mistral 7B

[TOOL_CALLS] [{"name": "get_current_weather", "arguments": {"location": "Paris, France"}}]
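As the examples above show, each model family wraps its tool call differently: Llama uses <|python_tag|> and a "parameters" key, Qwen uses <tool_call> tags, and Mistral uses a [TOOL_CALLS] prefix with an array. A minimal sketch of normalizing these three formats into one structure (the ToolCall type and parseToolCall function are hypothetical, not code from this PR):

```swift
import Foundation

// Hypothetical sketch (not code from this PR): normalize the three
// tool-call formats shown above into a common (name, arguments) pair.
struct ToolCall {
    let name: String
    let arguments: [String: Any]
}

func parseToolCall(from output: String) -> ToolCall? {
    // Strip model-specific wrappers: Llama's <|python_tag|>,
    // Qwen's <tool_call>…</tool_call>, and Mistral's [TOOL_CALLS] prefix.
    var text = output
    for token in ["<|python_tag|>", "<tool_call>", "</tool_call>", "[TOOL_CALLS]"] {
        text = text.replacingOccurrences(of: token, with: "")
    }
    // Llama appends <|eom_id|> and a new assistant header; keep only the JSON.
    if let range = text.range(of: "<|eom_id|>") {
        text = String(text[..<range.lowerBound])
    }
    text = text.trimmingCharacters(in: .whitespacesAndNewlines)

    guard let data = text.data(using: .utf8),
          let json = try? JSONSerialization.jsonObject(with: data)
    else { return nil }

    // Mistral emits an array of calls; Llama and Qwen emit a single object.
    let object: [String: Any]?
    if let array = json as? [[String: Any]] {
        object = array.first
    } else {
        object = json as? [String: Any]
    }
    guard let call = object, let name = call["name"] as? String else { return nil }

    // Llama uses the key "parameters"; Qwen and Mistral use "arguments".
    let args = (call["arguments"] as? [String: Any])
        ?? (call["parameters"] as? [String: Any])
        ?? [:]
    return ToolCall(name: name, arguments: args)
}
```

In practice the special tokens to strip would come from the model's tokenizer configuration rather than a hard-coded list.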

Performance

Llama 3.1 8B and 3.2 3B with the current chat templates from tokenizer_config.json tend to always respond with a function call, even when not appropriate. My proposed change to the chat template helps, but the models still sometimes respond with calls to non-existent functions. In general, the prompts provided in the Llama chat templates are far from optimal, and I think the models' performance could be further improved simply by using better prompts (for example, "Knowledge cutoff date" instead of "Cutting Knowledge Date").

Qwen 2.5 7B and Mistral 7B do a better job of calling functions only when appropriate.

Handling the function call

If I understand correctly, the app would need to parse the JSON function call, stop generating once the call is complete, invoke the function with the parsed arguments, append a message containing the function's output, and then generate a user-facing response from the updated message list.
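That loop could be sketched as follows. This is a hedged illustration only: the Message type, the generate function, and the getCurrentWeather stub are hypothetical stand-ins, not APIs from this repository.

```swift
import Foundation

// Hypothetical sketch of the loop described above. `Message`, `generate`,
// and `getCurrentWeather` are stand-ins, not APIs from mlx-swift-examples.
struct Message {
    let role: String    // "user", "assistant", or "tool"
    let content: String
}

// Stand-in for the model: emits a tool call first, then a final answer
// once a tool result is present in the conversation.
func generate(_ messages: [Message]) -> String {
    if messages.last?.role == "tool" {
        return "It's 18 degrees Celsius in Paris today."
    }
    return #"{"name": "get_current_weather", "arguments": {"location": "Paris, France"}}"#
}

// Stubbed tool implementation returning a canned JSON result.
func getCurrentWeather(location: String) -> String {
    #"{"temperature": 18, "unit": "celsius"}"#
}

func runToolLoop() -> String? {
    var messages = [Message(role: "user", content: "What's the weather in Paris today?")]

    // 1. Generate; the model responds with a JSON function call.
    let response = generate(messages)

    // 2. Parse the call (and stop generating further tokens).
    guard let data = response.data(using: .utf8),
          let json = try? JSONSerialization.jsonObject(with: data),
          let call = json as? [String: Any],
          call["name"] as? String == "get_current_weather",
          let args = call["arguments"] as? [String: Any],
          let location = args["location"] as? String
    else { return nil }

    // 3. Call the function with the parsed arguments.
    let result = getCurrentWeather(location: location)

    // 4. Append the call and the function's output to the conversation.
    messages.append(Message(role: "assistant", content: response))
    messages.append(Message(role: "tool", content: result))

    // 5. Generate the user-facing response from the full message list.
    return generate(messages)
}
```

A real implementation would also need to re-apply the chat template when serializing the tool message, since each model expects the tool result in its own role and format.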

@davidkoski (Collaborator) commented:
This looks really interesting!

@@ -13,10 +13,10 @@
   {
     "identity" : "jinja",
     "kind" : "remoteSourceControl",
-    "location" : "https://github.com/maiqingqiang/Jinja",
+    "location" : "https://github.com/DePasqualeOrg/Jinja",
A contributor commented on this change:

I assume this change can be reverted once the PR for function calling is merged?

@DePasqualeOrg (Contributor, author) replied Jan 26, 2025:

We are still waiting on @pcuenca to merge huggingface/swift-transformers#151, and I'm hoping we can get that done very soon. I have been doing my best to move this forward and fill in the gaps in functionality on the Swift side.
