Support tool use and add example #174
Conversation
(Force-pushed from 8c484a2 to 84e2f64.)
This looks really interesting!
(Force-pushed from 3959ba1 to be440a8.)
This is almost ready, but some more changes in swift-transformers might be necessary: huggingface/swift-transformers#169
(Force-pushed from aa1afcc to 559a44c, then from 559a44c to 50c3529.)
This is now ready for review. I've added the …
```diff
@@ -126,8 +130,6 @@ struct ContentView: View {
         }
-        .task {
-            self.prompt = llm.modelConfiguration.defaultPrompt
```
This should be put back once the prompt override is removed
I'm not sure what you mean.
Above you had hard-coded the prompt for development -- once that part is reverted, this should go back in (the per-model prompt).
Okay, I've reverted that part.
```swift
Toggle(isOn: $llm.includeWeatherTool) {
    Text("Include \"get current weather\" tool")
}
.frame(maxWidth: 350, alignment: .leading)
```
I think this is a great way to see the tool integration in action. I wonder if it should be included in this generic UI? I guess it is an example and people who use this as the basis for their own app can either include it or not, depending on their needs.
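For context, here's a minimal sketch of how a toggle like this might gate whether the weather tool spec is passed along when the prompt is built. `PromptInput` and `makeInput` are hypothetical stand-ins, not the PR's actual API:

```swift
import Foundation

// Hypothetical stand-in for the example app's input type.
struct PromptInput {
    var messages: [[String: String]]
    var tools: [[String: any Sendable]]?
}

func makeInput(
    prompt: String,
    includeWeatherTool: Bool,
    weatherTool: [String: any Sendable]
) -> PromptInput {
    PromptInput(
        messages: [["role": "user", "content": prompt]],
        // Hand the tool spec to the chat template only when the toggle is on.
        tools: includeWeatherTool ? [weatherTool] : nil
    )
}
```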
```swift
// let modelConfiguration = ModelRegistry.llama3_1_8B_4bit
// let modelConfiguration = ModelRegistry.mistral7B4bit
// let modelConfiguration = ModelRegistry.qwen2_5_7b
let modelConfiguration = ModelRegistry.qwen2_5_1_5b
```
Changing this to Qwen is fine (it comes in at 1.7 GB), but please update the comment and remove the commented-out parts.
Done
"required": ["location"], | ||
] as [String: any Sendable], | ||
] as [String: any Sendable], | ||
] as [String: any Sendable] |
This is pretty unwieldy :-) I see it matches the type in Transformers -- it is too bad we don't have a struct that could be used to declare this and convert to that type. Perhaps an exercise for future refactoring.
I looked at using a struct, but it seems even more unwieldy, although it would be more type-safe. Perhaps someone else can find a more elegant solution than me.
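For anyone picking up that refactoring exercise, here is one possible shape for it: a minimal sketch of typed declarations that convert to the `[String: any Sendable]` dictionary the chat template expects. `ToolParameter` and `ToolFunction` are hypothetical names, not part of this PR or swift-transformers:

```swift
import Foundation

// Hypothetical typed declaration of a tool parameter.
struct ToolParameter {
    let name: String
    let type: String
    let description: String
    let required: Bool
}

// Hypothetical typed declaration of a tool, convertible to the
// [String: any Sendable] shape used in the example.
struct ToolFunction {
    let name: String
    let description: String
    let parameters: [ToolParameter]

    var asDictionary: [String: any Sendable] {
        [
            "type": "function",
            "function": [
                "name": name,
                "description": description,
                "parameters": [
                    "type": "object",
                    "properties": Dictionary(uniqueKeysWithValues: parameters.map { param in
                        (param.name,
                         ["type": param.type, "description": param.description]
                            as [String: any Sendable])
                    }),
                    "required": parameters.filter(\.required).map(\.name),
                ] as [String: any Sendable],
            ] as [String: any Sendable],
        ]
    }
}

// Usage: declare the weather tool with types, convert at the boundary.
let weatherTool = ToolFunction(
    name: "get_current_weather",
    description: "Get the current weather in a given location",
    parameters: [
        ToolParameter(
            name: "location", type: "string",
            description: "The city and state, e.g. San Francisco, CA",
            required: true)
    ]
)
let spec = weatherTool.asDictionary  // same shape as the literal above
```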
```swift
static public let qwen205b4bit = ModelConfiguration(
    id: "mlx-community/Qwen1.5-0.5B-Chat-4bit",
    overrideTokenizer: "PreTrainedTokenizer",
    defaultPrompt: "why is the sky blue?"
)
```
Should this be removed? It is older, but people might still refer to this.
I've added it back, although I think it's fine to remove outdated models, since the pattern is clear from the existing examples.
@DePasqualeOrg this looks really cool! I added a few comments on removing some of the debug/dev code. It looks like there is a conflict in UserInput.swift now (I think from the merge of the video processing for VLMs).
(Force-pushed from 4c64908 to a4d206e.)
Thanks! I resolved the merge conflict and made the changes you suggested.
(Force-pushed from a4d206e to d0f57b4.)
Awesome work, thank you!
This demonstrates tool use (function calling), which is now supported in my PR to Swift Jinja. You should make sure that you have the latest `tokenizer_config.json` file for each model, since in some cases function calling was added in a recent update.

The following are some examples of responses to the prompt "What's the weather in Paris today?" in LLMEval. A `get_current_weather` function is provided to the model in the prompt constructed with the chat template. (Example responses were shown for Llama 3.1 8B, Llama 3.2 3B, Qwen 2.5 7B, and Mistral 7B.)
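The captured responses aren't reproduced here, but the function call these models emit is JSON of roughly this shape (illustrative only; the exact wrapper varies by model, e.g. Qwen encloses it in `<tool_call>` tags):

```json
{"name": "get_current_weather", "arguments": {"location": "Paris"}}
```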
Performance
Llama 3.1 8B and 3.2 3B with the current chat templates from `tokenizer_config.json` tend to always respond with a function call, even when it's not appropriate. My proposed change to the chat template helps, but the models still sometimes respond with calls to non-existent functions. In general, the prompts provided in the Llama chat templates are far from optimal, and I think the models' performance could be further improved simply by using better prompts (for example, "Knowledge cutoff date" instead of "Cutting Knowledge Date").

Qwen 2.5 7B and Mistral 7B do a better job of calling functions only when appropriate.
Handling the function call
If I understand correctly, the app would need to parse the JSON function call, stop generating after the function call, call the function with the parameters, add a message with the function's output, and generate a user-facing response with the messages.
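To make that flow concrete, here is a self-contained sketch of those steps, assuming a plain JSON tool call; every name in it is hypothetical, since the PR leaves this handling to the app:

```swift
import Foundation

// Hypothetical decoding of the model's JSON function call.
struct ToolCall: Decodable {
    let name: String
    let arguments: [String: String]
}

func handleToolCall(in output: String) -> String? {
    // 1. Parse the model's output as a JSON function call.
    //    (The exact wrapper varies by model, e.g. <tool_call>...</tool_call> for Qwen.)
    guard let data = output.data(using: .utf8),
          let call = try? JSONDecoder().decode(ToolCall.self, from: data)
    else { return nil }  // Not a tool call: show the output to the user as-is.

    // 2. Stop generating after the call, then 3. invoke the named function
    //    with the parsed parameters.
    switch call.name {
    case "get_current_weather":
        let location = call.arguments["location"] ?? "unknown"
        // A real app would query a weather service here; this is a stub.
        return "{\"location\": \"\(location)\", \"temperature\": \"18 °C\"}"
    default:
        return nil  // The model called a non-existent function.
    }
}

// 4. Append the returned string to the conversation as a message with the
//    "tool" role, then 5. run generation again so the model can produce
//    the user-facing answer from the updated messages.
```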