Tool calling issues with vllm #224
Comments
Should work the same as Ollama, see the code here - #112 (comment). Happy to consider adding vLLM as another custom model, but it would need more people to want it before we do the work.
See #239, that would mean we could add …
Hello! I was testing pydantic.ai alongside vLLM and llama.cpp server, which I think both fulfill the rules for adding a new model. I have looked at the existing Ollama code and I am not sure I understand the value of adding a new model class. I got this same snippet working with both vLLM and llama.cpp server out of the box:

```bash
# For example with llama.cpp:
docker run -v ./models:/models -p 8080:8080 \
    ghcr.io/ggerganov/llama.cpp:server -m /models/smollm2-360m-instruct-q8_0.gguf
```

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(
    "mymodel",
    base_url="http://localhost:8080/v1",
    api_key="foo",
)
agent = Agent(
    model=model,
    system_prompt='Be concise, reply with one sentence.',
)
result = agent.run_sync('Where does "hello world" come from?')
print(result.data)
```

So, is it better to just send a small documentation patch? P.S. I don't have a problem with contributing a new model.
```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(...)
agent = Agent(...)
result = agent.run_sync('hello')
```

It works.
Does not work with tool calling, and without that there is no reason for enterprises or individuals hosting their own quantized models to use this. Obviously, it works for some basic use that doesn't require tools, but with tool calling it fails with an error.
Exactly the same error I receive. It simply does not work with vLLM.
I can confirm the sample code from @maziyarpanahi's comment fails on my dev box with a "tool_choice must either be a named tool, auto, or none" error when run against a vLLM (v0.6.6.post1) instance of meta-llama/Llama-3.3-70B-Instruct. However, the same code works fine against an Ollama instance (on the same machine) of llama3.1:8b.
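For context, here is a minimal sketch of the kind of tool-using agent that hits this; the endpoint, port, and the tool itself are illustrative rather than taken from the thread:

```python
from datetime import datetime, timezone

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

# Illustrative setup: a vLLM OpenAI-compatible endpoint (port 8000 is an example).
model = OpenAIModel(
    "meta-llama/Llama-3.3-70B-Instruct",
    base_url="http://localhost:8000/v1",
    api_key="foo",
)
agent = Agent(model=model, system_prompt="Use the tool when asked about the time.")

@agent.tool_plain
def current_time() -> str:
    """Return the current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()

# Against vLLM, a run like this is what fails with
# "tool_choice must either be a named tool, auto, or none".
result = agent.run_sync("What time is it right now?")
print(result.data)
```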
Yes, I use a vLLM model and get the same tool calling problems.
I think it's related to the way vLLM handles tool calling: https://docs.vllm.ai/en/latest/features/tool_calling.html
Perhaps only a documentation item is required if we find a compatible option for the vLLM server backend.
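For anyone who wants to try that, a rough sketch of launching the server with tool calling enabled; the flag names come from the linked docs, while the model and parser choice are just examples for a Llama 3.x model:

```bash
# Sketch only: serve a Llama 3.x model with vLLM's tool-calling support turned on.
# --enable-auto-tool-choice and --tool-call-parser are described in the linked docs;
# the model name, llama3_json parser, and port are example choices.
vllm serve meta-llama/Llama-3.3-70B-Instruct \
    --enable-auto-tool-choice \
    --tool-call-parser llama3_json \
    --port 8000
```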
Did you folks try launching vLLM with the arguments shown in that link?
I have tried it. The error is about `tool_choice`, but launching vLLM with those arguments doesn't help.
We could probably fix the tool choice issue by further expanding `ModelSettings`.
Would love to help.
The ModelSettings class uses a TypedDict as its underlying type, providing flexibility to add options beyond the predefined attributes. Before generating chat completions, the code now checks if tool_choice is set in ModelSettings and applies it accordingly.

Fixes: pydantic#224

Signed-off-by: Sébastien Han <[email protected]>
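If a change along those lines lands, usage might look roughly like the sketch below. This is hypothetical, based only on the commit message above; the `tool_choice` key in `model_settings` is the assumption, not a documented setting, and the endpoint is an example:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

# Hypothetical sketch: assumes ModelSettings (a TypedDict) accepts a
# "tool_choice" entry that gets forwarded to the chat completions request,
# as described in the commit message above.
model = OpenAIModel(
    "mymodel",
    base_url="http://localhost:8000/v1",
    api_key="foo",
)
agent = Agent(model=model)

result = agent.run_sync(
    "What is the capital of France?",
    model_settings={"tool_choice": "auto"},  # assumed key, per the proposed change
)
print(result.data)
```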
FYI: this is a PR to follow: #825
Update: I think this should be fixed by #825. We aren't planning on adding a custom model for vLLM, as we've removed the dedicated Ollama model as well. See more docs here: https://ai.pydantic.dev/models/#ollama
Here is my workaround: https://gist.github.com/BeautyyuYanli/9e66513e665ebaf3a4658e61fb9c04ef
Hi,
Is there a way we can add support for vLLM's OpenAI-compatible server (vLLM OpenAI Support), so that anyone can use any LLM with it?