
Tool calling issues with vllm #224
Open
Navanit-git opened this issue Dec 12, 2024 · 20 comments

@Navanit-git (Contributor) commented Dec 12, 2024

Hi,
Is there a way we can add vLLM OpenAI-compatibility support?
vllm OpenAI Support

So that anyone can make calls to any LLM model.

@samuelcolvin (Member)

Should work the same as ollama, see the code here - #112 (comment).

Happy to consider adding vllm as another custom model, but it would need more people to want it before we do the work.
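
For reference, a minimal sketch of that approach against a vLLM server (a sketch only: the model name, port, and `vllm serve` invocation are example assumptions; the OpenAIModel constructor is the same one used later in this thread):

# Start a vLLM OpenAI-compatible server, e.g.:
#   vllm serve Qwen/Qwen2.5-7B-Instruct --port 8000

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

# Point OpenAIModel at the local vLLM endpoint; the api_key is unused but required.
model = OpenAIModel(
    "Qwen/Qwen2.5-7B-Instruct",
    base_url="http://localhost:8000/v1",
    api_key="not-needed",
)

agent = Agent(
    model=model,
    system_prompt="Be concise, reply with one sentence.",
)

result = agent.run_sync("Say hello.")
print(result.data)

With a plain string result this works out of the box; the rest of the thread is about what happens when tools or a structured result_type are involved.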

@samuelcolvin (Member)

See #239, that would mean we could add VLLMModel.

@daavoo commented Dec 18, 2024

See #239, that would mean we could add VLLMModel.

Hello! I was testing pydantic.ai alongside vLLM and llama.cpp server, which I think both fulfill the rules for adding a new model.

I have looked at the existing Ollama code and I am not sure I understand the value of adding a new VLLMModel.
It looks like there is no custom logic (beyond providing a default api_key value), and it provides a somewhat arbitrary hardcoded list of model names (which don't cover all available models).

I got this same snippet working with both vLLM and llama.cpp server out of the box:

# For example with llama.cpp:
docker run -v ./models:/models -p 8080:8080 \
ghcr.io/ggerganov/llama.cpp:server -m /models/smollm2-360m-instruct-q8_0.gguf

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(
    "mymodel", 
    base_url="http://localhost:8080/v1", 
    api_key="foo"
)

agent = Agent(  
    model=model,
    system_prompt='Be concise, reply with one sentence.',  
)

result = agent.run_sync('Where does "hello world" come from?')  
print(result.data)

So, is it better to just send a small documentation patch?

P.S. I don't have a problem with contributing a new VLLM/LLAMACPP model myself, just wondering if it makes sense to keep adding these.

@sadransh

@daavoo This wouldn't work with tool calling and a non-str result type. You can re-use the example from #398 to double-check.
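
To make that concrete, here is a minimal sketch of an agent that registers a function tool (the endpoint, model name, and the tool itself are made up for illustration). With tools registered, or a structured result_type, pydantic-ai includes `tools` and a `tool_choice` value in the chat-completion request, and that is where the vLLM server errors (see the 400 error quoted later in this thread):

from pydantic_ai import Agent, RunContext
from pydantic_ai.models.openai import OpenAIModel

# Hypothetical local vLLM endpoint and model name.
model = OpenAIModel(
    "mymodel",
    base_url="http://localhost:8000/v1",
    api_key="foo",
)

agent = Agent(model=model, system_prompt="Use the tool to answer.")

@agent.tool
async def get_temperature(ctx: RunContext[None], city: str) -> float:
    """Made-up tool: return a fixed temperature for any city."""
    return 21.5

# This run triggers an OpenAI tools/tool_choice request to the server.
result = agent.run_sync("What is the temperature in Paris?")
print(result.data)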

@lys791227

@daavoo This wouldn't work with tool calling and a non-str result type. You can re-use the example from #398 to double-check.

I have the same issue.

@guang11644331

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(
    model_name="qwen25",
    base_url="http://192.168.xx.xx:8090/v1",
    api_key="1"
)

agent = Agent(
    model=model,
)

result = agent.run_sync('hello')
print(result.data)

It works.

@lys791227

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(
    model_name="qwen25",
    base_url="http://192.168.xx.xx:8090/v1",
    api_key="1"
)

agent = Agent(
    model=model,
)

result = agent.run_sync('hello')
print(result.data)

It works.

Did you try calling a tool? It should not work!

@iabgcb commented Jan 16, 2025

It does not work with tool calling, without which there is no reason for enterprises or retail users hosting their own quantized models to use this framework. So please add vLLM compatibility ASAP.

@maziyarpanahi

Obviously, it works for basic use that doesn't require tool selection, but even simple code like this fails with vLLM:

from pydantic import BaseModel

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel

model = OpenAIModel(
    model_name="qwen25",
    base_url="http://192.168.xx.xx:8090/v1",
    api_key="1"
)

class CityLocation(BaseModel):
    city: str
    country: str

agent = Agent(model=model, result_type=CityLocation)

result = agent.run_sync('Where were the olympics held in 2012?')
print(result.data)
print(result.usage())

The error: `tool_choice` must either be a named tool, "auto", or "none". The full response:

openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': '[{\'type\': \'value_error\', \'loc\': (\'body\',), \'msg\': \'Value error, `tool_choice` must either be a named tool, "auto", or "none".\', \'input\': {\'messages\': [{\'role\': \'user\', \'content\': \'Where were the olympics held in 2012?\'}], \'model\': \'arcee-train/arcee-virtuoso-maz-10b-v2\', \'n\': 1, \'parallel_tool_calls\': True, \'stream\': False, \'tool_choice\': \'required\', \'tools\': [{\'type\': \'function\', \'function\': {\'name\': \'final_result\', \'description\': \'The final response which ends this conversation\', \'parameters\': {\'properties\': {\'city\': {\'title\': \'City\', \'type\': \'string\'}, \'country\': {\'title\': \'Country\', \'type\': \'string\'}}, \'required\': [\'city\', \'country\'], \'title\': \'CityLocation\', \'type\': \'object\'}}}]}, \'ctx\': {\'error\': ValueError(\'`tool_choice` must either be a named tool, "auto", or "none".\')}}]', 'type': 'BadRequestError', 'param': None, 'code': 400}

@iabgcb commented Jan 17, 2025

Exactly the same error I receive. It simply does not work with vLLM.

@nihilimbo

I can confirm the sample code from @maziyarpanahi comment fails on my dev box with a "tool_choice must either be a named tool, auto, or none" error when run against a vLLM (v0.6.6.post1) instance of meta-llama/Llama-3.3-70B-Instruct.

However, the same code works fine against an Ollama instance (on the same machine) of llama3.1:8b.

@myyang19770915

Yes, I use a vLLM model and get the same tool-calling problems.
Please add this to the project!
Thanks

@TheCodingLand

I think it's related to the way vllm handles tool calling.

https://docs.vllm.ai/en/latest/features/tool_calling.html

Perhaps only a documentation item is required if we find a compatible option for the vllm server backend.

@daavoo commented Jan 20, 2025

I think it's related to the way vllm handles tool calling.

https://docs.vllm.ai/en/latest/features/tool_calling.html

Perhaps only a documentation item is required if we find a compatible option for the vllm server backend.

Did you folks try launching vLLM with the arguments shown in that link?

@maziyarpanahi

I have tried it. The error is about 'tool_choice': 'required'. What the vLLM docs describe is how to activate tool_choice 'auto'. You can add these two flags to enable it in vLLM:

    --enable-auto-tool-choice \
    --tool-call-parser hermes \

But it doesn't help, since pydantic-ai asks for 'required' when it comes to tool_choice:

`tool_choice` must either be a named tool, "auto", or "none"
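
To see the mismatch at the raw API level, here is a sketch using the openai client directly (the endpoint and model name are placeholders, the tool schema is copied from the error above, and whether the first call succeeds depends on your vLLM version and those two flags being set):

import openai
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

tools = [{
    "type": "function",
    "function": {
        "name": "final_result",
        "description": "The final response which ends this conversation",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"title": "City", "type": "string"},
                "country": {"title": "Country", "type": "string"},
            },
            "required": ["city", "country"],
        },
    },
}]
messages = [{"role": "user", "content": "Where were the olympics held in 2012?"}]

# With --enable-auto-tool-choice and --tool-call-parser set, vLLM accepts "auto".
client.chat.completions.create(
    model="qwen25", messages=messages, tools=tools, tool_choice="auto"
)

# pydantic-ai sends tool_choice="required" for a structured result_type,
# which this vLLM version rejects with the 400 error quoted above.
try:
    client.chat.completions.create(
        model="qwen25", messages=messages, tools=tools, tool_choice="required"
    )
except openai.BadRequestError as exc:
    print(exc)  # `tool_choice` must either be a named tool, "auto", or "none".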

@sydney-runkle (Member)

We could probably fix the tool choice issue by further expanding ModelSettings.

@iabgcb commented Jan 24, 2025
would love to help.

@sydney-runkle marked this as a duplicate of #728 on Jan 24, 2025
leseb added a commit to leseb/pydantic-ai that referenced this issue on Jan 27, 2025:
The ModelSettings class uses a TypedDict as its underlying type, providing
flexibility to add options beyond the predefined attributes.

Before generating chat completions, the code now checks if tool_choice is set
in ModelSettings and applies it accordingly.

Fixes: pydantic#224
Signed-off-by: Sébastien Han <[email protected]>
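
For illustration only, here is a sketch of the idea that commit message describes — reading a caller-supplied `tool_choice` from the settings dict and forwarding it to the request. This is not the actual code from that commit or #825; the function and key names are hypothetical:

from typing import Any

def apply_tool_choice(model_settings: dict[str, Any] | None,
                      request_kwargs: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: if the caller set `tool_choice` in the settings
    TypedDict, override the library default (e.g. 'required') before the
    chat-completion request is sent."""
    if model_settings and "tool_choice" in model_settings:
        request_kwargs["tool_choice"] = model_settings["tool_choice"]
    return request_kwargs

# A vLLM user could then downgrade 'required' to 'auto':
print(apply_tool_choice({"tool_choice": "auto"}, {"tool_choice": "required"}))
# -> {'tool_choice': 'auto'}
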
@maziyarpanahi

FYI: this is a PR to follow: #825

@sydney-runkle marked this as a duplicate of #927 on Feb 14, 2025
@sydney-runkle changed the title from "Add vllm custom model support for OpenAI compatibility" to "Tool calling issues with vllm" on Feb 14, 2025
@sydney-runkle (Member)

Update: I think this should be fixed by #825. We aren't planning on adding a custom model for vllm, as we've removed the OllamaModel in favor of recommending usage with OpenAIModel, since the two are compatible.

See more docs here: https://ai.pydantic.dev/models/#ollama
