[Bug]: response error with stream tool call #1143

Open
4 tasks done
ATAKing1023 opened this issue Dec 24, 2024 · 2 comments

@ATAKing1023

Model Series

Qwen2.5

What are the models used?

Qwen2.5-7B-Instruct

What is the scenario where the problem happened?

deployment with vllm, tool calling with stream=True

Is this a known issue?

  • I have followed the GitHub README.
  • I have checked the Qwen documentation and cannot find an answer there.
  • I have checked the documentation of the related framework and cannot find useful information.
  • I have searched the issues and there is not a similar one.

Information about environment

OS: Ubuntu 20.04
Python: Python 3.11.5
GPUs: 2 x NVIDIA A20
NVIDIA driver: 525.85.12
CUDA compiler: 12.0
PyTorch: 2.5.1+cu124

Log output

INFO 12-24 09:27:01 logger.py:37] Received request chatcmpl-f112a9a6b666404690062949fd68b099: prompt: '', params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.7, top_p=1.0, top_k=-1, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=3000, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None), prompt_token_ids: None, lora_request: None, prompt_adapter_request: None.
INFO:     10.1.100.234:55080 - "POST /v1/chat/completions HTTP/1.1" 200 OK
INFO 12-24 09:27:01 engine.py:267] Added request chatcmpl-f112a9a6b666404690062949fd68b099.
ERROR 12-24 09:27:02 hermes_tool_parser.py:338] Error trying to handle streaming tool call.
ERROR 12-24 09:27:02 hermes_tool_parser.py:338] Traceback (most recent call last):
ERROR 12-24 09:27:02 hermes_tool_parser.py:338]   File "/opt/miniconda3/lib/python3.11/site-packages/vllm/entrypoints/openai/tool_parsers/hermes_tool_parser.py", line 227, in extract_tool_calls_streaming
ERROR 12-24 09:27:02 hermes_tool_parser.py:338]     function_name: Union[str, None] = current_tool_call.get("name")
ERROR 12-24 09:27:02 hermes_tool_parser.py:338]                                       ^^^^^^^^^^^^^^^^^^^^^
ERROR 12-24 09:27:02 hermes_tool_parser.py:338] AttributeError: 'NoneType' object has no attribute 'get'
ERROR 12-24 09:27:02 hermes_tool_parser.py:338] Error trying to handle streaming tool call.
ERROR 12-24 09:27:02 hermes_tool_parser.py:338] Traceback (most recent call last):
ERROR 12-24 09:27:02 hermes_tool_parser.py:338]   File "/opt/miniconda3/lib/python3.11/site-packages/vllm/entrypoints/openai/tool_parsers/hermes_tool_parser.py", line 218, in extract_tool_calls_streaming
ERROR 12-24 09:27:02 hermes_tool_parser.py:338]     flags) if tool_call_portion else None
ERROR 12-24 09:27:02 hermes_tool_parser.py:338]               ^^^^^^^^^^^^^^^^^
ERROR 12-24 09:27:02 hermes_tool_parser.py:338] UnboundLocalError: cannot access local variable 'tool_call_portion' where it is not associated with a value
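Both tracebacks are ordinary Python failure modes. The sketch below is only an illustration of how each exception type can arise. It is not vLLM's parser code, and the function names `read_name`, `read_portion`, and `read_name_guarded` are hypothetical.

```python
from typing import Optional


def read_name(current_tool_call: Optional[dict]) -> Optional[str]:
    # Raises AttributeError when current_tool_call is None,
    # the same exception type as the first traceback.
    return current_tool_call.get("name")


def read_portion(text: str) -> Optional[str]:
    # tool_call_portion is bound on one branch only, so a text without the
    # tag reaches the return statement with the name still unbound and
    # raises UnboundLocalError, as in the second traceback.
    if "<tool_call>" in text:
        tool_call_portion = text.split("<tool_call>", 1)[1]
    return tool_call_portion if tool_call_portion else None


def read_name_guarded(current_tool_call: Optional[dict]) -> Optional[str]:
    # Defensive variant: treat a missing partial tool call as "no name yet".
    return current_tool_call.get("name") if current_tool_call else None
```

Whether a guard like `read_name_guarded` is the right fix depends on the parser's state handling, which this issue does not show.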

Description

Steps to reproduce

HTTP request payload:
{
    "messages": [
        {
            "content": "你是一个设计用于与 SQL 数据库交互的智能代理。",
            "role": "system"
        },
        {
            "content": "XXXX",
            "role": "user"
        }
    ],
    "model": "Qwen2.5-7B-Instruct",
    "max_completion_tokens": 3000,
    "n": 1,
    "stream": true,
    "temperature": 0.7,
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "sql_db_list_tables",
                "description": "输入为空字符串,输出为数据库中表名的逗号分隔列表。",
                "parameters": {
                    "properties": {
                        "tool_input": {
                            "default": "",
                            "description": "An empty string",
                            "type": "string"
                        }
                    },
                    "type": "object"
                }
            }
        }
    ]
}
Generated prompt:
<|im_start|>system
你是一个设计用于与 SQL 数据库交互的智能代理。

# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{"type": "function", "function": {"name": "sql_db_list_tables", "description": "输入为空字符串,输出为数据库中表名的逗号分隔列表。", "parameters": {"properties": {"tool_input": {"default": "", "description": "An empty string", "type": "string"}}, "type": "object"}}}</tools>

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call><|im_end|>
<|im_start|>user
XXXX<|im_end|>
<|im_start|>assistant
Generated tool call (extracted from the streamed response):
{
    "tool_calls": [
        {
            "id": "chatcmpl-tool-04bf6c5c96c04a2594aee0ba5f0fabf7",
            "type": "function",
            "index": 0,
            "function": {
                "name": "sql_db_list_tables"
            }
        }
    ]
}
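For context on how a client typically assembles a streamed tool call, here is a minimal sketch that merges tool-call fragments by `index`. It assumes each fragment is a plain dict shaped like the entry above, with an optional `function.arguments` string fragment. The helper names `merge_tool_call_deltas` and `parse_arguments` are hypothetical and do not come from vLLM or any client library.

```python
import json
from typing import Any


def merge_tool_call_deltas(deltas: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Merge streamed tool-call fragments into complete tool calls, keyed by index."""
    merged: dict[int, dict[str, Any]] = {}
    for delta in deltas:
        call = merged.setdefault(
            delta["index"],
            {"id": None, "type": "function", "function": {"name": "", "arguments": ""}},
        )
        if delta.get("id"):
            call["id"] = delta["id"]
        fn = delta.get("function") or {}
        if fn.get("name"):
            call["function"]["name"] = fn["name"]
        if fn.get("arguments"):
            # Argument text arrives in pieces and is concatenated in order.
            call["function"]["arguments"] += fn["arguments"]
    return [merged[i] for i in sorted(merged)]


def parse_arguments(call: dict[str, Any]) -> dict[str, Any]:
    """Decode the accumulated argument string; an empty string becomes {}."""
    raw = call["function"]["arguments"]
    return json.loads(raw) if raw else {}
```

Fed only the single fragment reported above, the merged call has a name but an empty argument string, which matches the symptom: no `arguments` fragment ever reached the client.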

Expected results (tool call portion)

Per the tool definition, the result should be:

                "tool_calls": [
                    {
                        "id": "chatcmpl-tool-01241dc2c48645b28205a092cf4aaf8e",
                        "type": "function",
                        "function": {
                            "name": "sql_db_list_tables",
                            "arguments": {"tool_input": ""}
                        }
                    }
                ]

With stream=False, the model will return:

                "tool_calls": [
                    {
                        "id": "chatcmpl-tool-01241dc2c48645b28205a092cf4aaf8e",
                        "type": "function",
                        "function": {
                            "name": "sql_db_list_tables",
                            "arguments": "{}"
                        }
                    }
                ]

The response is usable, but it does not exactly match the tool definition.
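If the client needs the arguments to match the definition, one possible workaround is to fill in declared defaults on the client side. The sketch below assumes the `parameters` object has the JSON Schema shape shown in the request payload. The helper name `arguments_with_defaults` is hypothetical.

```python
import json
from typing import Any


def arguments_with_defaults(arguments_json: str, parameters: dict[str, Any]) -> dict[str, Any]:
    """Decode tool-call arguments and add any missing properties that declare a default."""
    args = json.loads(arguments_json) if arguments_json else {}
    for name, spec in parameters.get("properties", {}).items():
        if name not in args and "default" in spec:
            args[name] = spec["default"]
    return args


# Schema copied from the tool definition in the request payload.
parameters = {
    "properties": {
        "tool_input": {"default": "", "description": "An empty string", "type": "string"}
    },
    "type": "object",
}
filled = arguments_with_defaults("{}", parameters)
```

This only papers over the difference on the client. It does not change what the model or the server emits.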

@ATAKing1023
Author

I found that the model generated the correct tokens: {"name": "sql_db_list_tables", "arguments": {}}.
So this issue is related to vLLM rather than to the model: vllm-project/vllm#11392
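Since the raw text is correct, a client can recover the call from the unparsed output as a stopgap. The sketch below assumes the model wraps the JSON in <tool_call></tool_call> tags as the generated prompt instructs. The names `TOOL_CALL_RE` and `extract_tool_calls` are hypothetical, and this is not vLLM's parser.

```python
import json
import re

# Non-greedy match so several tool calls in one response are captured separately.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Return every well-formed JSON object found between tool_call tags."""
    calls = []
    for payload in TOOL_CALL_RE.findall(text):
        try:
            calls.append(json.loads(payload))
        except json.JSONDecodeError:
            # Skip malformed or truncated payloads instead of failing the whole response.
            continue
    return calls


raw = '<tool_call>\n{"name": "sql_db_list_tables", "arguments": {}}\n</tool_call>'
calls = extract_tool_calls(raw)
```

This works on the complete response text only, so with stream=True the client would have to buffer the content until the closing tag arrives.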


This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.
