System role problem running Gemma 2 on vLLM #1386
Comments
Same issue :'(
+1
Kinda hacky, but you can change it. Still, I think it would be nice to be able to omit the system message from the chat-ui side. Looks like the relevant code is here: chat-ui/src/routes/conversation/+server.ts, lines 46 to 56 in 07c9892.
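For illustration, here is a rough sketch of the kind of change #1432 proposes. The names (`buildMessages`, `systemRoleSupported`) are hypothetical and not chat-ui's actual API; this is just the shape of the idea:

```ts
type Message = { role: "system" | "user" | "assistant"; content: string };

// Hypothetical helper: decide how to inject the preprompt depending on
// whether the backend model accepts a "system" role message.
function buildMessages(
	preprompt: string | undefined,
	history: Message[],
	systemRoleSupported: boolean
): Message[] {
	if (!preprompt) return history;
	if (systemRoleSupported) {
		// Normal case: send the preprompt as a system message.
		return [{ role: "system", content: preprompt }, ...history];
	}
	// Fallback for backends like vLLM + Gemma 2 that reject the "system"
	// role: fold the preprompt into the first user message instead.
	const [first, ...rest] = history;
	if (first && first.role === "user") {
		return [{ role: "user", content: `${preprompt}\n\n${first.content}` }, ...rest];
	}
	return history;
}
```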
Opened an issue for a potential solution, feel free to tackle it if you want! 😄 #1432
Issue should be solved, try adding
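The comment is cut off here. If it refers to the per-model `systemRoleSupported` flag (an assumption; verify the option exists in your chat-ui version), the model entry would look something like:

```jsonc
{
	// "systemRoleSupported" is the assumed flag name; check your chat-ui version
	"name": "google/gemma-2-2b-it",
	"id": "google/gemma-2-2b-it",
	"systemRoleSupported": false
}
```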
Hello,
I'm running chat-ui and trying some models. With Phi-3 and Llama I had no problem, but when I run Gemma 2 on vLLM I'm not able to make any successful API request.
In .env.local:
```json
{
	"name": "google/gemma-2-2b-it",
	"id": "google/gemma-2-2b-it",
	"chatPromptTemplate": "{{#each messages}}{{#ifUser}}<start_of_turn>user\n{{#if @first}}{{#if @root.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}}{{content}}<end_of_turn>\n<start_of_turn>model\n{{/ifUser}}{{#ifAssistant}}{{content}}<end_of_turn>\n{{/ifAssistant}}{{/each}}",
	"parameters": {
		"temperature": 0.1,
		"top_p": 0.95,
		"repetition_penalty": 1.2,
		"top_k": 50,
		"truncate": 1000,
		"max_new_tokens": 2048,
		"stop": ["<end_of_turn>"]
	},
	"endpoints": [
		{
			"type": "openai",
			"baseURL": "http://127.0.0.1:8000/v1"
		}
	]
}
```
and I always get the same response from the vLLM server:

```
ERROR 08-05 12:39:06 serving_chat.py:118] Error in applying chat template from request: System role not supported
INFO:     127.0.0.1:42142 - "POST /v1/chat/completions HTTP/1.1" 400 Bad Request
```
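To confirm the backend is rejecting the request rather than chat-ui, here is a minimal sketch of a direct request to the vLLM OpenAI-compatible endpoint (assuming Node 18+ for the global `fetch`):

```ts
// Any request containing a "system" message makes vLLM apply Gemma 2's
// chat template, which raises "System role not supported" -> HTTP 400.
const res = await fetch("http://127.0.0.1:8000/v1/chat/completions", {
	method: "POST",
	headers: { "Content-Type": "application/json" },
	body: JSON.stringify({
		model: "google/gemma-2-2b-it",
		messages: [
			{ role: "system", content: "You are a helpful assistant." }, // rejected
			{ role: "user", content: "Hello!" },
		],
	}),
});
console.log(res.status); // 400; drop the system message and it should succeed
```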
Does anyone know whether and how I should change the chat template or deactivate the system role? Is it a vLLM problem or a chat-ui problem?
Thank you!