Is LLAVA chat template correct? #5489

Open
mibejjh opened this issue Sep 20, 2024 · 0 comments
Labels
pending This problem is yet to be addressed

Comments


mibejjh commented Sep 20, 2024

Reminder

  • I have read the README and searched the existing issues.

System Info

llamafactory 0.9.0
python 3.11.9

Reproduction

Export any finetuned LLaVA-based model from the WebUI using the llava template.

Expected behavior

In tokenizer_config.json in the exported model, chat_template looks like the following, but there is no image-related keyword in it at all. It looks like vicuna's template.

"chat_template": "{% set system_message = 'A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user\\'s questions.' %}{% if messages[0]['role'] == 'system' %}{% set loop_messages = messages[1:] %}{% set system_message = messages[0]['content'] %}{% else %}{% set loop_messages = messages %}{% endif %}{% if system_message is defined %}{{ system_message }}{% endif %}{% for message in loop_messages %}{% set content = message['content'] %}{% if message['role'] == 'user' %}{{ 'USER: ' + content + ' ASSISTANT:' }}{% elif message['role'] == 'assistant' %}{{ content + '</s>' }}{% endif %}{% endfor %}",

When I checked the chat templates at line 76 of src/llamafactory/train/tuner.py (in export_model()), there are both a tokenizer and a processor, and they have different chat templates.
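A quick way to reproduce the mismatch outside the trainer (minimal sketch; again the path is a placeholder):

```python
from transformers import AutoProcessor, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/exported_model")
processor = AutoProcessor.from_pretrained("path/to/exported_model")
print(tokenizer.chat_template == processor.chat_template)  # False in my case
```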

tokenizer.chat_template, which is loaded via get_template_and_fix_tokenizer(), is the same as above, while processor.chat_template is totally different, as shown below. I believe the processor's chat template could be the correct one.

"{% for message in messages %}{% if message['role'] != 'system' %}{{ message['role'].upper() + ': '}}{% endif %}{# Render all images first #}{% for content in message['content'] | selectattr('type', 'equalto', 'image') %}{{ '<image>\n' }}{% endfor %}{# Render all text next #}{% if message['role'] != 'assistant' %}{% for content in message['content'] | selectattr('type', 'equalto', 'text') %}{{ content['text'] + ' '}}{% endfor %}{% else %}{% for content in message['content'] | selectattr('type', 'equalto', 'text') %}{% generation %}{{ content['text'] + ' '}}{% endgeneration %}{% endfor %}{% endif %}{% endfor %}{% if add_generation_prompt %}{{ 'ASSISTANT:' }}{% endif %}"

I'm not sure which chat template is used during training and which one is used for inference.

Others

No response
