
[Feature]: support LlavaForConditionalGeneration with turbomind inference #2710

Merged: 9 commits merged into InternLM:main on Nov 8, 2024

Conversation

@deepindeed2022 (Contributor) commented on Nov 5, 2024

Motivation

  • Support turbomind inference for llava_interleave_qwen2_7b_hf on the main branch
  • Fix a bug in the `tune` step for multimodal models

Modification

  • Adapt model loading
  • Add a test example (a sketch follows this list)
  • Adapt the script that generates gemm_config.ini
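
As a rough illustration of the test example, here is a minimal offline-pipeline sketch using lmdeploy's public VLM API; the model path and image URL are placeholders, not taken from this PR:

# Minimal sketch: run the newly supported model with the turbomind
# backend through lmdeploy's offline pipeline. The model path and
# image URL are illustrative placeholders.
from lmdeploy import pipeline, TurbomindEngineConfig
from lmdeploy.vl import load_image

pipe = pipeline('llava_hf/llava_interleave_qwen2_7b_hf',
                backend_config=TurbomindEngineConfig(tp=1))
image = load_image('https://example.com/demo.jpg')
response = pipe(('Describe this image.', image))
print(response.text)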

Use cases (Optional)

lmdeploy serve api_server llava_hf/llava_interleave_qwen2_7b_hf
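
The server exposes an OpenAI-compatible HTTP API. A minimal client sketch, assuming lmdeploy's default port 23333 and a placeholder image URL:

# Query the OpenAI-compatible /v1/chat/completions endpoint exposed by
# `lmdeploy serve api_server`. Host/port assume the lmdeploy default
# (0.0.0.0:23333); the image URL is a placeholder.
import requests

resp = requests.post(
    'http://0.0.0.0:23333/v1/chat/completions',
    json={
        'model': 'llava_hf/llava_interleave_qwen2_7b_hf',
        'messages': [{
            'role': 'user',
            'content': [
                {'type': 'text', 'text': 'Describe this image.'},
                {'type': 'image_url',
                 'image_url': {'url': 'https://example.com/demo.jpg'}},
            ],
        }],
    },
)
print(resp.json()['choices'][0]['message']['content'])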

@lvhan028 (Collaborator) commented on Nov 5, 2024

Please resolve the linting error by running:

pip install pre-commit
cd lmdeploy # the root directory of lmdeploy repo
pre-commit install
pre-commit run --all-files

@lvhan028 requested review from AllentDan and irexyc on Nov 5, 2024
@lvhan028 added the enhancement (New feature or request) label on Nov 5, 2024
Resolved review threads on: examples/python/README.md, docs/en/multi_modal/llava_qwen.md (two threads)
@AllentDan (Collaborator) left a comment:

Please resolve the conflicts

Resolved review threads on: lmdeploy/turbomind/generate_gemm_config.py, lmdeploy/turbomind/supported_models.py (two threads)
@deepindeed2022 changed the title from "[Feature]: support llava qwen2 with turbomind inference" to "[Feature]: support LlavaForConditionalGeneration with turbomind inference" on Nov 8, 2024
@irexyc (Collaborator) commented on Nov 8, 2024

Reading parameters from config.json may not be ideal, especially for an architecture like this that fuses several models: many parameters are omitted there, and forcing default values for them raises the risk of errors. A follow-up PR can switch to a better approach later.
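
To make the concern concrete, a hypothetical sketch (not code from this PR) of how omitted keys in a fused-model config can be silently masked by defaults:

# Hypothetical: config.json of a fused model (e.g. Llava) nests the
# LLM parameters under `text_config`; keys omitted there fall back to
# guessed defaults, which can quietly produce a wrong engine config.
import json

with open('config.json') as f:
    cfg = json.load(f)

text_cfg = cfg.get('text_config', {})
# If num_key_value_heads is absent, 32 is only a guess; a wrong guess
# breaks turbomind's derived KV layout without any immediate error.
kv_heads = text_cfg.get('num_key_value_heads', 32)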

@AllentDan (Collaborator) left a comment:

Tested OK with llava-interleave-qwen-7b-hf

@lvhan028 (Collaborator) commented on Nov 8, 2024

The pr_ete_test workflow is being rerun. It hit an OOM issue somehow.

@lvhan028 merged commit 78ab485 into InternLM:main on Nov 8, 2024
5 checks passed
AllentDan pushed a commit to AllentDan/lmdeploy that referenced this pull request Nov 13, 2024
…ence (InternLM#2710)

* feat: support llava_qwen2 for fp16 and awq

* update generate gemm config script for VLM

* lint: fix lint warning

* doc: presenting the usage in the user guide

* resolve conflict issue and refactor for better design

* fix and doc:
- fix tune attribute error
- add chinese llava doc

* keep LlavaLlamaForCausalLM/LlavaMistralForCausalLM to llama

* fix attn_bias default value
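
For reference, the gemm tuning script updated in this PR is typically invoked like this (a sketch following lmdeploy's documented usage; the flag values and model path are illustrative):

python3 -m lmdeploy.turbomind.generate_gemm_config --tensor-para-size 1 --max-batch-size 4 --model-path ./llava_interleave_qwen2_7b_hf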