使用 lmdeploy 部署 OpenGVLab/InternVL2-40B 需要多少显存 #2276

navono · 2024-08-09T08:06:16Z

navono
Aug 9, 2024

机器有 4 张 4090D，

from lmdeploy import pipeline, TurbomindEngineConfig, ChatTemplateConfig
from lmdeploy.vl import load_image


model = 'OpenGVLab/InternVL2-40B'

system_prompt = '我是书生·万象，英文名是InternVL，是由上海人工智能实验室、清华大学及多家合作单位联合开发的多模态大语言模型。'
chat_template_config = ChatTemplateConfig('internvl-zh-hermes2')
chat_template_config.meta_instruction = system_prompt


def chat(instruction, img):
    pipe = pipeline(model, chat_template_config=chat_template_config,
                    backend_config=TurbomindEngineConfig(session_len=8192, tp=4))
    response = pipe((instruction, img))
    print(response.text)


def main():
    image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
    chat('describe this image', image)


if __name__ == '__main__':
    main()

运行后，错误：
RuntimeError: handle_0 INTERNAL ASSERT FAILED at "../c10/cuda/driver_api.cpp":15, please report a bug to PyTorch.

nvidiam-smi 显示 4 张卡的显存均已满。

Answered by navono

Aug 12, 2024

降低 --cache-max-entry-count 参数可运行

View full answer

navono · 2024-08-09T08:12:26Z

navono
Aug 9, 2024
Author

正在下载 AWQ 版本

0 replies

navono · 2024-08-12T05:36:11Z

navono
Aug 12, 2024
Author

降低 --cache-max-entry-count 参数可运行

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

使用 lmdeploy 部署 OpenGVLab/InternVL2-40B 需要多少显存 #2276

{{title}}

Replies: 2 comments

{{title}}

{{title}}

Select a reply

使用 lmdeploy 部署 OpenGVLab/InternVL2-40B 需要多少显存 #2276

navono Aug 9, 2024

Replies: 2 comments

navono Aug 9, 2024 Author

navono Aug 12, 2024 Author

navono
Aug 9, 2024

navono
Aug 9, 2024
Author

navono
Aug 12, 2024
Author