
[New Model]: Phi-4 Multimodal Instruct #13936

Closed · lhcavalcanti opened this issue Feb 27, 2025 · 6 comments · Fixed by #14119

Labels: new model (Requests to new models)

Comments

@lhcavalcanti

The model to consider.

New Phi 4 Multimodal: https://huggingface.co/microsoft/Phi-4-multimodal-instruct

The closest model vllm already supports.

https://docs.vllm.ai/en/latest/models/supported_models.html#list-of-multimodal-language-models

What's your difficulty in supporting the model you want?

No response

Before submitting a new issue...

  • Make sure you have already searched for relevant issues and asked the chatbot at the bottom right corner of the documentation page, which can answer many frequently asked questions.
lhcavalcanti added the new model (Requests to new models) label Feb 27, 2025
@lhcavalcanti (Author) commented Feb 27, 2025

Error when running it with vLLM 0.7.3:

INFO 02-27 01:39:00 model_runner.py:1110] Starting to load model /models/Phi-4-multimodal-instruct...
/opt/miniconda/envs/python39/lib/python3.9/site-packages/transformers/models/auto/image_processing_auto.py:594: FutureWarning: The image_processor_class argument is deprecated and will be removed in v4.42. Please use `slow_image_processor_class`, or `fast_image_processor_class` instead
  warnings.warn(
[rank0]: Traceback (most recent call last):
[rank0]:   File "/code/score.py", line 250, in <module>
[rank0]:     engine = AsyncLLMEngine.from_engine_args(engine_args, stat_loggers={"Geneva": GenevaStatsLogger()})
[rank0]:   File "/opt/miniconda/envs/python39/lib/python3.9/site-packages/vllm/engine/async_llm_engine.py", line 644, in from_engine_args
[rank0]:     engine = cls(
[rank0]:   File "/opt/miniconda/envs/python39/lib/python3.9/site-packages/vllm/engine/async_llm_engine.py", line 594, in __init__
[rank0]:     self.engine = self._engine_class(*args, **kwargs)
[rank0]:   File "/opt/miniconda/envs/python39/lib/python3.9/site-packages/vllm/engine/async_llm_engine.py", line 267, in __init__
[rank0]:     super().__init__(*args, **kwargs)
[rank0]:   File "/opt/miniconda/envs/python39/lib/python3.9/site-packages/vllm/engine/llm_engine.py", line 273, in __init__
[rank0]:     self.model_executor = executor_class(vllm_config=vllm_config, )
[rank0]:   File "/opt/miniconda/envs/python39/lib/python3.9/site-packages/vllm/executor/executor_base.py", line 52, in __init__
[rank0]:     self._init_executor()
[rank0]:   File "/opt/miniconda/envs/python39/lib/python3.9/site-packages/vllm/executor/uniproc_executor.py", line 47, in _init_executor
[rank0]:     self.collective_rpc("load_model")
[rank0]:   File "/opt/miniconda/envs/python39/lib/python3.9/site-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
[rank0]:     answer = run_method(self.driver_worker, method, args, kwargs)
[rank0]:   File "/opt/miniconda/envs/python39/lib/python3.9/site-packages/vllm/utils.py", line 2196, in run_method
[rank0]:     return func(*args, **kwargs)
[rank0]:   File "/opt/miniconda/envs/python39/lib/python3.9/site-packages/vllm/worker/worker.py", line 183, in load_model
[rank0]:     self.model_runner.load_model()
[rank0]:   File "/opt/miniconda/envs/python39/lib/python3.9/site-packages/vllm/worker/model_runner.py", line 1112, in load_model
[rank0]:     self.model = get_model(vllm_config=self.vllm_config)
[rank0]:   File "/opt/miniconda/envs/python39/lib/python3.9/site-packages/vllm/model_executor/model_loader/__init__.py", line 14, in get_model
[rank0]:     return loader.load_model(vllm_config=vllm_config)
[rank0]:   File "/opt/miniconda/envs/python39/lib/python3.9/site-packages/vllm/model_executor/model_loader/loader.py", line 406, in load_model
[rank0]:     model = _initialize_model(vllm_config=vllm_config)
[rank0]:   File "/opt/miniconda/envs/python39/lib/python3.9/site-packages/vllm/model_executor/model_loader/loader.py", line 115, in _initialize_model
[rank0]:     model_class, _ = get_model_architecture(model_config)
[rank0]:   File "/opt/miniconda/envs/python39/lib/python3.9/site-packages/vllm/model_executor/model_loader/utils.py", line 106, in get_model_architecture
[rank0]:     architectures = resolve_transformers_fallback(model_config,
[rank0]:   File "/opt/miniconda/envs/python39/lib/python3.9/site-packages/vllm/model_executor/model_loader/utils.py", line 60, in resolve_transformers_fallback
[rank0]:     auto_modules = {
[rank0]:   File "/opt/miniconda/envs/python39/lib/python3.9/site-packages/vllm/model_executor/model_loader/utils.py", line 61, in <dictcomp>
[rank0]:     name: get_class_from_dynamic_module(module, model_config.model)
[rank0]:   File "/opt/miniconda/envs/python39/lib/python3.9/site-packages/transformers/dynamic_module_utils.py", line 536, in get_class_from_dynamic_module
[rank0]:     module_file, class_name = class_reference.split(".")
[rank0]: ValueError: not enough values to unpack (expected 2, got 1)
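
For context, the crash happens inside Transformers' get_class_from_dynamic_module, which expects a class reference of the form "module_file.ClassName"; when vLLM's Transformers fallback hands it a reference without a module prefix, the single split cannot be unpacked into two names. A minimal sketch of that failure mode (the reference value below is hypothetical, not taken from the actual config):

# Sketch of the failure mode only, not vLLM code; the value is hypothetical.
class_reference = "Phi4MMForCausalLM"  # no "module_file." prefix to split on
module_file, class_name = class_reference.split(".")
# ValueError: not enough values to unpack (expected 2, got 1)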

@DarkLight1337 (Member)

Support is coming very soon!

@lhcavalcanti (Author)

Thanks for the update @DarkLight1337. Do you have any ETA / rough estimate? Thank you!

@DarkLight1337 (Member) commented Mar 3, 2025

We now have an official PR #14119 under review; feel free to check it out!

@congcongchen123 (Contributor)

Feel free to check out the PR description here for steps on (both sketched below):
1. Starting the server with the base model and the vision/speech LoRA weights.
2. Sending requests to the OpenAI-compatible server.
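
For later readers, a minimal client-side sketch. It assumes the server was started roughly as in the PR description, with the vision/speech adapters registered via --lora-modules; the flag values, adapter paths, and the module names "vision"/"speech" are illustrative, not confirmed in this thread:

# Assumed server launch, loosely following the PR description (illustrative):
#   vllm serve microsoft/Phi-4-multimodal-instruct --trust-remote-code \
#     --enable-lora --max-lora-rank 320 \
#     --lora-modules speech=<path>/speech-lora vision=<path>/vision-lora
from openai import OpenAI

# Point the standard OpenAI client at vLLM's OpenAI-compatible endpoint.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Naming a LoRA module as the model routes the request through that adapter.
chat = client.chat.completions.create(
    model="vision",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url", "image_url": {"url": "https://example.com/sample.jpg"}},
        ],
    }],
    max_tokens=128,
)
print(chat.choices[0].message.content)

Audio requests would target the "speech" module analogously.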

@congcongchen123 (Contributor)

We also found that the LoRA (Punica) kernels are quite slow for Phi-4-multimodal-instruct. Here’s the fix: PR #14272. With this fix, we observed up to a 5x improvement in generation speed.
