diff --git a/README.md b/README.md
index 61c0eba45b..e45e3b687e 100644
--- a/README.md
+++ b/README.md
@@ -162,6 +162,7 @@ For detailed inference benchmarks in more devices and more settings, please refe
 <li>Phi-3-vision (4.2B)</li>
 <li>Phi-3.5-vision (4.2B)</li>
 <li>GLM-4V (9B)</li>
+<li>Llama3.2-vision (11B, 90B)</li>
diff --git a/README_ja.md b/README_ja.md
index 999ebc9f0b..df4647d868 100644
--- a/README_ja.md
+++ b/README_ja.md
@@ -160,6 +160,7 @@ LMDeploy TurboMindエンジンは卓越した推論能力を持ち、さまざ
 <li>Phi-3-vision (4.2B)</li>
 <li>Phi-3.5-vision (4.2B)</li>
 <li>GLM-4V (9B)</li>
+<li>Llama3.2-vision (11B, 90B)</li>
diff --git a/README_zh-CN.md b/README_zh-CN.md
index f002899c60..160af6f694 100644
--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -163,6 +163,7 @@ LMDeploy TurboMind 引擎拥有卓越的推理能力,在各种规模的模型
 <li>Phi-3-vision (4.2B)</li>
 <li>Phi-3.5-vision (4.2B)</li>
 <li>GLM-4V (9B)</li>
+<li>Llama3.2-vision (11B, 90B)</li>
diff --git a/docs/en/supported_models/supported_models.md b/docs/en/supported_models/supported_models.md
index 260120efe0..620470a4de 100644
--- a/docs/en/supported_models/supported_models.md
+++ b/docs/en/supported_models/supported_models.md
@@ -20,6 +20,7 @@ The following tables detail the models supported by LMDeploy's TurboMind engine
 | Qwen1.5 | 1.8B - 110B | LLM | Yes | Yes | Yes | Yes |
 | Qwen2 | 1.5B - 72B | LLM | Yes | Yes | Yes | Yes |
 | Mistral | 7B | LLM | Yes | Yes | Yes | - |
+| Mixtral | 8x7B, 8x22B | LLM | Yes | Yes | Yes | - |
 | Qwen-VL | 7B | MLLM | Yes | Yes | Yes | Yes |
 | DeepSeek-VL | 7B | MLLM | Yes | Yes | Yes | Yes |
 | Baichuan | 7B | LLM | Yes | Yes | Yes | Yes |
@@ -60,7 +61,7 @@ The TurboMind engine doesn't support window attention. Therefore, for models tha
 | Falcon | 7B - 180B | LLM | Yes | Yes | Yes | No | No |
 | YI | 6B - 34B | LLM | Yes | Yes | Yes | No | Yes |
 | Mistral | 7B | LLM | Yes | Yes | Yes | No | No |
-| Mixtral | 8x7B | LLM | Yes | Yes | Yes | No | No |
+| Mixtral | 8x7B, 8x22B | LLM | Yes | Yes | Yes | No | No |
 | QWen | 1.8B - 72B | LLM | Yes | Yes | Yes | No | Yes |
 | QWen1.5 | 0.5B - 110B | LLM | Yes | Yes | Yes | No | Yes |
 | QWen1.5-MoE | A2.7B | LLM | Yes | Yes | Yes | No | No |
diff --git a/docs/zh_cn/supported_models/supported_models.md b/docs/zh_cn/supported_models/supported_models.md
index 26930cf3ce..d47909f541 100644
--- a/docs/zh_cn/supported_models/supported_models.md
+++ b/docs/zh_cn/supported_models/supported_models.md
@@ -20,6 +20,7 @@
 | Qwen1.5 | 1.8B - 110B | LLM | Yes | Yes | Yes | Yes |
 | Qwen2 | 1.5B - 72B | LLM | Yes | Yes | Yes | Yes |
 | Mistral | 7B | LLM | Yes | Yes | Yes | - |
+| Mixtral | 8x7B, 8x22B | LLM | Yes | Yes | Yes | - |
 | Qwen-VL | 7B | MLLM | Yes | Yes | Yes | Yes |
 | DeepSeek-VL | 7B | MLLM | Yes | Yes | Yes | Yes |
 | Baichuan | 7B | LLM | Yes | Yes | Yes | Yes |
@@ -60,7 +61,7 @@ turbomind 引擎不支持 window attention。所以,对于应用了 window att
 | Falcon | 7B - 180B | LLM | Yes | Yes | Yes | No | No |
 | YI | 6B - 34B | LLM | Yes | Yes | Yes | No | Yes |
 | Mistral | 7B | LLM | Yes | Yes | Yes | No | No |
-| Mixtral | 8x7B | LLM | Yes | Yes | Yes | No | No |
+| Mixtral | 8x7B, 8x22B | LLM | Yes | Yes | Yes | No | No |
 | QWen | 1.8B - 72B | LLM | Yes | Yes | Yes | No | Yes |
 | QWen1.5 | 0.5B - 110B | LLM | Yes | Yes | Yes | No | Yes |
 | QWen1.5-MoE | A2.7B | LLM | Yes | Yes | Yes | No | No |
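Not part of the patch itself: each row in the supported-models tables above encodes per-model capability flags as pipe-separated cells. A minimal Python sketch of reading one such row into a dict; the column names below are assumed from the docs' table layout (FP16/BF16, KV INT8, KV INT4, W8A8) and are illustrative, not authoritative:

```python
# Hypothetical column names, assumed from the docs' table layout; the
# second (TurboMind) table in the patch uses a different flag set.
COLUMNS = ["model", "size", "type", "fp16_bf16", "kv_int8", "kv_int4", "w8a8"]

def parse_row(row: str) -> dict:
    """Split one Markdown table row into a {column: cell} dict."""
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    return dict(zip(COLUMNS, cells))

# The Mixtral row added by this patch:
info = parse_row("| Mixtral | 8x7B, 8x22B | LLM | Yes | Yes | Yes | - |")
print(info["model"], info["size"], info["kv_int4"])  # Mixtral 8x7B, 8x22B Yes
```

A `-` cell, as in the W8A8 column here, means the flag is not applicable or not listed for that model.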