# LMDeploy Release V0.5.3

## What's Changed

### 🚀 Features
- PyTorch Engine AWQ support by @grimoire in #1913 (see the sketch after this list)
- Phi-3 AWQ support by @grimoire in #1984
- Fix chunked prefill by @lzhangzz in #2201
- Support VLMs that use Qwen as the language model by @irexyc in #2207
### 💥 Improvements
- Support specifying a prefix of the assistant response by @AllentDan in #2172 (see the sketch after this list)
- Strict check for `name_map` in `InternLM2Chat7B` by @SamuraiBUPT in #2156
- Check errors for attention kernels by @lzhangzz in #2206
- Update the base image in the dockerfile to support CUDA 12.4 by @RunningLeon in #2182
- Stop synchronizing for `length_criterion` by @lzhangzz in #2202
- Adapt to the new MiniCPM-Llama3-V-2_5 code by @irexyc in #2139
- Remove duplicate code by @cmpute in #2133
### 🐞 Bug fixes
- [Hotfix] Add missing parentheses when calculating the coefficient of the Llama 3 RoPE by @lvhan028 in #2157
- Support logit softcap by @grimoire in #2158
- Fix gmem to smem WAW conflict in awq gemm kernel by @foreverrookie in #2111
- Fix gradio serve using a wrong chat template by @AllentDan in #2131
- Fix a runtime error when using dynamic scale rotary embedding for InternLM2… by @CyCle1024 in #2212
- Add peer-access-enabled allocator by @lzhangzz in #2218
- Fix typos in profile_generation.py by @jiajie-yang in #2233
### 📚 Documentation
- docs: fix Qwen typo by @ArtificialZeng in #2136
- Fix a wrong expression by @ArtificialZeng in #2165
- Clarify the model type (LLM or MLLM) in the supported-model matrix by @lvhan028 in #2209
- docs: add Japanese README by @eltociear in #2237
### 🌐 Other
- Bump version to 0.5.2.post1 by @lvhan028 in #2159
- Update news about the cooperation with modelscope/swift by @lvhan028 in #2200
- Bump version to v0.5.3 by @lvhan028 in #2242
## New Contributors
- @ArtificialZeng made their first contribution in #2136
- @foreverrookie made their first contribution in #2111
- @SamuraiBUPT made their first contribution in #2156
- @CyCle1024 made their first contribution in #2212
- @jiajie-yang made their first contribution in #2233
- @cmpute made their first contribution in #2133
**Full Changelog**: v0.5.2...v0.5.3