LMDeploy Release V0.4.1

lvhan028 released this 07 May 08:20

14e9953

What's Changed

🚀 Features

Add colab demo by @AllentDan in #1428
support starcoder2 by @grimoire in #1468
support OpenGVLab/InternVL-Chat-V1-5 by @irexyc in #1490

💥 Improvements

variable CTA_H & fix qkv bias by @lzhangzz in #1491
refactor vision model loading by @irexyc in #1482
fix installation requirements for windows by @irexyc in #1531
Remove split batch inside pipeline inference function by @AllentDan in #1507
Remove first empty chunk for api_server by @AllentDan in #1527
add benchmark script to profile pipeline APIs by @lvhan028 in #1528
Add input validation by @AllentDan in #1525

🐞 Bug fixes

fix local variable 'response' referenced before assignment in async_engine.generate by @irexyc in #1513
Fix turbomind import in windows by @irexyc in #1533
Fix convert qwen2 to turbomind by @AllentDan in #1546
Adding api_key and model_name parameters to the restful benchmark by @NiuBlibing in #1478

📚 Documentation

update supported models for Baichuan by @zhyncs in #1485
Fix typo in w8a8.md by @Infinity4B in #1523
complete build.md by @YanxingLiu in #1508
update readme wechat qrcode by @vansin in #1529
Update docker docs for VL api by @vody-am in #1534
Format supported model table using html syntax by @lvhan028 in #1493
doc: add example of deploying api server to Kubernetes by @uzuku in #1488

🌐 Other

add modelscope and lora testcase by @zhulinJulia24 in #1506
bump version to v0.4.1 by @lvhan028 in #1544

New Contributors

@NiuBlibing made their first contribution in #1478
@Infinity4B made their first contribution in #1523
@YanxingLiu made their first contribution in #1508
@vody-am made their first contribution in #1534
@uzuku made their first contribution in #1488

Full Changelog: v0.4.0...v0.4.1

Contributors

grimoire, lvhan028, and 11 other contributors
