LMDeploy Release V0.4.1
What's Changed
🚀 Features
- Add colab demo by @AllentDan in #1428
- support starcoder2 by @grimoire in #1468
- support OpenGVLab/InternVL-Chat-V1-5 by @irexyc in #1490
💥 Improvements
- variable `CTA_H` & fix qkv bias by @lzhangzz in #1491
- refactor vision model loading by @irexyc in #1482
- fix installation requirements for windows by @irexyc in #1531
- Remove split batch inside pipeline inference function by @AllentDan in #1507
- Remove first empty chunk for api_server by @AllentDan in #1527
- add benchmark script to profile pipeline APIs by @lvhan028 in #1528
- Add input validation by @AllentDan in #1525
🐞 Bug fixes
- fix local variable 'response' referenced before assignment in async_engine.generate by @irexyc in #1513
- Fix turbomind import in windows by @irexyc in #1533
- Fix convert qwen2 to turbomind by @AllentDan in #1546
- Adding api_key and model_name parameters to the restful benchmark by @NiuBlibing in #1478
📚 Documentations
- update supported models for Baichuan by @zhyncs in #1485
- Fix typo in w8a8.md by @Infinity4B in #1523
- complete build.md by @YanxingLiu in #1508
- update readme wechat qrcode by @vansin in #1529
- Update docker docs for VL api by @vody-am in #1534
- Format supported model table using html syntax by @lvhan028 in #1493
- doc: add example of deploying api server to Kubernetes by @uzuku in #1488
🌐 Other
- add modelscope and lora testcase by @zhulinJulia24 in #1506
- bump version to v0.4.1 by @lvhan028 in #1544
New Contributors
- @NiuBlibing made their first contribution in #1478
- @Infinity4B made their first contribution in #1523
- @YanxingLiu made their first contribution in #1508
- @vody-am made their first contribution in #1534
- @uzuku made their first contribution in #1488
Full Changelog: v0.4.0...v0.4.1