v0.1.6
What's Changed
- allow deploy docs when triggered on demand by @guocuimi in #253
- [model] support vision language model llava. by @liutongxuan in #178
- dev: fix issues in run_in_docker script by @guocuimi in #254
- dev: added cuda 12.4 build support by @guocuimi in #255
- build: fix multiple definition issue by @guocuimi in #256
- fix: check against num_tokens instead of num_prompt_tokens for shared blocks by @guocuimi in #257
- bugfix: fix invalid max_cache_size when device is cpu. by @liutongxuan in #259
- ci: fail test if not all tests were passed successfully by @guocuimi in #263
- Revert "[model] support vision language model llava. (#178)" by @guocuimi in #262
Full Changelog: v0.1.5...v0.1.6