Release v0.1.6 · vectorch-ai/ScaleLLM

What's Changed

allow deploy docs when triggered on demand by @guocuimi in #253
[model] support vision language model llava. by @liutongxuan in #178
dev: fix issues in run_in_docker script by @guocuimi in #254
dev: added cuda 12.4 build support by @guocuimi in #255
build: fix multiple definition issue by @guocuimi in #256
fix: check against num_tokens instead of num_prompt_tokens for shared blocks by @guocuimi in #257
bugfix: fix invalid max_cache_size when device is cpu. by @liutongxuan in #259
ci: fail test if not all tests were passed successfully by @guocuimi in #263
Revert "[model] support vision language model llava. (#178)" by @guocuimi in #262

Full Changelog: v0.1.5...v0.1.6