Releases · NeoZhangJianyu/llama.cpp
b2716
[SYCL] Windows default build instructions without -DLLAMA_SYCL_F16 fl…
b2688
convert : fix autoawq gemma (#6704) * fix autoawq quantized gemma model convert error: quantizing a gemma model with autoawq adds an lm_head.weight tensor to model-00001-of-00002.safetensors, which convert-hf-to-gguf.py cannot map; skipping this tensor prevents the error. * change the check to a full string match and print a short message informing users that lm_head.weight has been skipped. --------- Co-authored-by: Zheng.Deng <[email protected]>
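For illustration only, a minimal sketch of the skip logic described in this note, assuming tensors are handled as a plain name-to-array mapping; the function and variable names below are placeholders, not the actual code in convert-hf-to-gguf.py:

```python
# Sketch of skipping lm_head.weight during conversion (illustrative only;
# the collector function and the sample tensor dict are assumptions).
import numpy as np

def collect_tensors(model_tensors: dict[str, np.ndarray]) -> dict[str, np.ndarray]:
    out = {}
    for name, data in model_tensors.items():
        # Full string match (not a substring check), so similarly named
        # tensors are not accidentally dropped.
        if name == "lm_head.weight":
            print("skipping tensor 'lm_head.weight' (added by autoawq, cannot be mapped)")
            continue
        out[name] = data
    return out

if __name__ == "__main__":
    tensors = {
        "model.embed_tokens.weight": np.zeros((4, 4), dtype=np.float32),
        "lm_head.weight": np.zeros((4, 4), dtype=np.float32),
    }
    print(sorted(collect_tensors(tensors)))  # lm_head.weight is gone
```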
b2682
gritlm : add --outdir option to hf.sh script (#6699) This commit updates the hf.sh script usage to include the --outdir option and specifies the models directory as the output directory. The motivation for this is to avoid cluttering the root directory with model files. Signed-off-by: Daniel Bevenius <[email protected]>
b2675
fix mul_mat_id() for new input, make the unit test pass (#6682)
b2674
llama : add missing kv clear in llama_beam_search (#6664)
b2647
gguf : add option to not check tensor data (#6582) This commit adds an option to the gguf example to not check the tensor data. The motivation for this is that it can be nice to use the gguf tool to read other .gguf files that were not created by the gguf tool. Signed-off-by: Daniel Bevenius <[email protected]>
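The note above concerns the C++ gguf example in llama.cpp. As a rough Python analogue only, the sketch below reads a .gguf file's metadata with the gguf-py package (shipped under gguf-py/ in the repository) without reading or validating tensor data; the file path and the fields printed are assumptions.

```python
# Rough analogue of "read a .gguf file without checking tensor data",
# using the gguf-py package from the llama.cpp repository.
# "models/example.gguf" is a placeholder path.
from gguf import GGUFReader

reader = GGUFReader("models/example.gguf")

# Print key/value metadata and per-tensor name/shape only; the tensor
# contents themselves are never accessed here.
for key in reader.fields:
    print("field:", key)
for tensor in reader.tensors:
    print("tensor:", tensor.name, list(tensor.shape))
```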
b2634
llama : fix attention layer count sanity check (#6550) * llama : fix attention layer count sanity check * llama : fix parentheses in attention layer count sanity check There was otherwise a warning when compiling. --------- Co-authored-by: Francis Couture-Harpin <[email protected]>
b2581
ci: bench: fix Resource not accessible by integration on PR event (#6…
b2554
[SYCL] fix set main gpu crash (#6339)
b2543
server: public: use relative routes for static files (#6325) server: public: support custom `api_url`, default to relative base path