Releases · NeoZhangJianyu/llama.cpp
b2716
[SYCL] Windows default build instructions without -DLLAMA_SYCL_F16 fl…
b2688
convert : fix autoawq gemma (#6704) * fix autoawq quantized gemma model convert error: quantizing a gemma model with autoawq adds an lm_head.weight tensor to model-00001-of-00002.safetensors, which convert-hf-to-gguf.py cannot map; skipping this tensor prevents the error. * change the check to a full string match and print a short message informing users that lm_head.weight has been skipped. --------- Co-authored-by: Zheng.Deng <[email protected]>
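For illustration only, a minimal sketch of the skip logic described in this note, assuming tensors are handled as a plain name-to-array mapping; the function and variable names below are placeholders, not the actual code in convert-hf-to-gguf.py:

```python
# Sketch of skipping lm_head.weight during conversion (illustrative only;
# the collector function and the sample tensor dict are assumptions).
import numpy as np

def collect_tensors(model_tensors: dict[str, np.ndarray]) -> dict[str, np.ndarray]:
    out = {}
    for name, data in model_tensors.items():
        # Full string match (not a substring check), so similarly named
        # tensors are not accidentally dropped.
        if name == "lm_head.weight":
            print("skipping tensor 'lm_head.weight' (added by autoawq, cannot be mapped)")
            continue
        out[name] = data
    return out

if __name__ == "__main__":
    tensors = {
        "model.embed_tokens.weight": np.zeros((4, 4), dtype=np.float32),
        "lm_head.weight": np.zeros((4, 4), dtype=np.float32),
    }
    print(sorted(collect_tensors(tensors)))  # lm_head.weight is gone
```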
b2682
gritlm : add --outdir option to hf.sh script (#6699) This commit updates the hf.sh script usage to include the --outdir option and specifies the models directory as the output directory. The motivation for this is to avoid cluttering the root directory with model files. Signed-off-by: Daniel Bevenius <[email protected]>
b2675
fix mul_mat_id() for new input, make the unit test pass (#6682)
b2674
llama : add missing kv clear in llama_beam_search (#6664)
b2647
gguf : add option to not check tensor data (#6582) This commit adds an option to the gguf example to not check the tensor data. The motivation for this is that it can be nice to use the gguf tool to read other .gguf files that were not created by the gguf tool. Signed-off-by: Daniel Bevenius <[email protected]>
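The note above concerns the C++ gguf example in llama.cpp. As a rough Python analogue only, the sketch below reads a .gguf file's metadata with the gguf-py package (shipped under gguf-py/ in the repository) without reading or validating tensor data; the file path and the fields printed are assumptions.

```python
# Rough analogue of "read a .gguf file without checking tensor data",
# using the gguf-py package from the llama.cpp repository.
# "models/example.gguf" is a placeholder path.
from gguf import GGUFReader

reader = GGUFReader("models/example.gguf")

# Print key/value metadata and per-tensor name/shape only; the tensor
# contents themselves are never accessed here.
for key in reader.fields:
    print("field:", key)
for tensor in reader.tensors:
    print("tensor:", tensor.name, list(tensor.shape))
```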
b2634
llama : fix attention layer count sanity check (#6550) * llama : fix attention layer count sanity check * llama : fix parentheses in attention layer count sanity check There was otherwise a warning when compiling. --------- Co-authored-by: Francis Couture-Harpin <[email protected]>
b2581
ci: bench: fix Resource not accessible by integration on PR event (#6…
b2554
[SYCL] fix set main gpu crash (#6339)
b2543
server: public: use relative routes for static files (#6325) server: public: support custom `api_url`, default to relative base path