How to reproduce the table 19 (kvquant vs kivi) #14

condy0919 · 2024-08-21T03:53:20Z

Which checkpoint does LLaMA-2-7B-32K refer to: https://huggingface.co/togethercomputer/LLaMA-2-7B-32K or https://huggingface.co/togethercomputer/Llama-2-7B-32K-Instruct ?

I'm trying to run llama2-7b-32k with kvquant. With config.dynamicrope = False, the command

python llama.py modelname wikitext2 --abits 3 --seqlen 4096 --maxseqlen 4096 --quantizer-path ./quantizers-llama2-7b-32k.pickle --include_sparse --sparsity-threshold 0.99 --first_few_fp16 5

raised:

self.outliers[: self.klen] = outlier_vals
RuntimeError: The expanded size of the tensor (4096) must match the existing size (30545) at non-singleton dimension 0. Target sizes: [4096, 42]. Tensor sizes: [30545, 42]

With config.dynamicrope = True, the same command raised:

TypeError: LlamaRotaryEmbedding.forward() takes from 2 to 3 positional arguments but 4 were given
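
For reference, the message above has the shape of a plain call-signature mismatch: forward accepts self plus one or two positional arguments, but it was called with three. A minimal sketch in pure Python follows. The class below is a hypothetical stand-in, not the real transformers implementation, and the argument values are made up; only the signature shape is taken from the error text.

```python
class LlamaRotaryEmbedding:
    """Hypothetical stand-in, NOT the transformers class.

    Only the signature shape (self + one required + one optional positional)
    is modelled, because that is all the error message reveals.
    """

    def forward(self, x, seq_len=None):
        return x, seq_len


def call_with_extra_argument():
    """Call forward with one positional argument too many; return the error text."""
    rope = LlamaRotaryEmbedding()
    try:
        # self + 3 positionals = 4 given, while forward takes 2 to 3.
        rope.forward("x", 4096, "extra")
    except TypeError as exc:
        return str(exc)
    return None


message = call_with_extra_argument()
print(message)
```

If this is what happens here, the code path enabled by config.dynamicrope = True and the installed LlamaRotaryEmbedding presumably disagree about the forward signature. That is a guess based on the message alone, not something confirmed.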

How can I quantize llama2-7b-32k using kvquant?

Thanks.
