This repository contains an implementation of our EMNLP 2023 paper "Z-FOLD: A Frustratingly Easy Post-Training Quantization Scheme for LLMs".
You can reproduce the experiments using the scripts provided below.
- Setup Environment
```bash
pip install -r requirements.txt
```
All experiments were run on a single 80GB NVIDIA A100.
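Before launching a run, a quick sanity check like the one below can confirm that PyTorch sees the GPU. This snippet is illustrative only (it assumes PyTorch is installed via requirements.txt) and is not part of the repository scripts.

```python
# Sanity check: confirm PyTorch is installed and a CUDA GPU (e.g., an 80GB A100) is visible.
import torch

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}, {props.total_memory / 1024**3:.0f} GiB")
```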
- OPT Model Quantization
```bash
# model: facebook/opt-125m, facebook/opt-350m, facebook/opt-1.3b, facebook/opt-2.7b, ...
python3 opt.py --model facebook/opt-125m --wbits 4 --use-hessian --use-zfold
python3 opt.py --model facebook/opt-125m --wbits 3 --use-hessian --use-zfold
python3 opt.py --model facebook/opt-125m --wbits 2 --use-hessian --use-zfold
```
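For reference, the sketch below shows one common way to measure the FP16 baseline perplexity of facebook/opt-125m on WikiText-2 with Hugging Face transformers and datasets, so the quantized results can be compared against an unquantized baseline. It is an illustrative example only; the evaluation protocol, sequence length, and data split used inside opt.py may differ.

```python
# Illustrative FP16 baseline perplexity on WikiText-2 (protocol may differ from opt.py).
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "facebook/opt-125m"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).cuda().eval()

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
ids = tok(text, return_tensors="pt").input_ids.cuda()

seqlen, nlls = 2048, []
with torch.no_grad():
    for i in range(ids.shape[1] // seqlen):
        batch = ids[:, i * seqlen : (i + 1) * seqlen]
        loss = model(batch, labels=batch).loss  # mean NLL over this window
        nlls.append(loss.float() * seqlen)

print("ppl:", torch.exp(torch.stack(nlls).sum() / (len(nlls) * seqlen)).item())
```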
- BLOOM Model Quantization
```bash
# model: bigscience/bloom-560m, bigscience/bloom-1b7, bigscience/bloom-3b, bigscience/bloom-7b1, ...
python3 bloom.py --model bigscience/bloom-560m --wbits 4 --use-hessian --use-zfold
python3 bloom.py --model bigscience/bloom-560m --wbits 3 --use-hessian --use-zfold
python3 bloom.py --model bigscience/bloom-560m --wbits 2 --use-hessian --use-zfold
```
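Since the same command is repeated for each bit width, a small driver like the following can sweep the configurations shown above. It is an illustrative sketch, not part of the repository; it simply shells out to bloom.py with the flags listed.

```python
# Illustrative sweep over bit widths for a single BLOOM model, calling bloom.py as shown above.
import subprocess

for wbits in (4, 3, 2):
    subprocess.run(
        ["python3", "bloom.py",
         "--model", "bigscience/bloom-560m",
         "--wbits", str(wbits),
         "--use-hessian", "--use-zfold"],
        check=True,  # stop the sweep if any run fails
    )
```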
- LLaMA Model Quantization
```bash
# ${MODEL_DIR}: path to a local LLaMA checkpoint directory
python3 llama.py --model ${MODEL_DIR} --wbits 4 --act-order --use-hessian --use-zfold
python3 llama.py --model ${MODEL_DIR} --wbits 3 --act-order --use-hessian --use-zfold
python3 llama.py --model ${MODEL_DIR} --wbits 2 --act-order --use-hessian --use-zfold
```
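Before launching llama.py, it can be worth confirming that ${MODEL_DIR} points to a loadable checkpoint. The snippet below is a hedged sketch that assumes the weights are stored in Hugging Face format; the path is a placeholder, and llama.py itself may load the model differently.

```python
# Illustrative check that MODEL_DIR holds a Hugging Face-format LLaMA checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_DIR = "/path/to/llama-7b-hf"  # hypothetical local path; adjust to your setup
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(MODEL_DIR, torch_dtype=torch.float16)
print(model.config.model_type, "-", model.config.num_hidden_layers, "layers loaded")
```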
If you find this work useful for your research, please cite our paper:
```bibtex
@inproceedings{jeon2023zfold,
  author    = "Jeon, Yongkweon and Lee, Chungman and Park, Kyungphil and Kim, Ho-young",
  title     = "A Frustratingly Easy Post-Training Quantization Scheme for {LLM}s",
  booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
  year      = "2023",
  url       = "https://aclanthology.org/2023.emnlp-main.892",
}
```
This project is released under the MIT License.