Skip to content

Latest commit

 

History

History
57 lines (43 loc) · 1.21 KB

README.md

File metadata and controls

57 lines (43 loc) · 1.21 KB

Quantizers

Quantizers is a library that provides an easy-to-use interface for quantizing LLMs into various formats by using YAML configs.

Supported Operating Systems

  • Linux
  • Windows
  • macOS

Supported Quantizations

  • GGUF
  • ExLlamaV2
  • GPTQ
  • AWQ
  • AQLM
  • QuIP
  • QuIP#
  • HQQ
  • HQQ+
  • SqueezeLLM
  • Marlin
  • EETQ
  • SmoothQuant
  • Bitsandbytes
  • TensorRT-LLM

Installation

To get started, clone the repo recursively:

git clone https://github.com/PygmalionAI/quantizers.git
cd quantizers
git submodule update --init --recursive
python3 -m pip install -e .
python3 -m pip install -r requirements.txt

To build with GPU support (currently for imatrix only), run this instead:

LLAMA_CUBLAS=1 python3 -m pip install -e .

Usage

Only GGUF is supported for now. You will need a YAML config file. An example is provided in the examples directory.

Once you've filled out your YAML file, run:

quantizers examples/gguf/config.yaml

Contribution

At the moment, we don't accept feature contributions until we've finished supporting all the planned quantization methods. PRs for bug fixes and OS support are welcome!