
LLM-Distillery Wiki

This project focuses on distillation of one or multiple teacher language models into a single student model. The goal is to collect general knowledge of various teacher models and encapsulate it into a more compact and efficient student model.

Table of Contents

  • Getting Started
      • Prerequisites
      • Installation
  • What to do next?
  • Contributing
  • License

Getting Started

Prerequisites

Before you start, ensure you have the following installed:

  • Python 3.10.x (the prebuilt wheels linked below target cp310)
  • Git
  • NVIDIA CUDA Toolkit, matching the PyTorch build you plan to install

Installation

Clone the repository:

git clone https://github.com/golololologol/LLM-Distillery

cd LLM-Distillery

Inside the cloned folder, run open_venv.bat on Windows. On Linux, make the script executable and run it:

chmod +x open_venv.sh
./open_venv.sh

This will create a virtual environment and keep it activated, so you can manually install the following packages into it (a quick import check to verify them follows the list):
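If the helper script doesn't suit your shell, a minimal manual equivalent is sketched below (assuming a Python 3.10 interpreter on your PATH; the open_venv scripts may do additional setup beyond this):

python -m venv venv
# Activate it (Windows):
venv\Scripts\activate
# Activate it (Linux):
source venv/bin/activate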

  • PyTorch 2.2.0+ (2.2.0 is recommended):
    Install either 2.2.0 or the newest build, matching your CUDA version; see https://pytorch.org/get-started/locally/ for current builds and https://pytorch.org/get-started/previous-versions/ for older ones.
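    For example, one way to install 2.2.0 built against CUDA 12.1 (swap cu121 for your CUDA version):

pip install torch==2.2.0 --index-url https://download.pytorch.org/whl/cu121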

  • Exllamav2 0.0.19+ (0.2.7 is recommended):
    Choose the correct version for your particular setup of CUDA Toolkit, PyTorch, Python, and OS
    https://github.com/turboderp/exllamav2/releases
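    Alternatively, exllamav2 is also published on PyPI as a JIT build that compiles its extension at runtime (which requires the CUDA Toolkit to be installed); if no prebuilt wheel matches your setup:

pip install exllamav2==0.2.7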

  • bitsandbytes 0.41.3+ (0.45.1 is recommended):
    On Linux, skip this step and move straight to the next one.
    On Windows, manually install bitsandbytes 0.45.1+ with the following command:

pip install --no-deps https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/continuous-release_main/bitsandbytes-0.45.1-py3-none-win_amd64.whl

  • Flash Attention 2 2.4.2+ (2.5.2 is recommended):
    If you are not on Python 3.10.x, browse https://github.com/bdashore3/flash-attention/releases for a wheel that fits your setup.
pip install https://github.com/bdashore3/flash-attention/releases/download/v2.5.2/flash_attn-2.5.2+cu122torch2.2.0cxx11abiFALSE-cp310-cp310-win_amd64.whl
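
With all four packages installed, you can run a quick sanity check from inside the venv; this just confirms the imports resolve and that PyTorch can see your GPU:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import exllamav2, bitsandbytes, flash_attn; print('all imports OK')"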

Finally, install all remaining packages into the venv:

pip install -r requirements.txt
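
Optionally, ask pip to confirm that the pinned requirements and the manually installed wheels have no conflicting dependencies:

pip check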

What to do next?

See Preparations for the first start to get everything ready for your first distillation run.

Contributing

Contributions are welcome! Feel free to open issues or submit pull requests.

License

This project is licensed under the Apache License 2.0. See the LICENSE file for more details.