LLM Distillation Wiki

This project focuses on the distillation of one or multiple teacher language models into a single student model. The goal is to leverage the knowledge of various teacher models and encapsulate it into a more compact and efficient student model.

Table of Contents

  • Introduction
  • Getting Started
  • What to do next?
  • Contributing
  • License

Introduction

Model distillation is a technique in which a smaller student model learns to mimic the behavior of one or more larger teacher models. This produces models that are lighter and faster, with little loss of accuracy. This project lets you distill multiple teacher models into a single student model, making it suitable for a wide range of applications.
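The project's actual training code is not shown on this page, but the core objective is easy to sketch. The snippet below is a minimal, illustrative PyTorch version of a multi-teacher soft-target loss; the function and tensor names are not this repo's API, and PyTorch being the backing framework is an assumption here.

```python
import torch
import torch.nn.functional as F

def soft_target_loss(student_logits, teacher_logits_list, temperature=2.0):
    """Illustrative distillation objective: KL divergence between the student's
    softened distribution and the average of the teachers' softened distributions."""
    # Soften each teacher's distribution and average across teachers.
    teacher_probs = torch.stack(
        [F.softmax(t / temperature, dim=-1) for t in teacher_logits_list]
    ).mean(dim=0)
    # Student log-probabilities at the same temperature.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * temperature ** 2

# Toy example: a batch of 4 token positions over a 32k-token vocabulary.
student = torch.randn(4, 32000)
teachers = [torch.randn(4, 32000), torch.randn(4, 32000)]
print(soft_target_loss(student, teachers))
```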

Getting Started

Prerequisites

Before you start, ensure you have the following installed (a quick check is sketched after this list):

  • Python 3.10+ (3.10.11 recommended)
  • NVIDIA CUDA Toolkit 12.1
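A stdlib-only way to check both prerequisites is sketched below; looking for nvcc on PATH is only a rough proxy for the CUDA Toolkit, and it may live elsewhere on your system.

```python
import shutil
import subprocess
import sys

# Python version: the wiki recommends 3.10.x.
print("Python:", sys.version.split()[0])
assert sys.version_info >= (3, 10), "Python 3.10+ is required"

# CUDA Toolkit: print the nvcc version banner if it is on PATH.
nvcc = shutil.which("nvcc")
if nvcc:
    print(subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout)
else:
    print("nvcc not found on PATH - check your CUDA Toolkit 12.1 installation")
```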

Installation

Clone the repository:

git clone https://github.com/yourusername/LLM-Distillation.git

cd LLM-Distillation

Install the required packages:

pip install -r requirements.txt
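After the install finishes, a short sanity check can confirm that the GPU is visible. This assumes requirements.txt pulls in a CUDA-enabled PyTorch build, which this page does not state explicitly.

```python
# Post-install sanity check; assumes a CUDA-enabled PyTorch build was installed.
import torch

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```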

What to do next?

Next, see the Preparations for the first start page to get everything ready for your first distillation run.

Contributing

Contributions are welcome! Feel free to open issues or submit pull requests.

License

This project is licensed under the Apache License 2.0. See the LICENSE file for more details.