Home
This project focuses on the distillation of one or multiple teacher language models into a single student model. The goal is to leverage the knowledge of various teacher models and encapsulate it into a more compact and efficient student model.
Model distillation is a technique in which a smaller student model learns to mimic the behavior of one or more larger teacher models. This makes it possible to create models that are lightweight and fast without a significant loss of accuracy. This project lets you distill multiple teacher models into a single student model, making it versatile for various applications.
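To make the idea concrete, here is a minimal sketch of a multi-teacher distillation loss: the student's softened output distribution is pulled toward each teacher's softened distribution via KL divergence, and the per-teacher losses are averaged. This is only an illustration of the general technique (function name, shapes, and temperature are hypothetical), not this repository's actual implementation.

```python
import torch
import torch.nn.functional as F

def multi_teacher_distillation_loss(student_logits, teacher_logits_list, temperature=2.0):
    """Average soft-target KL divergence between the student and each teacher.

    student_logits:      (batch, vocab) tensor from the student model
    teacher_logits_list: list of (batch, vocab) tensors, one per teacher
    """
    # Softened log-probabilities of the student
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)

    losses = []
    for teacher_logits in teacher_logits_list:
        # Softened probabilities of this teacher
        teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
        # KL(teacher || student), scaled by T^2 as in Hinton et al. (2015)
        losses.append(
            F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
            * temperature ** 2
        )
    return torch.stack(losses).mean()

# Toy usage: one student, two teachers, random logits over a 32k vocabulary
student = torch.randn(4, 32000)
teachers = [torch.randn(4, 32000), torch.randn(4, 32000)]
print(multi_teacher_distillation_loss(student, teachers))
```

In practice this soft-target term is usually combined with the standard cross-entropy loss on the ground-truth labels.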
Before you start, ensure you have the following installed (a quick check is sketched after the list):
- Python 3.10+ (3.10.11 recommended)
- NVIDIA CUDA Toolkit 12.1
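A quick way to confirm both prerequisites from a Python shell (assuming `nvcc` from the CUDA Toolkit is on your PATH):

```python
import subprocess
import sys

# Confirm the interpreter is Python 3.10 or newer
assert sys.version_info >= (3, 10), f"Python 3.10+ required, found {sys.version}"

# Print the installed CUDA toolkit version (requires nvcc on PATH)
print(subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout)
```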
Clone the repository:
git clone https://github.com/yourusername/LLM-Distillation.git
cd LLM-Distillation
Install the required packages:
pip install -r requirements.txt
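After installation, you can sanity-check that the GPU is visible. This assumes the requirements include PyTorch; adjust if the project uses a different framework.

```python
import torch

# Should print True and a CUDA 12.x version if the GPU setup is correct
print(torch.cuda.is_available())
print(torch.version.cuda)
```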
Next, check out this page to prepare everything for your first distillation run: Preparations for the first start
Contributions are welcome! Feel free to open issues or submit pull requests.
This project is licensed under the Apache License 2.0. See the LICENSE file for more details.