LLM Distillation Wiki

This project focuses on the distillation of one or multiple teacher language models into a single student model. The goal is to leverage the knowledge of various teacher models and encapsulate it into a more compact and efficient student model.

Table of Contents

  • Introduction
  • Getting Started
  • What to do next?
  • Contributing
  • License

Introduction

Model distillation is a technique in which a smaller student model learns to mimic the behavior of one or more larger teacher models. This produces models that are lighter and faster, with little loss of accuracy. This project lets you distill multiple teacher models into a single student model, making it suitable for a wide range of applications.
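The project's actual training code is not shown on this page, but the core objective is easy to sketch. The snippet below is a minimal, illustrative PyTorch version of a multi-teacher soft-target loss; the function and tensor names are not this repo's API, and PyTorch being the backing framework is an assumption here.

```python
import torch
import torch.nn.functional as F

def soft_target_loss(student_logits, teacher_logits_list, temperature=2.0):
    """Illustrative distillation objective: KL divergence between the student's
    softened distribution and the average of the teachers' softened distributions."""
    # Soften each teacher's distribution and average across teachers.
    teacher_probs = torch.stack(
        [F.softmax(t / temperature, dim=-1) for t in teacher_logits_list]
    ).mean(dim=0)
    # Student log-probabilities at the same temperature.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * temperature ** 2

# Toy example: a batch of 4 token positions over a 32k-token vocabulary.
student = torch.randn(4, 32000)
teachers = [torch.randn(4, 32000), torch.randn(4, 32000)]
print(soft_target_loss(student, teachers))
```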

Getting Started

Prerequisites

Before you start, ensure you have the following installed (a quick check is sketched after this list):

  • Python 3.10+ (3.10.11 recommended)
  • NVIDIA CUDA Toolkit 12.1
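A stdlib-only way to check both prerequisites is sketched below; looking for nvcc on PATH is only a rough proxy for the CUDA Toolkit, and it may live elsewhere on your system.

```python
import shutil
import subprocess
import sys

# Python version: the wiki recommends 3.10.x.
print("Python:", sys.version.split()[0])
assert sys.version_info >= (3, 10), "Python 3.10+ is required"

# CUDA Toolkit: print the nvcc version banner if it is on PATH.
nvcc = shutil.which("nvcc")
if nvcc:
    print(subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout)
else:
    print("nvcc not found on PATH - check your CUDA Toolkit 12.1 installation")
```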

Installation

Clone the repository:

git clone https://github.com/yourusername/LLM-Distillation.git

cd LLM-Distillation

Install the required packages:

pip install -r requirements.txt
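After the install finishes, a short sanity check can confirm that the GPU is visible. This assumes requirements.txt pulls in a CUDA-enabled PyTorch build, which this page does not state explicitly.

```python
# Post-install sanity check; assumes a CUDA-enabled PyTorch build was installed.
import torch

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```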

What to do next?

Next, see the Preparations for the first start page to get everything ready for your first distillation run.

Contributing

Contributions are welcome! Feel free to open issues or submit pull requests.

License

This project is licensed under the Apache License 2.0. See the LICENSE file for more details.