Want a real-life example of how to put the Training Toolkit into practice? Check out our demo projects:
- ♻️ Trash Sorting Assistant: Helps you navigate the uncharted terrain of properly sorting your waste items. Input a photo of your garbage and local waste disposal rules from wherever you are on the planet. The model will highlight the items using a segmentation adapter, and then cross-reference them with the rules using the RAG one.
The Training Toolkit is built and tested with Python 3.12.
- Run `pip3 install -r requirements.txt`.
- Note for macOS: the toolkit uses `decord` to load video. This library is no longer maintained and is incompatible with macOS.
- [optional] Create a `.env` file and put your Hugging Face access token in it: `HF_TOKEN = your_token`. This is necessary to use models from gated repositories such as PaliGemma.
- [optional] Run `pip3 install flash-attn --no-build-isolation` to enable Flash Attention 2 on CUDA devices.
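The `.env` file is just `KEY = value` lines; libraries such as python-dotenv read it into the process environment. A minimal stand-alone sketch of what that amounts to (the parser below is hypothetical, not the toolkit's own code):

```python
import os

def load_env(path=".env"):
    """Parse simple KEY = value lines from a .env file (hypothetical helper)."""
    env = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blanks and comments; split on the first "=" only.
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                env[key.strip()] = value.strip()
    return env

# Example: write a throwaway .env and read the token back.
with open(".env", "w") as f:
    f.write("HF_TOKEN = your_token\n")

os.environ.update(load_env())
print(os.environ["HF_TOKEN"])  # your_token
```

With `HF_TOKEN` in the environment, the Hugging Face libraries can authenticate against gated repositories.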
```python
from training_toolkit import build_trainer, paligemma_image_preset, image_json_preset

trainer = build_trainer(
    **image_json_preset.with_path("path/to/dataset").as_kwargs(),
    **paligemma_image_preset.as_kwargs(),
)
trainer.train()
```
There are two primary parts to the Training Toolkit: a `ModelPreset` and a `DataPreset`. These are dataclasses that contain model settings, training arguments, collators, etc. Combined, they provide everything necessary to set up a Trainer with reasonable defaults. You can directly access and override all settings before feeding them to a builder:
```python
paligemma_image_preset.use_lora = True
paligemma_image_preset.training_args["per_device_train_batch_size"] = 8
```
Building a trainer is handled by the `build_trainer` factory function. It instantiates the model, enables adapters, runs preprocessing on the data, and wraps it all into a Trainer. You can get all the necessary arguments using the `.as_kwargs()` method of a preset, or pass them directly.
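Under the hood, `.as_kwargs()` flattens a preset dataclass into keyword arguments that `**`-unpack into the builder. A minimal stand-alone sketch of the pattern (the names `DataPreset` and `build_trainer_stub` here are hypothetical, not the toolkit's actual implementation):

```python
from dataclasses import dataclass, field, asdict

@dataclass
class DataPreset:
    """Hypothetical mini-preset: fields become builder keyword arguments."""
    path: str = ""
    training_args: dict = field(
        default_factory=lambda: {"per_device_train_batch_size": 4}
    )

    def with_path(self, path):
        self.path = path
        return self  # chainable, as in the quickstart example

    def as_kwargs(self):
        return asdict(self)  # flatten the dataclass into a kwargs dict

def build_trainer_stub(path, training_args):
    # Stand-in for the real factory: just echoes what it received.
    return {"path": path, "training_args": training_args}

preset = DataPreset().with_path("path/to/dataset")
preset.training_args["per_device_train_batch_size"] = 8  # override before building
trainer = build_trainer_stub(**preset.as_kwargs())
print(trainer["training_args"]["per_device_train_batch_size"])  # 8
```

Because the preset is a plain dataclass, overriding a field before calling `.as_kwargs()` is all it takes to change what the builder receives.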
You can access individual components of a trainer directly as its properties:
```python
model = trainer.model
processor = trainer.tokenizer
eval_dataset = trainer.eval_dataset
```
Check out the cookbook section of this repository for walkthroughs of the supported use cases.