This repository provides code for experimenting with training large models on Moreh's MoAI Platform. With the MoAI Platform, you can scale to thousands of GPUs/NPUs through automatic parallelization and optimization, without any code changes.
We currently provide four LLMs (Llama3, Qwen2.5, Mistral, and Baichuan2) as well as SDXL.
This repository contains examples of PyTorch training code that can be executed on the MoAI Platform. PyTorch users on the MoAI Platform can train large models without extensive effort. For more information about the MoAI Platform and detailed tutorials, please visit the Moreh Docs.
First, clone this repository and navigate to the repo directory.
git clone https://github.com/moreh-dev/quickstart
cd quickstart
Once you are in the quickstart directory, install the dependency packages with the following command:
pip install -r requirements/requirements_llm.txt
If you want to fine-tune the Llama3 or Mistral models, you need access to their respective Hugging Face repositories. Please ensure you have the necessary access before starting model training.
- Llama3 : https://huggingface.co/meta-llama/Meta-Llama-3-8B or https://huggingface.co/meta-llama/Meta-Llama-3-70B
- Mistral : https://huggingface.co/mistralai/Mistral-7B-v0.3
After obtaining access, authenticate your token with the following command:
huggingface-cli login
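Before launching a long training job, it can help to confirm that a token is actually visible to the process. The sketch below uses only the standard library and assumes the usual Hugging Face conventions: the `HF_TOKEN` environment variable, or a token file at `$HF_HOME/token` (assumed default `~/.cache/huggingface/token`). The function name `find_hf_token` is hypothetical and not part of this repository.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_hf_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a Hugging Face token if one is configured, else None.

    Assumption: the token lives in HF_TOKEN or in $HF_HOME/token, which is
    where `huggingface-cli login` is assumed to store it by default.
    """
    env = os.environ if env is None else env

    # 1. An explicit environment variable takes precedence.
    token = env.get("HF_TOKEN", "").strip()
    if token:
        return token

    # 2. Fall back to the token file written by the login command.
    hf_home = env.get("HF_HOME") or str(Path.home() / ".cache" / "huggingface")
    token_file = Path(hf_home) / "token"
    if token_file.is_file():
        token = token_file.read_text(encoding="utf-8").strip()
        return token or None
    return None


if __name__ == "__main__":
    if find_hf_token() is None:
        print("No Hugging Face token found; run `huggingface-cli login` first.")
    else:
        print("Hugging Face token found.")
```

Finding a token does not prove that access to a gated model has been granted; it only confirms that the login step left something behind.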
The following line is added to each training script to enable Advanced Parallelization (AP) on the MoAI Platform.
...
torch.moreh.option.enable_advanced_parallelization()
...
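If the same script should also run on a stock PyTorch install, the call can be guarded. This is a minimal sketch: `enable_ap_if_available` is a hypothetical helper, and the only MoAI-specific call it makes is the `torch.moreh.option.enable_advanced_parallelization()` line from the snippet above. It assumes that on a non-MoAI build the `torch.moreh` attribute is simply absent.

```python
from types import SimpleNamespace
from typing import Any


def enable_ap_if_available(torch_module: Any) -> bool:
    """Enable Advanced Parallelization when the MoAI extension is present.

    Returns True if the call was made, False if `torch_module` has no
    `moreh.option.enable_advanced_parallelization` attribute.
    """
    moreh = getattr(torch_module, "moreh", None)
    option = getattr(moreh, "option", None)
    enable = getattr(option, "enable_advanced_parallelization", None)
    if callable(enable):
        enable()
        return True
    return False


# Stand-in objects so the sketch runs without PyTorch installed.
calls = []
fake_moai_torch = SimpleNamespace(
    moreh=SimpleNamespace(
        option=SimpleNamespace(
            enable_advanced_parallelization=lambda: calls.append("ap")
        )
    )
)
fake_plain_torch = SimpleNamespace()

print(enable_ap_if_available(fake_moai_torch))   # True
print(enable_ap_if_available(fake_plain_torch))  # False
```

In a real script you would pass the imported module itself, for example `enable_ap_if_available(torch)`.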
Information about the models currently supported by this repository is as follows:
Baseline Model | Model Card Name |
---|---|
Llama3 8B | meta-llama/Meta-Llama-3-8B |
Llama3 70B | meta-llama/Meta-Llama-3-70B |
Qwen2.5 7B | Qwen/Qwen2.5-7B |
Mistral v0.3 7B | mistralai/Mistral-7B-v0.3 |
Baichuan2 13B | baichuan-inc/Baichuan2-13B-Base |
Run the training script to fully fine-tune the model. For example, to fine-tune the Llama3 8B model:
TOKENIZERS_PARALLELISM=true accelerate launch --config_file config.yaml train_llm.py \
--lr 0.000001 \
--model meta-llama/Meta-Llama-3-8B \
--dataset bitext/Bitext-customer-support-llm-chatbot-training-dataset \
--train-batch-size 64 \
--eval-batch-size 64 \
--sequence-length 1024 \
--log-interval 10 \
--num-epochs 5 \
--output-dir llama3-finetuned
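When sweeping several models or learning rates, it can be convenient to build this command in Python. The following is a minimal sketch that uses only the flags shown in the command above; `build_train_command` and `launch` are hypothetical helpers, not part of this repository.

```python
import os
import shlex
import subprocess
from typing import Dict, List, Union

Value = Union[str, int, float, bool]


def build_train_command(args: Dict[str, Value],
                        config_file: str = "config.yaml",
                        script: str = "train_llm.py") -> List[str]:
    """Turn a dict of options into an `accelerate launch` argument list.

    Keys are flag names without the leading dashes. A value of True emits a
    bare flag (for example --lora); False omits the flag entirely.
    """
    cmd = ["accelerate", "launch", "--config_file", config_file, script]
    for name, value in args.items():
        if value is True:
            cmd.append(f"--{name}")
        elif value is not False:
            cmd.extend([f"--{name}", str(value)])
    return cmd


def launch(cmd: List[str]) -> None:
    """Run the command with TOKENIZERS_PARALLELISM=true, as in the README."""
    env = {**os.environ, "TOKENIZERS_PARALLELISM": "true"}
    subprocess.run(cmd, check=True, env=env)


full_finetune = {
    "lr": "0.000001",  # kept as a string to avoid scientific notation
    "model": "meta-llama/Meta-Llama-3-8B",
    "dataset": "bitext/Bitext-customer-support-llm-chatbot-training-dataset",
    "train-batch-size": 64,
    "eval-batch-size": 64,
    "sequence-length": 1024,
    "log-interval": 10,
    "num-epochs": 5,
    "output-dir": "llama3-finetuned",
}

cmd = build_train_command(full_finetune)
print(shlex.join(cmd))  # inspect the command before running it
# launch(cmd)           # uncomment to start training
```

Passing the arguments as a list avoids shell quoting problems, and printing the joined command first makes it easy to compare against the shell version.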
To train only the LoRA adapter, pass the --lora argument together with the LoRA config parameters.
TOKENIZERS_PARALLELISM=true accelerate launch --config_file config.yaml train_llm.py \
--lr 0.0001 \
--model meta-llama/Meta-Llama-3-8B \
--dataset bitext/Bitext-customer-support-llm-chatbot-training-dataset \
--train-batch-size 64 \
--eval-batch-size 64 \
--sequence-length 1024 \
--log-interval 10 \
--num-epochs 5 \
--lora \
--lora-r 64 \
--lora-alpha 16 \
--lora-dropout 0.1 \
--output-dir llama3-finetuned-lora
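To get a feel for these settings, the arithmetic below assumes the standard LoRA formulation, in which a frozen weight of shape d_out x d_in receives an update (alpha / r) * B * A, with B of shape d_out x r and A of shape r x d_in. Whether `train_llm.py` follows exactly this convention, and which layers it adapts, is not stated here, so the 4096 x 4096 projection is only an illustrative shape.

```python
def lora_scaling(alpha: float, r: int) -> float:
    """Scaling applied to the low-rank update in standard LoRA."""
    return alpha / r


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for one adapted matrix: A (r x d_in) + B (d_out x r)."""
    return r * (d_in + d_out)


# Values taken from the command above.
r, alpha = 64, 16

# Illustrative layer shape (an assumption, not read from the model config).
d_in = d_out = 4096

scale = lora_scaling(alpha, r)
adapter_params = lora_param_count(d_in, d_out, r)
full_params = d_in * d_out

print(f"scaling factor  : {scale}")                          # 0.25
print(f"adapter params  : {adapter_params}")                 # 524288
print(f"full matrix     : {full_params}")                    # 16777216
print(f"trainable share : {adapter_params / full_params:.4%}")  # 3.1250%
```

With r larger than alpha, the scaling factor is below 1, so the adapter update is damped relative to the raw product B * A.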
To fine-tune a different model, change the model name passed to the --model argument.
If you want to fine-tune your model on a different dataset, modify the __call__ method of the Preprocessor class, defined in train_utils.py, so that it produces the desired format.
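As an illustration of the kind of change involved, here is a hedged sketch of a preprocessor for a question/answer dataset. It is not the class from train_utils.py: `QAPreprocessor`, the `question` and `answer` column names, and `toy_tokenizer` are all hypothetical, and the `##INSTRUCTION` / `##RESPONSE` template is inferred from the output example in this README. The sketch assumes `__call__` receives one record and returns the tokenizer output for a single prompt string.

```python
from typing import Any, Callable, Dict


class QAPreprocessor:
    """Hypothetical stand-in for the Preprocessor class in train_utils.py."""

    def __init__(self, tokenizer: Callable[..., Dict[str, Any]],
                 sequence_length: int = 1024):
        self.tokenizer = tokenizer
        self.sequence_length = sequence_length

    def __call__(self, example: Dict[str, str]) -> Dict[str, Any]:
        # Adapt these two keys to the column names of your own dataset.
        prompt = (
            f"##INSTRUCTION {example['question']}\n\n"
            f"##RESPONSE {example['answer']}"
        )
        return self.tokenizer(
            prompt, max_length=self.sequence_length, truncation=True
        )


def toy_tokenizer(text: str, max_length: int, truncation: bool = True) -> Dict[str, Any]:
    """Whitespace tokenizer so the sketch runs without extra dependencies."""
    tokens = text.split()
    return {"input_ids": tokens[:max_length] if truncation else tokens}


preprocess = QAPreprocessor(toy_tokenizer, sequence_length=8)
out = preprocess({
    "question": "How do I reset my password?",
    "answer": "Open Settings.",
})
print(out["input_ids"])
```

In practice the toy tokenizer would be replaced by the tokenizer object the training script already builds, which accepts the same `max_length` and `truncation` keyword arguments if it is a Hugging Face tokenizer.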
Perform inference by running the inference script for each model.
python inference_llm.py \
--model-name-or-path ${SAVE_DIR_PATH}
If you want to perform inference with LoRA weights, add the --use-lora argument to the inference script.
python inference_llm.py \
--model-name-or-path ${SAVE_DIR_PATH} \
--use-lora
# output example
##INSTRUCTION What is the status of my return for {{Order Number}}?
##RESPONSE Thank you for contacting us regarding the status of your return for order number {{Order Number}}. To provide you with accurate information, I kindly request you to visit the '{{Order Status}}' section on our website. There, you will find the most up-to-date details on the progress of your return. If you have any further questions or need additional assistance, please don't hesitate to let me know. I'm here to help you every step of the way!
We provide fine-tuning example code for the Stable Diffusion XL model.
Baseline Model | Task | Training Script | Dataset |
---|---|---|---|
Stable Diffusion XL | Text-to-Image Generation | tutorial/train_sdxl.py | lambdalabs/naruto-blip-captions |
Run the training script for Stable Diffusion XL:
pip install -r requirements/requirements_sdxl.txt
python train_sdxl.py \
--epochs 20 \
--dataset-path lambdalabs/naruto-blip-captions \
--batch-size 16 \
--num-workers 8 \
--lr=1e-05 \
--save-dir=${SAVE_DIR_PATH} \
--log-interval 1 \
--lr-scheduler linear
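To see how these flags interact, the sketch below estimates the number of optimizer steps and the learning rate at a few points of a linear schedule. It rests on several assumptions that are not confirmed by this README: one optimizer step per batch, no warmup, and a linear decay from `--lr` down to zero. The dataset size of 1200 is a placeholder, not the real size of the dataset.

```python
import math


def total_steps(num_samples: int, batch_size: int, epochs: int) -> int:
    """Optimizer steps, assuming one step per batch and a partial last batch."""
    return math.ceil(num_samples / batch_size) * epochs


def linear_lr(step: int, total: int, base_lr: float) -> float:
    """Learning rate under a linear decay to zero with no warmup (assumed)."""
    return base_lr * max(0.0, 1.0 - step / total)


num_samples = 1200   # placeholder dataset size (assumption)
batch_size = 16      # --batch-size
epochs = 20          # --epochs
base_lr = 1e-5       # --lr

steps = total_steps(num_samples, batch_size, epochs)
print(f"total steps: {steps}")
for step in (0, steps // 2, steps):
    print(f"step {step:>5}: lr = {linear_lr(step, steps, base_lr):.2e}")
```

With `--log-interval 1`, a run of this size would emit one log line per step, so a larger interval may be preferable for longer runs.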
To train only LoRA weights, add the --lora and --rank arguments:
python train_sdxl.py \
--epochs 20 \
--dataset-path lambdalabs/naruto-blip-captions \
--batch-size 16 \
--num-workers 8 \
--lr=1e-05 \
--save-dir=${LORA_WEIGHT_SAVE_DIR_PATH} \
--log-interval 1 \
--lr-scheduler linear \
--lora \
--rank 32
To also train the text encoder during LoRA training, add the --train-text-encoder argument:
python train_sdxl.py \
--epochs 20 \
--dataset-path lambdalabs/naruto-blip-captions \
--batch-size 16 \
--num-workers 8 \
--lr=1e-05 \
--save-dir=${LORA_WEIGHT_SAVE_DIR_PATH} \
--log-interval 1 \
--lr-scheduler linear \
--lora \
--rank 32 \
--train-text-encoder
After training, you can run inference with your fine-tuned model using the following command:
python inference_sdxl.py \
--model-name-or-path=${SAVE_DIR_PATH}
To run inference with LoRA weights, pass the base model together with the LoRA weight directory:
python inference_sdxl.py \
--model-name-or-path=stabilityai/stable-diffusion-xl-base-1.0 \
--lora-weight=${LORA_WEIGHT_SAVE_DIR_PATH}
Adjust the prompt by editing the PROMPT variable in the inference script:
...
PROMPT = "Bill Gates with a hoodie"
...
The resulting image will be saved as sdxl_result.jpg.
The image on the left shows the inference results of the model before fine-tuning, while the image on the right shows the inference results of the fine-tuned model.