This repo demonstrates fine-tuning an open-source LLM (Llama-3-8B) using different approaches and techniques. Fine-tuning was done with ORPO, a technique that combines SFT and RLHF-style preference alignment into a single training stage. The work explores fine-tuning in a multi-GPU environment using distributed training methods such as DeepSpeed, DDP, and FSDP via the accelerate library provided by Hugging Face.
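As a rough illustration of the DeepSpeed ZeRO-3 setup via accelerate, a minimal config sketch is shown below. The exact values (offload targets, gradient accumulation, precision) are assumptions for a 2x T4 setup, not the repo's actual config; T4 GPUs lack bf16 support, hence fp16 here.

```yaml
# Sketch of an accelerate config for DeepSpeed ZeRO-3 on 2 GPUs (assumed values)
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
deepspeed_config:
  zero_stage: 3                      # ZeRO-3: shard params, grads, and optimizer state
  offload_optimizer_device: cpu      # offload to fit 8B params in 2x15GB VRAM
  offload_param_device: cpu
  gradient_accumulation_steps: 4
mixed_precision: fp16                # T4 has no bf16 support
num_processes: 2                     # one process per GPU
```

A config like this would typically be passed to `accelerate launch --config_file <config>.yaml <script>.py` (filenames hypothetical).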
- LLM - Meta-Llama-3-8B
- Dataset (HF) - mlabonne/orpo-dpo-mix-40k
- Fine-Tuning Method - ORPO
- Accelerator Technique - DeepSpeed ZeRO-3
- Trainer API - HuggingFace
- Run-time environment - multi-GPU (2x Tesla T4 GPUs, 15 GB VRAM each)
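To make the ORPO idea concrete, here is a minimal pure-Python sketch of the per-pair ORPO objective: the usual SFT negative log-likelihood on the chosen response plus a weighted odds-ratio term that pushes the chosen response's odds above the rejected one's. The function name, the `beta` weight, and the toy probabilities are illustrative assumptions, not the repo's actual training code (which uses the HuggingFace trainer stack).

```python
import math

def orpo_loss(logp_chosen: float, logp_rejected: float, beta: float = 0.1) -> float:
    """Sketch of the ORPO loss for one preference pair (illustrative, not the repo's code).

    logp_chosen / logp_rejected are length-normalized sequence log-probabilities
    (average log-prob per token) of the chosen and rejected responses.
    """
    def log_odds(logp: float) -> float:
        # log(p / (1 - p)) computed from log p; log1p/exp keeps it numerically stable
        return logp - math.log1p(-math.exp(logp))

    # log odds ratio between chosen and rejected responses
    log_or = log_odds(logp_chosen) - log_odds(logp_rejected)
    # -log sigmoid(x) = log(1 + exp(-x)): penalizes pairs where rejected is favored
    ratio_term = math.log1p(math.exp(-log_or))
    # standard SFT term: negative log-likelihood of the chosen response
    nll = -logp_chosen
    return nll + beta * ratio_term

# Favoring the chosen response yields a lower loss than favoring the rejected one:
good = orpo_loss(math.log(0.6), math.log(0.3))
bad = orpo_loss(math.log(0.3), math.log(0.6))
```

The single `beta`-weighted term is what lets ORPO fold preference alignment into the SFT stage, instead of running a separate RLHF/DPO phase with a reference model.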
Thanks to Maxime Labonne for the work shared in his blog here.