play-with-LLM-LoRA

a personal note about using LLM LoRA for finetuning

LoRA finetuning is efficient for large-scale LLM, however, there are many hiccups or bumps in the road when playing with LoRA. Thus, I'd like to share some training scripts/problems I met/solutions here. If you find these notes helpful, plz leave a star!

My running environment: one server with 2 A100 80GB and one server with 3 A40 40GB. However, when playing with QLoRA with A40, loading the 4bit quantized model will cause RuntimeError: CUDA error: an illegal memory access was encountered. I have no idea why this happens. There's an issue thread discussing this, however, no solution can solve the problem, including downgrading pytorch. I will update this note if I find the reason. Thus, the experiments are running on A100 server.

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
LoRA-related		LoRA-related
guanaco_lora		guanaco_lora
multi-gpu-related		multi-gpu-related
roberta_lora		roberta_lora
.DS_Store		.DS_Store
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

play-with-LLM-LoRA

About

Releases

Packages

Languages

sherdencooper/play-with-LLM-lora

Folders and files

Latest commit

History

Repository files navigation

play-with-LLM-LoRA

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages