Why is axolotl reporting near-zero GPU memory usage while training? (FSDP Llama 3.1 8B Liger example) #2284
Unanswered
mashdragon asked this question in Q&A
Replies: 1 comment · 2 replies
-
Hey, the number does seem very unusual, even though it's at the start of training. The cached memory does show an increase, though. We pull data from …
High RAM use could be due to FSDP offloading + Liger offload.
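For what it's worth, a quick way to sanity-check a near-zero report is to compare PyTorch's "allocated" counter against the "reserved" (cached) counter yourself. This is a generic PyTorch sketch, not axolotl's own logging code, and it assumes the logged figure tracks the allocated counter:

```python
import torch

# Generic PyTorch sketch (not axolotl's logging code): compare the
# "allocated" counter with the "reserved" (cached) counter per GPU.
# With FSDP CPU offload, allocated memory can look near zero between
# steps even though the allocator cache stays large.
gib = 2**30
for i in range(torch.cuda.device_count()):
    allocated = torch.cuda.memory_allocated(i) / gib
    reserved = torch.cuda.memory_reserved(i) / gib
    peak = torch.cuda.max_memory_allocated(i) / gib
    print(f"cuda:{i}: allocated={allocated:.2f} GiB, "
          f"reserved={reserved:.2f} GiB, peak allocated={peak:.2f} GiB")
```

If reserved (and nvidia-smi) look reasonable while allocated is tiny, the log line is most likely sampling the allocated counter at a quiet moment rather than showing the GPUs being underused.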
-
This is what I see in the logs. Is it accurate? I'm curious why so little memory is being used for training.
My swap space is getting hammered (2× 24 GB GPUs, 96 GB RAM, 60 GB swap).
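For a rough sense of why swap gets hit, here is a back-of-envelope estimate assuming FSDP offloads the bf16 weights plus fp32 master weights and Adam moments to host RAM; whether all of that is actually offloaded depends on the exact config, so treat it as an upper bound:

```python
# Back-of-envelope host-RAM estimate for CPU-offloaded FSDP training of an
# 8B-parameter model; assumes bf16 weights plus fp32 master weights and
# Adam moments all live in host memory (the exact split depends on config).
params = 8e9                  # Llama 3.1 8B, approximate parameter count
bf16_weights = params * 2     # offloaded bf16 parameter copy
fp32_master = params * 4      # fp32 master weights held by the optimizer
adam_moments = params * 8     # exp_avg + exp_avg_sq in fp32
total_gib = (bf16_weights + fp32_master + adam_moments) / 2**30
print(f"~{total_gib:.0f} GiB of host memory")  # roughly 104 GiB
# That already exceeds 96 GB of RAM before dataloader workers, tokenized
# dataset caches, and any activation offload are counted, so heavy use of
# a 60 GB swap partition is plausible.
```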