Saving a 70B checkpoint takes ~1000 s during full finetuning on 8 GPUs #1735
Labels: better engineering, discussion
Repro:

```shell
tune run --nproc_per_node 8 full_finetune_distributed --config llama3_1/70B_full max_steps_per_epoch=20
```
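To check how much of the wall time is pure serialization (as opposed to gathering sharded state across ranks), one can time `torch.save` on a state dict in isolation. This is a minimal sketch, not part of the torchtune recipe; the tensor size and output path are placeholders (a real 70B bf16 state dict is on the order of 140 GB, so disk bandwidth alone can account for hundreds of seconds).

```python
import time

import torch

# Stand-in for a large model state dict; scale the tensor up to
# approximate real checkpoint sizes when profiling on actual hardware.
state = {"layer.weight": torch.randn(2048, 2048)}

start = time.perf_counter()
torch.save(state, "/tmp/ckpt_probe.pt")
elapsed = time.perf_counter() - start
print(f"torch.save took {elapsed:.3f}s")
```

Comparing this number against the full checkpoint step in the recipe helps separate disk-bound time from the cost of all-gathering FSDP shards to rank zero before saving.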