Question
I ran finetune.sh on 8x A100 (40 GB), and training takes about 33 seconds per image.
The image download may have been incomplete, so I skipped the missing images (checked with the sketch after the script below), but I don't think that is the reason training is so slow.
Here is the finetune.sh I use:
#!/bin/bash
deepspeed llava/train/train_mem.py \
    --deepspeed ./scripts/zero3.json \
    --model_name_or_path /home/24-zhangtan/LLaVA/vicuna-7b-v1.5 \
    --version v1 \
    --data_path /home/24-zhangtan/LLaVA/playground/data/llava_v1_5_mix665k.json \
    --image_folder /home/24-zhangtan/LLaVA/playground/data \
    --vision_tower /home/24-zhangtan/LLaVA/clip-vit-large-patch14-336 \
    --pretrain_mm_mlp_adapter /home/24-zhangtan/LLaVA/llava-v1.5-mlp2x-336px-pretrain-vicuna-7b-v1.5/mm_projector.bin \
    --mm_projector_type mlp2x_gelu \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --image_aspect_ratio pad \
    --group_by_modality_length True \
    --bf16 True \
    --output_dir ./checkpoints/llava-v1.5-7b \
    --num_train_epochs 1 \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 50000 \
    --save_total_limit 1 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 2048 \
    --gradient_checkpointing True \
    --dataloader_num_workers 4 \
    --lazy_preprocess True \
    --report_to wandb
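For context on the throughput: with per_device_train_batch_size 4 on 8 GPUs and gradient_accumulation_steps 1, each optimizer step covers 4 x 8 x 1 = 32 images, so 33 seconds per image works out to roughly 33 x 32 ≈ 1056 seconds (about 17.6 minutes) per step, and the ~665k-sample mix would need about 665000 / 32 ≈ 20.8k steps per epoch.

This is roughly how I checked for and skipped the missing images; a minimal sketch, assuming llava_v1_5_mix665k.json is a flat JSON list in which each multimodal entry carries an "image" field holding a path relative to --image_folder (text-only entries have no such field):

import json
import os

data_path = "/home/24-zhangtan/LLaVA/playground/data/llava_v1_5_mix665k.json"
image_folder = "/home/24-zhangtan/LLaVA/playground/data"

# Load the full sample list (assumption: top-level JSON array of dicts).
with open(data_path) as f:
    samples = json.load(f)

# Collect referenced image paths that do not exist on disk.
missing = [
    s["image"]
    for s in samples
    if "image" in s and not os.path.exists(os.path.join(image_folder, s["image"]))
]

print(f"{len(missing)} of {len(samples)} samples reference a missing image")
for path in missing[:20]:  # print a few for inspection
    print(" ", path)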
Has anyone had this problem? How did you solve it? Looking forward to your reply!