The training loss #53
Comments
The loss scale is too large. Did you change the batch-size or num-gpus?
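For anyone who does change the batch size or GPU count: the learning rate usually has to scale with the effective (global) batch size. A minimal sketch of the linear scaling rule, assuming placeholder base values (BASE_BATCH, BASE_LR) that are not taken from this repository's configs:

# Minimal sketch: linear learning-rate scaling when the effective batch size changes.
# BASE_BATCH and BASE_LR are placeholder values, not taken from the DOT configs.
BASE_BATCH = 512
BASE_LR = 0.2

def scaled_lr(per_gpu_batch: int, num_gpus: int) -> float:
    """Scale the learning rate linearly with the effective (global) batch size."""
    effective_batch = per_gpu_batch * num_gpus
    return BASE_LR * effective_batch / BASE_BATCH

# 8 GPUs x 64 per GPU reproduces the original 512 setting; 4 GPUs halves the
# effective batch, so the learning rate should roughly halve as well.
print(scaled_lr(per_gpu_batch=64, num_gpus=8))  # 0.2
print(scaled_lr(per_gpu_batch=64, num_gpus=4))  # 0.1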
@Zzzzz1 I used the original batch size of 512 on 8 2080Ti GPUs. After re-running the code, I got the following results:
@Vickeyhw How long does an epoch take for you? I find it very strange that it takes me 100 minutes to run a quarter of an epoch on 8 3090s.
@JinYu1998 23min/epoch. |
Thanks for your response; I think I've identified the problem. Since my data is not on an SSD, I/O is the bottleneck that is slowing down training.
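As a quick sanity check for this kind of slowdown, the data pipeline can be timed in isolation, without the model. A minimal sketch, assuming a torchvision ImageFolder dataset; the dataset path, batch size, and worker count are placeholders, not this repository's settings:

# Minimal sketch: time the data pipeline by itself to confirm an I/O bottleneck.
# The dataset path, batch size, and worker count are placeholders.
import time
import torchvision
import torchvision.transforms as T
from torch.utils.data import DataLoader

transform = T.Compose([T.RandomResizedCrop(224), T.ToTensor()])
dataset = torchvision.datasets.ImageFolder("/path/to/imagenet/train", transform)
loader = DataLoader(dataset, batch_size=64, num_workers=8, pin_memory=True)

start = time.time()
for i, _ in enumerate(loader):
    if i == 100:
        break
print(f"{(time.time() - start) / 100:.3f} s per batch (data loading only)")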
Thanks for your great work! When I run the code with:
python3 tools/train.py --cfg configs/imagenet/r34_r18/dot.yaml
the training loss is much larger than with the KD method in the first few epochs, and the test accuracy is also low. Is this normal?
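For context on the absolute loss value: in most distillation codebases the reported training loss is a weighted sum of a cross-entropy term and a temperature-scaled KL term, so its scale depends on the loss weights in the config as much as on training quality. A minimal sketch of a generic KD loss of this form (not this repository's exact implementation; alpha and T are placeholder hyperparameters):

# Minimal sketch of a standard knowledge-distillation loss (cross-entropy plus
# temperature-scaled KL). alpha and T are placeholders, not the values used by
# this repository's KD or DOT configs.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, targets, alpha=0.5, T=4.0):
    ce = F.cross_entropy(student_logits, targets)
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # T^2 keeps the KL gradients on a comparable scale
    return (1 - alpha) * ce + alpha * kl

# Example with random logits for a 1000-class problem:
s = torch.randn(8, 1000)
t = torch.randn(8, 1000)
y = torch.randint(0, 1000, (8,))
print(kd_loss(s, t, y).item())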