Issue training for male dataset #245

Open
zshakeri opened this issue Jan 7, 2021 · 1 comment

Comments


zshakeri commented Jan 7, 2021

I have been trying to train the model on a male-voice dataset. I've tried both training from scratch and fine-tuning the provided checkpoint. I tried the default parameters (batch size 3, 8 GPUs), increasing the batch size to 32 on 8 GPUs, and varying the learning rate. In all cases, the training loss saturates at about -5 somewhere between 5k and 20k steps and then either increases or blows up. Do you have any suggestions for what to do in this case? Have you trained the model on any dataset other than LJ?
Examples of training loss curves:
[Image: Screen Shot 2020-12-16 at 10 38 57 AM]
[Image: Screen Shot 2021-01-07 at 11 47 22 AM]

Contributor

rafaelvalle commented Jan 8, 2021 via email
