Train VMamba2 but gradients turned into NaN #300

Open
Mark2Lutz opened this issue Sep 14, 2024 · 1 comment

Comments

@Mark2Lutz

Hi, thanks for your great work. I set up training with VMamba2, but after a while the gradients turned into NaN. NaN still occurs even with gradient clipping. Have you encountered this before, and do you have any suggested fixes?

@MzeroMiko
Owner

If you are talking about VMamba with mamba2, then I have not trained it, because it trains too slowly on my device.

If you are talking about VMamba with SS2Dv2, then I have not encountered the NaN problem before.
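
For anyone hitting the same symptom, here is a minimal sketch of why norm-based gradient clipping alone may not help, and of one common workaround: skipping the update when any gradient is non-finite. It is plain Python over lists of floats, not VMamba or PyTorch code, and the names `global_norm`, `clip_by_global_norm`, and `safe_step` are hypothetical helpers made up for illustration. The idea is that clipping rescales gradients by their global norm; if any gradient is already NaN, the norm is NaN too, so no scale factor can repair it.

```python
import math

def global_norm(grads):
    """L2 norm over a flat list of gradient values."""
    return math.sqrt(sum(g * g for g in grads))

def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Scale grads so their global L2 norm is at most max_norm."""
    norm = global_norm(grads)
    scale = max_norm / (norm + eps)
    # Only shrink, never enlarge. A NaN norm gives a NaN scale, and
    # `scale < 1.0` is False for NaN, so NaN grads pass through unchanged.
    if scale < 1.0:
        grads = [g * scale for g in grads]
    return grads, norm

def safe_step(params, grads, lr, max_norm):
    """Apply one SGD update, skipping it when any gradient is non-finite."""
    if not all(math.isfinite(g) for g in grads):
        return params, False  # skip: leave the weights untouched
    clipped, _ = clip_by_global_norm(grads, max_norm)
    new_params = [p - lr * g for p, g in zip(params, clipped)]
    return new_params, True
```

Skipping a step only hides the symptom. If steps are skipped often, the source of the non-finite values (for example the learning rate or reduced-precision arithmetic) still needs to be tracked down. Those causes are general guesses, not something confirmed in this thread.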


2 participants