
CUDA out of memory #161

Open
EddyPianist opened this issue Dec 13, 2024 · 6 comments

Comments

@EddyPianist

Hi there,

I'm encountering a CUDA out of memory error while fine-tuning the stable-audio-open model, even with the batch size set to 1. I'm using an NVIDIA A10 GPU with 24 GB of memory, which I believe should be sufficient for this task.

I’m wondering if anyone else has encountered a similar issue and how you managed to resolve it. Thanks!
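For context, here's the kind of memory-saving setup I've been trying (illustrative PyTorch, not the repo's actual training script; `ToyModel` is just a placeholder for the real model, but the pattern of mixed precision plus gradient checkpointing is the standard one):

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Illustrative stand-in for a large audio model; the real fine-tune uses
# the repo's own model, but the memory-saving pattern is the same.
class Block(nn.Module):
    def __init__(self, dim=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, x):
        return x + self.net(x)

class ToyModel(nn.Module):
    def __init__(self, depth=8, dim=1024):
        super().__init__()
        self.blocks = nn.ModuleList(Block(dim) for _ in range(depth))

    def forward(self, x):
        for blk in self.blocks:
            # Gradient checkpointing: recompute activations during backward
            # instead of storing them all, trading compute for VRAM.
            x = checkpoint(blk, x, use_reentrant=False)
        return x

device = torch.device("cuda")
model = ToyModel().to(device)
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()

x = torch.randn(1, 256, 1024, device=device)        # batch size 1
opt.zero_grad(set_to_none=True)                     # frees grad tensors entirely
with torch.cuda.amp.autocast(dtype=torch.float16):  # fp16 activations halve memory
    loss = model(x).pow(2).mean()                   # dummy loss for the sketch
scaler.scale(loss).backward()
scaler.step(opt)
scaler.update()
```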

@DarkAlchy

Can't be trained on 24GB as the code is just highly unoptimized to allow it. 2xA6000 will do it with ease. One can, maybe.

@DarkAlchy

Let me add that there is a PR to allow it: #162

@lyramakesmusic
Contributor

> Can't be trained on 24GB as the code is just highly unoptimized to allow it. 2xA6000 will do it with ease. One can, maybe.

one can for sure. VRAM use is ~27-30 GB depending on batch size and sample length (bs=1 at 10 s was 27.6 GB; bs=4 at 47 s was ~30 GB), and a single A6000 has 48 GB.
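if you want to check numbers like these on your own config, PyTorch's built-in memory stats make it easy to log the peak (the matmul below is just a stand-in for your actual training step):

```python
import torch

torch.cuda.reset_peak_memory_stats()

# ... your actual training step goes here; a matmul stands in for it ...
a = torch.randn(4096, 4096, device="cuda")
_ = a @ a

peak_gb = torch.cuda.max_memory_allocated() / 1024**3
print(f"peak VRAM allocated: {peak_gb:.2f} GB")
```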

@DarkAlchy

Excuse you, but you just said what I said: 24GB can't train it, which is why the PR exists, so we can. Why are you disagreeing with me when the rest of your comment confirms what I was saying? Too much eggnog, perchance?

@lyramakesmusic
Contributor

> Why are you disagreeing with me

one a6000

I'm confirming, not disagreeing.

@DarkAlchy

My point was not about the A6000; it was about the fact that 24GB can't train it, and then you rolled in and said one can. It was just the way you phrased it, leaving out that "one A6000 can". No biggie, but do realize each item in the batch is a song, so the larger the batch size you can give it, the better it learns for generality. Hence 2xA6000 being the sweet spot, with anything past that being overkill. Even one H100 at 80GB would do it.
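If you're stuck on less VRAM but still want the generalization benefit of a bigger effective batch, gradient accumulation is the standard workaround. A minimal self-contained sketch, with an `nn.Linear` standing in for the real model:

```python
import torch
import torch.nn as nn

device = torch.device("cuda")
model = nn.Linear(512, 512).to(device)    # stand-in for the real model
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

accum_steps = 8                           # micro-batch of 1, effective batch of 8
opt.zero_grad(set_to_none=True)
for _ in range(accum_steps):
    x = torch.randn(1, 512, device=device)        # micro-batch that fits in VRAM
    loss = model(x).pow(2).mean() / accum_steps   # scale so grads match true bs=8
    loss.backward()                               # grads accumulate in .grad
opt.step()
opt.zero_grad(set_to_none=True)
```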
