CUDA out of memory #161
Hi there,

I'm encountering a CUDA out of memory error while fine-tuning the stable-audio-open model, even with the batch size set to 1. I'm using an NVIDIA A10 GPU with 24 GB of memory, which I believe should be sufficient for this task.

I'm wondering if anyone else has encountered a similar issue and how you managed to resolve it. Thanks!

Comments
It can't be trained on 24 GB, as the code is just too unoptimized to allow it. 2xA6000 will do it with ease. One can, maybe.
Let me add that there is this PR to allow it: #162
One can for sure. VRAM use is ~27-30 GB, depending on batch size and sample length (bs1 with 10 s samples was 27.6 GB, bs4 with 47 s samples was around 30 GB), and a single A6000 has 48 GB.
Excuse you, but you just said what I said: 24 GB can't train it, which is why the PR exists, so we can. Why are you disagreeing with me when the rest of what you said confirms what I was saying? Too much eggnog, perchance?
One A6000 can; I'm confirming, not disagreeing.
My point was not about the A6000; it was about the fact that 24 GB can't, and then you rolled in and said one can. It was just the way you phrased it, leaving out that "one A6000 can". No biggie, but do realize each item in the batch is a song, so the larger the batch size you can give it, the better it learns for generality. Hence 2xA6000 being the sweet spot, and anything past that is overkill. Even one H100 at 80 GB would do it.
Hello, if two NVIDIA GPUs of 24 GB each are used, can the model be finetuned? @EddyPianist @DarkAlchy @lyramakesmusic
That would be 48 GB in total, and you would need Linux and Accelerate to link them.
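For what it's worth, a minimal sketch of what "use Accelerate to link them" looks like is below. This is a generic data-parallel loop with a placeholder model and dummy data, not stable-audio-tools' actual training entry point:

```python
# Generic multi-GPU data-parallel fine-tuning sketch using Hugging Face Accelerate.
# Model, data, and loss are placeholders; this is NOT stable-audio-tools code.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

def main():
    accelerator = Accelerator()  # run via: accelerate config, then accelerate launch this_script.py
    model = nn.Linear(512, 512)  # stand-in for the real diffusion model
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    dataloader = DataLoader(TensorDataset(torch.randn(64, 512)), batch_size=1)

    # prepare() moves everything to the right devices, wraps the model for DDP,
    # and splits batches across the available GPUs
    model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

    model.train()
    for (x,) in dataloader:
        optimizer.zero_grad()
        loss = model(x).pow(2).mean()   # placeholder loss, just to show the flow
        accelerator.backward(loss)      # handles gradient sync between the cards
        optimizer.step()

if __name__ == "__main__":
    main()
```

Launched with `accelerate config` (choosing multi-GPU) followed by `accelerate launch`, both cards will be used.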
Thank you for your answer, I'll try Accelerate!
I wonder if the code supports using two GPUs of 24 GB each to finetune the model? And if
I never had more than one card, and it was a nightmare. Supposedly things have drastically changed. Wish I could help with specifics.
There was discussion on this in the Discord recently; I was wondering the same. SAT uses data parallelism, not model parallelism (so each GPU still holds a full copy of the model, and two 24 GB cards don't pool into 48 GB): https://discord.com/channels/1001555636569509948/1162090696220606525/1355825741589118976
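To make that data-parallelism point concrete, here is a hypothetical sketch (placeholder model, not SAT code): under DDP every rank builds a full copy of the model, so a second 24 GB card splits the batch but does not shrink the per-GPU memory needed for weights, gradients, and optimizer state.

```python
# Hypothetical illustration of data parallelism (not stable-audio-tools code):
# every rank holds a FULL model replica; only the batch is divided between GPUs.
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group("nccl")            # launched via torchrun, which sets the env vars
    rank = dist.get_rank()
    torch.cuda.set_device(rank)

    model = nn.Sequential(                     # stand-in for the real diffusion model;
        nn.Linear(4096, 4096), nn.GELU(),      # the full parameter set lives on *each* GPU
        nn.Linear(4096, 4096),
    ).cuda(rank)
    ddp_model = DDP(model, device_ids=[rank])  # gradients are all-reduced across ranks

    opt = torch.optim.AdamW(ddp_model.parameters(), lr=1e-4)
    x = torch.randn(1, 4096, device=f"cuda:{rank}")  # each rank sees its own slice of the batch
    loss = ddp_model(x).pow(2).mean()
    loss.backward()                            # per-GPU memory = full model + its share of activations
    opt.step()
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Run with `torchrun --nproc_per_node=2 script.py`; `nvidia-smi` will show roughly the same usage on both cards, which is why the ~27-30 GB figure above has to fit on a single GPU.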