Replies: 2 comments 3 replies
-
I assume you have a 16xx-series card? The green-square issue is well known on those cards when running in half precision. Running in full precision does increase VRAM usage, and Stable Diffusion is VRAM-heavy to begin with, so it's not surprising that you're running into VRAM issues.

There are two optimized modes available in this repo that should help you, though it's worth noting that the repo you link to has a fairly recent optimization PR that has not been merged here yet. There is also a second optimization, merged into neither repository so far, that might have an even bigger effect. Which is to say: things will only get better for you as these optimizations are merged.

To enable an optimized mode, open the "relauncher.py" file found under the scripts folder and change either optimized or optimized-turbo from False to True. Those values are case-sensitive, and you shouldn't enable both at once. The optimized mode uses the least VRAM but sacrifices a lot of speed for it; the optimized-turbo mode uses more VRAM but has a fairly minor speed penalty in comparison. You will likely have to use plain optimized for now, but once the PRs mentioned above are merged you should be able to switch over to optimized-turbo. You might even be able to run without either mode active.
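As a sketch, the relevant flags in relauncher.py look roughly like this (the variable names here are assumed from this thread's description; check your copy of the file for the exact spelling):

```python
# Assumed fragment of relauncher.py; the real file may spell these differently.
optimized = True          # lowest VRAM usage, but a large speed penalty
optimized_turbo = False   # more VRAM, only a minor speed penalty

# The two modes are mutually exclusive; a simple sanity check:
if optimized and optimized_turbo:
    raise ValueError("Enable only one of optimized / optimized_turbo")
```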
-
I had the same problem and was able to partially solve it. I'm still not completely sure why it works, but if you move the model to the CPU (model.cpu()) after creating the sampler but before calling the process_images function, it will not throw an OOM error. Note that you need to move it back afterwards (model.cuda()). This does not work in turbo mode (I'm getting inconsistent-device errors there and am trying to figure out why), but it does allow me to run on a 4 GB device (of the 16xx series, in full precision, optimized mode). After merging neonsecret's fork (#262) I'm able to generate a batch size of 3 at 512x512 on my poor 4 GB card, which is really cool.
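A minimal sketch of the move-to-CPU workaround described above, using a tiny nn.Linear as a stand-in for the actual diffusion model (the sampler creation and process_images call belong to the repo and are only indicated in comments):

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 8)  # stand-in for the real diffusion model

# ... create the sampler here ...

# Workaround: park the model weights on the CPU before generation.
model.cpu()
assert next(model.parameters()).device.type == "cpu"

# ... call process_images(...) here; per the thread, this avoids the
#     OOM on 4 GB cards in full-precision optimized mode ...

# Move the model back to the GPU afterwards (guarded so this sketch
# also runs on machines without CUDA).
if torch.cuda.is_available():
    model.cuda()
```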
-
I have a 4GB VRAM Nvidia card, and I am running this with the "optimized=True" option.
When running without the --precision full --no-half arguments there are no memory problems, but the resulting image is a green square. Adding "--precision full --no-half" to additional_arguments = "" in relauncher.py causes a CUDA out-of-memory error to appear immediately when trying to generate an image from a prompt. The error appears even when trying to generate a 64x64 image.
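For reference, the change described above amounts to this line in relauncher.py (the surrounding context of the file is assumed):

```python
# In relauncher.py, the previously empty additional_arguments string
# is filled with the full-precision flags:
additional_arguments = "--precision full --no-half"
```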
Any ideas on how to solve this? I would really like to use this branch and explore the integration with the upscaling tools.
Interestingly, with the Optimized Stable Diffusion repo (https://github.com/basujindal/stable-diffusion) I can generate images using the optimized script with the "--precision full" argument without any memory problems (there doesn't seem to be a "--no-half" argument in that version).