What just happened? The speed is amazing #1702
Replies: 1 comment
Nothing changed recently in Forge that would result in a speed bump, so maybe you were still on an early build from when Flux support was first being added, although there haven't really been any speed improvements since then either. So it could be a bug that was affecting you, or the Colab is actually giving you the full GPU time now.

Make sure to set the GPU Weights slider to a good value: anything that leaves at least ~2 GB of VRAM free for inference processing (a rough calculation is sketched below). Although with 40 GB of VRAM, I'm sure it already sets the usage high enough to load the whole model at once.

It could also be that the Colab upgraded its hardware. Forge constantly swaps models to and from memory even if you do have enough VRAM, so the initial loading of the model would be slower with less system RAM, and if the system has little RAM for swapping the model, VAE, and text encoders, that adds a bunch of time. So it could just be that the Colab was upgraded to more system RAM or faster drives, or they weren't giving you the full processing power before.

If you want potentially even faster inference without NF4's quality loss, go with the GGUF Q8 version of the model and the GGUF Q8 version of the text encoder (a download sketch follows below). It delivers nearly identical output at half the total size. I don't know if the overhead GGUF requires would make it faster or not if you already have enough VRAM; I just know that for me it's only ~25% slower than NF4 and almost twice as fast as the original model, so you could potentially get it down to 10-15 seconds per generation (see the estimate below).
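A rough back-of-envelope sketch of the GPU Weights math above, assuming the slider takes a value in MB; the exact units and defaults belong to Forge, so treat the numbers as illustrative:

```python
# Rough VRAM budget for Forge's "GPU Weights" slider.
# Assumption: the slider value is in MB and ~2 GB is reserved for inference.
TOTAL_VRAM_MB = 40 * 1024        # e.g. an A100 40GB card
INFERENCE_RESERVE_MB = 2 * 1024  # headroom for activations during inference

gpu_weights_mb = TOTAL_VRAM_MB - INFERENCE_RESERVE_MB
print(f"Suggested GPU Weights setting: {gpu_weights_mb} MB")  # 38912 MB
```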
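If you want to try the Q8 route, here is a minimal sketch of fetching the files with `huggingface_hub`. The repo and file names are commonly used community quants rather than anything this thread specifies, and the target folders are assumptions, so verify all of them against your setup:

```python
# Hypothetical fetch of the GGUF Q8 model and text encoder mentioned above.
# Repo/file names are community quants and may change; check them on the Hub.
from huggingface_hub import hf_hub_download

unet_path = hf_hub_download(
    repo_id="city96/FLUX.1-dev-gguf",
    filename="flux1-dev-Q8_0.gguf",
    local_dir="models/Stable-diffusion",  # assumed Forge model folder
)
te_path = hf_hub_download(
    repo_id="city96/t5-v1_1-xxl-encoder-gguf",
    filename="t5-v1_1-xxl-encoder-Q8_0.gguf",
    local_dir="models/text_encoder",      # assumed text-encoder folder
)
print(unet_path, te_path)
```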
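And a quick sanity check on the 10-15 second estimate, assuming the "almost twice as fast as the original model" ratio holds against your current 20-30 s/image:

```python
# Apply the claimed ~2x speedup of GGUF Q8 over the original fp16 model
# to the 20-30 s/image reported in this thread. Purely illustrative arithmetic.
orig_low_s, orig_high_s = 20, 30
q8_low_s, q8_high_s = orig_low_s / 2, orig_high_s / 2
print(f"Estimated Q8 range: {q8_low_s:.0f}-{q8_high_s:.0f} s/image")  # 10-15
```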
I've been using Forge with Flux dev (the 20 GB original dev model) on Colab with an A100 for the last month, and it usually took over a minute per image. Using the same Colab notebook, I just noticed a huge speed improvement: generations now take 20-30 seconds. That's a 2x to 2.5x improvement from yesterday! It's amazing!