-
Notifications
You must be signed in to change notification settings - Fork 390
vae tiling improvements: encoding support and adaptative overlap #484
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
9edc59f
to
af0491b
Compare
6f49af0
to
6f96903
Compare
Nice! It would also be interesting to see larger tile sizes. Depending on VAE, larger tiles would still fit into the memory budget and would speed things up. |
5495c93
to
e9a8d17
Compare
e9a8d17
to
d103632
Compare
I am not familiar with AI programming, and this is a machine translation, so please ignore me if I say something strange. The However, the brightness issue (issue #588) seems to remain. |
I'm sorry, I don't quite understand what you mean.
You mean this PR doesn't fix it for you? If so this is strange, I can't reproduce the issue with this PR...
As you can see the image made with the master branch is much brighter than the one with this PR. |
Umm.. Sometimes the bottom of the generated image is grayed out; this did not happen when I initialized
I think I have misled you, but there is no doubt that this PR is very good, as you have demonstrated in your examples. |
Hmm that's really weird. I don't see how this could happen.
The result with tiled VAE is always going to be slightly lower quality than the full VAE, because it's lacking information about the whole image. (That's especially true with SD1.x/SD2.x models, because the VAE is quirky ). When using TAE (or with SDXL/SD3 or FLUX), the results are almost completely identical with the PR, while on master, there is the "overexposure" issue on the tiled TAE.
(there is a lot of difference here, mostly because of the kl-f8 vae doesn't handle tiling very well)
(these are almost identical however)
(no tiling) |
I can't vouch for the new code being 100% correct, not knowing the code base, but the approach looks reasonable (but does it have to match how the upscaler model was trained, and if so, does it match? I do not know enough for that). But I can definitely vouch for the old code being incorrect. It assumes that tile overlap is equal in x and y direction, and that's unlikely assuming "somewhat arbitrary" total sizes and a fixed tile size. And I can definitely confirm that #666 is fixed by this change - all five test cases pass (up from only the base case), and the more complex case I discovered this with appears to be fixed as well. Now the difference pictures from pre- and post-upscale pictures (after scaling both back to the same size) look like an edge detector without any noticeable artifacts/biases - precisely what's expected. |
* refactor tile number calculation * support non-square tiles * add env var to change tile overlap * add safeguards and better error messages for SD_TILE_OVERLAP * add safeguards and include overlapping factor for SD_TILE_SIZE * avoid rounding issues when specifying SD_TILE_SIZE as a factor * lower SD_TILE_OVERLAP limit * zero-init empty output buffer
SD_TILE_SIZE
environment variable. (tile size for vae encoding will be about 1.3x bigger, for the same memory requirement)With these changes, vae encoding can now be tiled, and a small (unreported?) issue with the upscaler (or tae) over-brightening some parts of the image is now fixed. This could also be a step toward tiled diffusion support.
Fixes #588, #564 and #666