Tracking issue for SD ecosystem feature parity #69

Keavon · 2023-06-05T00:20:08Z

The intention for this issue is to provide a comprehensive outline of all the core features and capabilities other distributions of Stable Diffusion (primarily A1111) provide. It's a big list, but not all are nearly as high priority as others. Some items in this outline will be turned into GitHub issues for discussing and tracking progress on implementation. Please comment on this issue to suggest additions, clarifications, and sub-features and I'll aim to keep the outline up to date.

Generation methods

Txt2img
Img2img
- In/outpainting
  - Choice of starting with existing image, smeared surrounding colors, latent noise, and latent nothing
- Denoising strength (this is already implemented?)
Depth2Img (via txt2img and img2img)
Regional prompts/latent couple/two shot diffusion (a unique prompt per grid area, like the left half and right half of the image)

Generation parameters

Viewing the image generation progress as it runs (this is very high priority for Graphite)
Negative prompts
CFG scale (is this already implemented?)
Non-square multiple-of-64 resolutions
- Widths and heights as multiples of 8 instead of 64
Infinite prompt token length
Multiple prompts (like space ship (sci-fi) vs. space AND ship (sailing ship in space))
Prompt token weighting (like (beautiful:1.5) tree (with autumn leaves:0.8))
Seed resize (pin a seed and its resolution, then generate at a different resolution or aspect ratio and keep mostly the same image)

Model support

Stylization

ControlNet

Some features are described at https://github.com/Mikubill/sd-webui-controlnet but I don't currently have time to make a list of them. Help with such a list would be appreciated.

This method of promptless inpainting: https://www.reddit.com/r/StableDiffusion/comments/13w28bi/controlnet_and_a1111_devs_discussing_new_inpaint/

Optimizations

VRAM reduction strategies, things like xformers and floating point precision? I don't understand this stuff enough to really get it. Also other methods will remove certain parts of the pipeline from VRAM after that stage has been completed which trades time for VRAM requirements. I'll need help creating a list of out this.

Upscaling

Some upscalers are entirely separate models and are thus likely out of scope. Other upscalers, I think, are part of the SD pipeline. Some are scripts, but I think others are actual models which require being implemented in the actual pipeline? Those ones should probably be included here, but I need help creating a list.

Sampling methods

Other models

Upscaling (ESRGAN, etc.)
CLIP interrogator
Restore faces (GFPGAN, CodeFormer)
(probably more?)

Did I miss something? Probably! Hopefully the community can help me keep this list updated so it's as comprehensive as possible. Thanks ❤️.

Ideally these capabilities would be modular, allowing for composability and opting in and out of specific features at will for any desired image generation pipeline. In our use case with Graphite, we want to put different options into nodes within a node graph so they are user-configurable. (I should also mention that keeping the MIT/Apache 2.0 license is important for Graphite, since our project is also Apache 2.0, so I'd humbly request that some care be taken to not copy from copyleft code which would force this library to change its license, thanks 😃).

The text was updated successfully, but these errors were encountered:

Njasa2k · 2023-06-10T01:09:01Z

katopz · 2023-06-16T06:15:40Z

Nice! fyi ControlNet canny is supported here.

LaurentMazare · 2023-06-17T10:24:41Z

[ ] Viewing the image generation progress as it runs (this is very high priority for Graphite)

@Keavon do you mean that you would want the intermediary images to be available or something else? For the intermediary images, this should already be doable (and available in the command line examples), see for example this snippet.

Keavon · 2023-06-17T15:30:42Z

Yes, viewing the intermediary images while waiting for the final image to be completed. Good to know that's already supported, thanks! Feel free to check off any others that are in my list and already supported, too. Thank you!

LaurentMazare · 2023-08-22T08:02:56Z

Just to mention that I didn't have the time to do much on all these features, one thing though that I've been working on is a new ML framework written in rust called candle.
As an example this includes stable diffusion 1.5 and 2.1 but only text2img at the moment, if there is some interest I can add more there and make a full crate out of this example. The main upside compared to this crate is that there is no dependency on libtorch anymore so deployment is a lot faster, it could run on wasm etc (main downside is that it might not be as optimized as the libtorch version yet but we're working on it).

Keavon · 2023-08-28T00:53:07Z

I was looking at both Candle as well as Burn (which has recently had @Gadersd port both SD 1.4 in Burn and SDXL in Burn) and it definitely looks like one of those frameworks is the path forward in the Rust ecosystem (although I'm curious what their differences are).

I'd really love to help organize interested contributors into a team building a robust, production-ready, pure-Rust Stable Diffusion distro that aims to be as fully-featured as AUTOMATIC1111. I wonder if you have any thoughts or suggestions about that, @LaurentMazare. Likewise @Gadersd was interested in the idea but I should reach out again about next steps.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tracking issue for SD ecosystem feature parity #69

Tracking issue for SD ecosystem feature parity #69

Keavon commented Jun 5, 2023 •

edited

Loading

Njasa2k commented Jun 10, 2023 •

edited

Loading

katopz commented Jun 16, 2023

LaurentMazare commented Jun 17, 2023

Keavon commented Jun 17, 2023

LaurentMazare commented Aug 22, 2023

Keavon commented Aug 28, 2023

Tracking issue for SD ecosystem feature parity #69

Tracking issue for SD ecosystem feature parity #69

Comments

Keavon commented Jun 5, 2023 • edited Loading

Generation methods

Generation parameters

Model support

Stylization

ControlNet

Optimizations

Upscaling

Sampling methods

Other models

Njasa2k commented Jun 10, 2023 • edited Loading

Control Type

Preprocessors

katopz commented Jun 16, 2023

LaurentMazare commented Jun 17, 2023

Keavon commented Jun 17, 2023

LaurentMazare commented Aug 22, 2023

Keavon commented Aug 28, 2023

Keavon commented Jun 5, 2023 •

edited

Loading

Njasa2k commented Jun 10, 2023 •

edited

Loading