
Control


Control Overview

[screenshot: Control processors]

Native control module for the SD.Next Diffusers backend
Can be used for Control generation as well as Image and Text workflows

For a guide on the options and settings, as well as explanations for the controls themselves, see the Control Guide page.

Supported Control Models

  • lllyasviel ControlNet for SD 1.5 and SD-XL models
    Includes ControlNets as well as Reference-only mode and any compatible 3rd party models
    Original ControlNets are 1.4GB each for SD15 and a massive 4.9GB each for SDXL
  • VisLearn ControlNet XS for SD-XL models
    Lightweight ControlNet models for SDXL at only 165MB each, with near-identical results
  • TencentARC T2I-Adapter for SD 1.5 and SD-XL models
    T2I-Adapters provide similar functionality at a much lower resource cost of only 300MB each
  • Kohya Control LLLite for SD-XL models
    LLLite models for SDXL provide lightweight image control at only 46MB each
  • Tencent AI Lab IP-Adapter for SD 1.5 and SD-XL models
    IP-Adapters provide great style-transfer functionality at a much lower resource cost: below 100MB for SD15 and 700MB for SDXL
    IP-Adapters can be combined with ControlNets for more stable results, especially when doing batch/video processing
  • CiaraRowles TemporalNet for SD 1.5 models
    ControlNet model designed to enhance temporal consistency and reduce flickering in batch/video processing

All built-in models are downloaded upon first use and stored in:
/models/controlnet, /models/adapter, /models/xs, /models/lite, /models/processor

Listed below are all models that are supported out-of-the-box:

ControlNet

  • SD15:
Canny, Depth, IP2P, LineArt, LineArt Anime, MLSD, NormalBae, OpenPose,
    Scribble, Segment, Shuffle, SoftEdge, TemporalNet, HED, Tile
  • SDXL:
    Canny Small XL, Canny Mid XL, Canny XL, Depth Zoe XL, Depth Mid XL

Note: only models compatible with the currently loaded base model are listed
Additional ControlNet models in safetensors format can be downloaded manually and placed into the corresponding folder: /models/control/controlnet
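
Under the hood, the Control module drives the diffusers library. As a rough standalone sketch of the same flow (illustration only, not SD.Next internals; the model IDs and file names are examples), loading a ControlNet and running it looks like this:

    # minimal diffusers sketch of a ControlNet run; model IDs and file names are examples
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
    from diffusers.utils import load_image

    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/control_v11p_sd15_canny", torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
    ).to("cuda")

    control_image = load_image("canny_edges.png")  # preprocessed control image, e.g. an edge map
    result = pipe("a red sports car", image=control_image, num_inference_steps=20).images[0]
    result.save("output.png")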

ControlNet XS

  • SDXL:
    Canny, Depth

ControlNet LLLite

  • SDXL:
    Canny, Canny anime, Depth anime, Blur anime, Pose anime, Replicate anime

Note: Control-LLLite support is based on an unofficial implementation and is considered experimental
Additional LLLite models in safetensors format can be downloaded manually and placed into the corresponding folder: /models/control/lite

T2I-Adapter

  • SD15:
    Segment, Zoe Depth, OpenPose, KeyPose, Color, Depth v1, Depth v2, Canny v1, Canny v2, Sketch v1, Sketch v2
  • SDXL:
    Canny XL, Depth Zoe XL, Depth Midas XL, LineArt XL, OpenPose XL, Sketch XL

The built-in SD15 adapters map to the following Hugging Face repositories:

    'Segment': 'TencentARC/t2iadapter_seg_sd14v1',
    'Zoe Depth': 'TencentARC/t2iadapter_zoedepth_sd15v1',
    'OpenPose': 'TencentARC/t2iadapter_openpose_sd14v1',
    'KeyPose': 'TencentARC/t2iadapter_keypose_sd14v1',
    'Color': 'TencentARC/t2iadapter_color_sd14v1',
    'Depth v1': 'TencentARC/t2iadapter_depth_sd14v1',
    'Depth v2': 'TencentARC/t2iadapter_depth_sd15v2',
    'Canny v1': 'TencentARC/t2iadapter_canny_sd14v1',
    'Canny v2': 'TencentARC/t2iadapter_canny_sd15v2',
    'Sketch v1': 'TencentARC/t2iadapter_sketch_sd14v1',
    'Sketch v2': 'TencentARC/t2iadapter_sketch_sd15v2',

Note: only models compatible with the currently loaded base model are listed
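
For comparison, the equivalent standalone flow with a T2I-Adapter in diffusers (again illustration only; model IDs and file names are examples) looks like this:

    # minimal diffusers sketch of a T2I-Adapter run; model IDs and file names are examples
    import torch
    from diffusers import StableDiffusionAdapterPipeline, T2IAdapter
    from diffusers.utils import load_image

    adapter = T2IAdapter.from_pretrained(
        "TencentARC/t2iadapter_canny_sd15v2", torch_dtype=torch.float16
    )
    pipe = StableDiffusionAdapterPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", adapter=adapter, torch_dtype=torch.float16
    ).to("cuda")

    edges = load_image("canny_edges.png")  # preprocessed edge map used as the adapter input
    result = pipe("a red sports car", image=edges).images[0]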

Processors

  • Pose style: OpenPose, DWPose, MediaPipe Face
  • Outline style: Canny, Edge, LineArt Realistic, LineArt Anime, HED, PidiNet
  • Depth style: Midas Depth Hybrid, Zoe Depth, Leres Depth, Normal Bae
  • Segmentation style: SegmentAnything
  • Other: MLSD, Shuffle

Note: Processor download sizes vary from none for built-in processors to anywhere between 200MB and 4.2GB for ZoeDepth-Large
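
A processor turns the input image into the control signal (edge map, pose skeleton, depth map, etc.). SD.Next bundles these internally, but the same idea can be sketched standalone with the controlnet_aux package (file names here are examples):

    # standalone sketch of two processors using the controlnet_aux package
    from controlnet_aux import CannyDetector, OpenposeDetector
    from diffusers.utils import load_image

    img = load_image("photo.png")

    canny = CannyDetector()      # pure OpenCV, nothing to download
    edges = canny(img)           # returns an edge-map image usable as a control input

    pose = OpenposeDetector.from_pretrained("lllyasviel/Annotators")  # downloads weights on first use
    skeleton = pose(img)         # returns a pose-skeleton image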

Segmentation Models

There are 8 Auto-segmentation models available:

  • Facebook SAM ViT Base (357MB)
  • Facebook SAM ViT Large (1.16GB)
  • Facebook SAM ViT Huge (2.56GB)
  • SlimSAM Uniform (106MB)
  • SlimSAM Uniform Tiny (37MB)
  • Rembg Silueta
  • Rembg U2Net
  • Rembg ISNet
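
The Rembg entries wrap the rembg package; as a standalone illustration (file names are examples), background removal with one of these models looks like this:

    # standalone sketch of background removal with the rembg package
    from PIL import Image
    from rembg import new_session, remove

    session = new_session("u2net")          # also available: "silueta", "isnet-general-use"
    img = Image.open("photo.png")
    cutout = remove(img, session=session)   # returns an RGBA image with the background removed
    cutout.save("cutout.png")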

Reference

Reference mode is its own pipeline, so it cannot have multiple units or processors

Workflows

Inputs & Outputs

  • Image -> Image
  • Batch: list of images -> Gallery and/or Video
  • Folder: folder with images -> Gallery and/or Video
  • Video -> Gallery and/or Video

Notes:

  • Input/Output/Preview panels can be minimized by clicking on them
  • For video output, make sure to set video options

Unit

  • A unit is: input plus processor plus control
  • A pipeline consists of any number of configured units
    If a unit uses control modules, all control modules inside the pipeline must be of the same type
    e.g. ControlNet, ControlNet-XS, T2I-Adapter or Reference
  • Each unit can use the primary input or its own override input
  • Each unit can have no processor, in which case it will run control on the input directly
    Use this when working with predefined input templates
  • A unit can have no control, in which case it will run the processor only
  • Any combination of input, processor and control is possible
    For example, two enabled units with process-only will produce a compound processed image, but without control

What-if?

  • If no input is provided, the pipeline will run in txt2img mode
    Can be freely used instead of standard txt2img
  • If none of the units have control or adapter, the pipeline will run in img2img mode using the input image
    Can be freely used instead of standard img2img
  • If you have a processor enabled but no controlnet or adapter loaded,
    the pipeline will run in img2img mode using the processed input
  • If you have multiple processors enabled but no controlnet or adapter loaded,
    the pipeline will run in img2img mode on the blended processed image
  • Output resolution is set to the input resolution by default
    Use resize settings to force any resolution
  • Resize operation can run before processing (on the input image) or after it (on the output image)
  • Using video input will run the pipeline on each frame unless skip frames is set
    Video output is a standard list of images (gallery) and can optionally be encoded into a video file
    The video file can be interpolated using RIFE for smoother playback

Overrides

  • Control can be based on the main input, or each individual unit can have its own override input
  • By default, control runs in control+txt2img mode
  • If an init image is provided, it runs in control+img2img mode
    The init image can be the same as the control image or a separate one
  • IP adapter can be applied to any workflow
  • IP adapter can use the same input as the control input or a separate one, as shown in the sketch below
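
As a rough standalone sketch of the IP adapter idea in diffusers (illustration only; the model IDs, weight names, and file names are examples, not SD.Next internals):

    # minimal diffusers sketch of IP-Adapter style transfer; IDs and file names are examples
    import torch
    from diffusers import StableDiffusionPipeline
    from diffusers.utils import load_image

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")
    pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")
    pipe.set_ip_adapter_scale(0.6)  # lower values weaken the influence of the style image

    style = load_image("style_reference.png")
    result = pipe("a city street at night", ip_adapter_image=style).images[0]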

Inpaint

  • Inpaint workflow is triggered when an input image is provided in inpaint mode
  • Inpaint mode can be used with image-to-image or controlnet workflows
  • Other unit types such as T2I, XS or Lite do not support inpaint mode

Outpaint

  • Outpaint workflow is triggered when an input image is provided in outpaint mode
  • Outpaint mode can be used with image-to-image or controlnet workflows
  • Other unit types such as T2I, XS or Lite do not support outpaint mode
  • Recommendation is to increase denoising strength to at least 0.8, since the outpainted area is blank and needs to be filled from noise
  • How closely the outpaint follows the input image can be controlled by the overlap setting: the higher the overlap, the more of the original image becomes part of the outpaint process

Logging

To enable extra logging for troubleshooting purposes,
set the following environment variables before running SD.Next:

  • Linux:

    export SD_CONTROL_DEBUG=true
    export SD_PROCESS_DEBUG=true
    ./webui.sh --debug

  • Windows:

    set SD_CONTROL_DEBUG=true
    set SD_PROCESS_DEBUG=true
    webui.bat --debug

Note: starting with debug info enabled also enables Test mode in the Control module

Limitations / TODO

Known issues

  • Using model offload can cause Control models to be on the wrong device at the time of execution
    Example error message:

    Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same

    Workaround: Disable model offload in Settings -> Diffusers and use the move model option instead

  • Issues when trying to use DWPose after its installation has failed
    Example error message:

    Control processor DWPose: DLL load failed while importing _ext

    Workaround: Activate the venv and run the following command to install DWPose dependencies manually:

        pip install --upgrade --no-deps --force-reinstall openmim==0.3.9 mmengine==0.10.4 mmcv==2.1.0 mmpose==1.3.1 mmdet==3.3.0

Future
