ComfyUI_StoryDiffusion

Use different ID migration methods to create stories in ComfyUI

Original methods from:

StoryDiffusion,
MS-Diffusion,
StoryMaker,
Consistory,
Kolors,
PuLID,
Flux,
PhotoMaker,
IP-Adapter,
InfiniteYou,
UNO,
RealCustom,
InstantCharacter,
DreamO,
Bagel,
OmniConsistency

Updates:

2025/06/25
Added DreamO v1.1 support. Download the corresponding sft and dpo LoRAs, do not rename them, and place them in the lora folder.

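Added OmniConsistency support for the single-file UNet in fp8, as well as gguf and svdq (these load, but LoRA is not supported with them, so results cannot be reproduced; not recommended). Single-file loading is not as fast as repo loading.
OmniConsistency is not an ID migration method; it was ported so that it can be loaded with the regular FLUX diffusers pipeline. Several quantization modes are built in (not yet complete; currently only repo loading is supported). With 12G of VRAM or less, nf4 is enough (a 1024*1024 image takes about 50 seconds on 12G).
Added Bagel model support with int8 and nf4 quantization (the official repo uses the PR from 十字鱼). With an input image it runs in edit mode; without one it runs text-to-image. With nf4 quantization, peak VRAM is about 7G and actual usage while running is a little over 4G. Editing quality under nf4 is mediocre.
Implemented DreamO's ip, id, and style modes. For two roles in one image use ip+ip; ip is the default mode for every input. Images with a face can use ip or id (with id, the clothing input may be left unconnected). The pos and neg LoRAs are enabled by default when they are present in the lora directory; otherwise the node runs in 3-LoRA mode. To enable id or style mode, enter id or style in the extra input.
Added two ID migration methods: RealCustom (SDXL) and InstantCharacter (FLUX). Benchmarked on a 4070 12G, both methods are slow. InstantCharacter supports several quantization modes; Nunchaku (svdq) quantization is very fast but pointless, because the IP layers are not loaded. See the example images and the new workflow file for details. RealCustom needs 6 single-file models; InstantCharacter needs 2 clip_vision models in repo form (no time to change this yet). 16G or more of VRAM works better.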
Use UNO to put two roles in the same image in the FLUX workflow; a prompt example is shown in the figure.
Fixed the dual-role prompt error in MS-Diffusion. With MS-Diffusion, role prompts should be written as [A] a (man)..., [B] a (woman)...; scene prompts need no change: dual-role mode is still enabled when [A] ... [B] ... appear in the same sentence.
Added UNO support. Only the single-file FLUX model (27G) and UNO's LoRA are needed. Please enable fp8 quantization and test with the storydiffusion_workflow.json workflow. Fixed a bug with overly long tokens.
Added InfiniteYou svdq v0.2 support; InfiniteYou works normally once svdq is updated to v0.2 (download the wheel).
1. Modified the model loading process and updated to V2; if you prefer the old one, you can download V1.0. 2. Please use storydiffusion_workflow.json, which integrates the main workflows. 3. Removed some outdated features.
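The MS-Diffusion role prompt format mentioned in the updates above ([A] a (man)..., [B] a (woman)...) can be illustrated with a small parser. This is only a sketch under assumptions: the function names `parse_role_prompts` and `is_dual_role_scene` are hypothetical, the sample prompts are made up, and this is not the node's actual parsing code.

```python
import re

# "[A] a (man) ..." -> tag "A", text "a (man) ..."
ROLE_LINE = re.compile(r"^\s*\[(?P<tag>[^\]]+)\]\s*(?P<text>.+?)\s*$")
# The class word is the first parenthesised word, e.g. "(man)".
CLASS_WORD = re.compile(r"\(([^)]+)\)")


def parse_role_prompts(lines):
    """Map each role tag to its prompt text and class word (hypothetical helper)."""
    roles = {}
    for line in lines:
        match = ROLE_LINE.match(line)
        if match is None:
            continue  # not a role line
        text = match.group("text")
        class_match = CLASS_WORD.search(text)
        roles[match.group("tag")] = {
            "prompt": text,
            "class_word": class_match.group(1) if class_match else None,
        }
    return roles


def is_dual_role_scene(scene_prompt, tags):
    """True when every role tag appears in the same scene sentence."""
    return all(f"[{tag}]" in scene_prompt for tag in tags)


if __name__ == "__main__":
    roles = parse_role_prompts([
        "[A] a (man) wearing a suit",
        "[B] a (woman) in a red dress",
    ])
    print(roles)
    print(is_dual_role_scene("[A] and [B] walk in the park", roles))
```

A role line without a parenthesised class word gets `class_word` set to `None`, which makes a malformed role prompt easy to spot before running the workflow.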

1.Installation

In the ./ComfyUI/custom_nodes directory, run the following:

git clone https://github.com/smthemex/ComfyUI_StoryDiffusion.git

2.Requirements

pip install -r requirements.txt

If you use story (photomaker V2), pulid-flux, kolor, story-maker, or infiniteyou, you also need to install the insightface library:

pip install insightface

If any other module is missing, install it with pip.
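Before starting ComfyUI, it can help to check which optional modules are missing. The sketch below uses only the standard library; the mode labels in `INSIGHTFACE_MODES` are illustrative names chosen here (they follow the list above but are not identifiers from the node's code), and both helper functions are hypothetical.

```python
import importlib.util

# Modes that the README says need insightface (labels are illustrative).
INSIGHTFACE_MODES = {
    "story-photomaker-v2",
    "pulid-flux",
    "kolor",
    "story-maker",
    "infiniteyou",
}


def needs_insightface(mode):
    """Return True if the given mode label is one that needs insightface."""
    return mode.lower() in INSIGHTFACE_MODES


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    mode = "pulid-flux"
    wanted = ["insightface"] if needs_insightface(mode) else []
    for name in missing_modules(wanted):
        print(f"missing module: {name} (try: pip install {name})")
```

`importlib.util.find_spec` only looks the module up and does not import it, so the check stays fast even for heavy packages.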

3 Models

3.1 story_diffusion mode (plain story)

3.1.1 Any SDXL single-file checkpoint

├── ComfyUI/models/checkpoints/
|             ├── juggernautXL_v8Rundiffusion.safetensors

3.1.2 If using image-to-image, download photomaker-v1.bin or photomaker-v2.bin

├── ComfyUI/models/photomaker/
|             ├── photomaker-v1.bin or photomaker-v2.bin
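A quick way to confirm that the files are where this section expects them is a small path check. This is a minimal sketch: `missing_model_files` and `has_any` are hypothetical helpers, and `ComfyUI/models` is assumed to be the models directory of your install.

```python
from pathlib import Path

# Relative to ComfyUI/models, as listed in section 3.1.2.
PHOTOMAKER_CANDIDATES = (
    "photomaker/photomaker-v1.bin",
    "photomaker/photomaker-v2.bin",
)


def missing_model_files(models_dir, relative_paths):
    """Return the relative paths that do not exist as files under models_dir."""
    root = Path(models_dir)
    return [rel for rel in relative_paths if not (root / rel).is_file()]


def has_any(models_dir, candidates):
    """True if at least one of the candidate files exists under models_dir."""
    root = Path(models_dir)
    return any((root / rel).is_file() for rel in candidates)


if __name__ == "__main__":
    models_dir = "ComfyUI/models"  # assumption: adjust to your install location
    if not has_any(models_dir, PHOTOMAKER_CANDIDATES):
        print("image-to-image needs photomaker-v1.bin or photomaker-v2.bin in",
              str(Path(models_dir) / "photomaker"))
```

The same two helpers can be reused for the other modes below by passing the relative paths shown in each directory tree.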

3.2 MS-diffusion mode (2 roles in 1 image)

3.2.1 Download ms_adapter.bin

├── ComfyUI/models/
|             ├── photomaker/ms_adapter.bin
|             ├── clip_vision/clip_vision_g.safetensors(2.35G) or CLIP-ViT-bigG-14-laion2B-39B-b160k.safetensors(3.43G)

3.2.2 If using ControlNet in ms-diffusion, you need the corresponding ControlNet model (for Control_img image preprocessing, please use other nodes);

├── ComfyUI/models/controlnet/   
|     ├──xinsir/controlnet-openpose-sdxl-1.0    
|     ├──... other similar models

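3.3 kolors face mode (IP is no longer supported; the error with newer versions has been fixed)

Kwai-Kolors: you do not need to download everything. Apart from the config files, only the unet and vae models are needed. Download Kolors-IP-Adapter-FaceID-Plus. Download chatglm3-8bit.safetensors or the fp16 version (KJ's single-file clip model). The "DIAMONIK7777/antelopev2" insightface models are downloaded automatically.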

├── ComfyUI/models
|             ├── /photomaker/ipa-faceid-plus.bin
|             ├── clip/chatglm3-8bit.safetensors
|             ├── clip_vision/clip-vit-large-patch14.safetensors  # Kolors-IP-Adapter-Plus or Kolors-IP-Adapter-FaceID-Plus using same checkpoints.

Kolors repo file structure

├── any path/Kwai-Kolors/Kolors
|      ├──model_index.json
|      ├──vae
|          ├── config.json
|          ├── diffusion_pytorch_model.safetensors (rename from diffusion_pytorch_model.fp16.safetensors )
|      ├──unet
|          ├── config.json
|          ├── diffusion_pytorch_model.safetensors (rename from diffusion_pytorch_model.fp16.safetensors )
|      ├──tokenizer
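|          ├── tokenization_chatglm.py # new version; fixes the error with newer diffusers versions
|          ├── ... # all files
|       ├── text_encoder
|          ├── modeling_chatglm.py # new version; fixes the error with newer diffusers versions
|          ├── tokenization_chatglm.py # new version; fixes the error with newer diffusers versions
|          ├── ... # all files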
|       ├── scheduler
|          ├── scheduler_config.json
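The tree above notes that the vae and unet weights are renamed from diffusion_pytorch_model.fp16.safetensors. That rename can be scripted; the sketch below is an illustration only, with `rename_fp16_weights` as a hypothetical helper and the repo path as a placeholder for wherever you saved Kwai-Kolors/Kolors.

```python
from pathlib import Path

FP16_NAME = "diffusion_pytorch_model.fp16.safetensors"
TARGET_NAME = "diffusion_pytorch_model.safetensors"


def rename_fp16_weights(kolors_dir, subfolders=("vae", "unet")):
    """Rename the fp16 weight file in each subfolder; return the subfolders changed."""
    renamed = []
    for sub in subfolders:
        folder = Path(kolors_dir) / sub
        src = folder / FP16_NAME
        dst = folder / TARGET_NAME
        # Skip folders that are already renamed or have no fp16 file.
        if src.is_file() and not dst.exists():
            src.rename(dst)
            renamed.append(sub)
    return renamed


if __name__ == "__main__":
    # Placeholder path: point this at your local Kolors repo folder.
    print(rename_fp16_weights("any_path/Kwai-Kolors/Kolors"))
```

The function never overwrites an existing diffusion_pytorch_model.safetensors, so running it twice is harmless.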

3.4 flux_pulid mode

torch must be > 0.24.0 and optimum-quanto must be >= 0.2.4:

pip install -U optimum-quanto
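To see whether the installed optimum-quanto already satisfies the 0.2.4 minimum stated above, the version can be read from package metadata. This is a minimal sketch with hypothetical helper names; the version comparison is deliberately simple (leading numeric parts only) and is not a full PEP 440 implementation.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Turn '0.2.4' or '2.4.0+cu121' into a tuple of leading integers."""
    parts = []
    for piece in version.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("optimum-quanto")
    if found is None or not meets_minimum(found, "0.2.4"):
        print("run: pip install -U optimum-quanto")
```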

Download EVA02_CLIP_L_336_psz14_s6B.pt, pulid_flux_v0.9.0.safetensors, and flux1-dev-fp8.safetensors; DIAMONIK7777/antelopev2 is downloaded automatically.

├── ComfyUI/models/
|             ├── photomaker/pulid_flux_v0.9.0.safetensors
|             ├── clip_vision/EVA02_CLIP_L_336_psz14_s6B.pt
|             ├── diffusion_models/flux1-dev-fp8.safetensors
├── ComfyUI/models/clip/
|             ├── t5xxl_fp8_e4m3fn.safetensors
|             ├── clip_l.safetensors

3.5 story-maker mode
Download mask.bin (can be downloaded automatically), buffalo_l (downloaded automatically), and RMBG-1.4 (downloaded automatically).

├── ComfyUI/models/
|         ├── photomaker/mask.bin
|         ├── clip_vision/clip_vision_H.safetensors  #2.4G base in laion/CLIP-ViT-H-14-laion2B-s32B-b79K
├── ComfyUI/models/buffalo_l/
|         ├── 1k3d68.onnx
|         ├── ...

3.6 InfiniteYou mode

3.6.1 flux transformer repo or kj fp8

├── any_path/FLUX.1-dev/transformer
|          ├── config.json
|          ├──diffusion_pytorch_model-00001-of-00003.safetensors
|          ├──diffusion_pytorch_model-00002-of-00003.safetensors
|          ├──diffusion_pytorch_model-00003-of-00003.safetensors
|          ├── diffusion_pytorch_model.safetensors.index.json

or

├── ComfyUI/models/
|             ├── diffusion_models/flux1-dev-fp8.safetensors #
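When using the sharded transformer repo layout from 3.6.1, an incomplete download is a common cause of load failures. The sketch below reads the index file and reports shard files that are absent. It assumes the index follows the usual Hugging Face sharded-checkpoint layout with a `weight_map` entry; `missing_shards` is a hypothetical helper, not part of this node.

```python
import json
from pathlib import Path

INDEX_NAME = "diffusion_pytorch_model.safetensors.index.json"


def missing_shards(transformer_dir):
    """Return shard file names listed in the index but absent on disk."""
    folder = Path(transformer_dir)
    index = json.loads((folder / INDEX_NAME).read_text(encoding="utf-8"))
    # weight_map maps each parameter name to the shard file that stores it.
    shard_names = sorted(set(index["weight_map"].values()))
    return [name for name in shard_names if not (folder / name).is_file()]


if __name__ == "__main__":
    # Placeholder path: point this at your local FLUX.1-dev/transformer folder.
    print(missing_shards("any_path/FLUX.1-dev/transformer"))
```

An empty list means every shard named in the index is present; it does not verify that the files themselves are intact.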

3.6.2 InfiniteYou controlnet from here (required, repo format); you can use sim_stage1 or aes_stage2

├── any_path/sim_stage1/
|         ├── image_proj_model.bin
|         ├── InfuseNetModel/
|             ├── diffusion_pytorch_model-00001-of-00002.safetensors
|             ├── diffusion_pytorch_model-00002-of-00002.safetensors
|             ├── diffusion_pytorch_model.safetensors.index.json
|             ├── config.json

or

├── any_path/aes_stage2/
|         ├── ...

3.6.3 lora optional from here
3.6.4 insightface

├── ComfyUI/models/antelopev2/   
|     ├──1k3d68.onnx  
|     ├──...

3.6.5 recognition_arcface_ir_se50.pth (from here) is downloaded automatically into the "Lib\site-packages\facexlib\weights" directory of ComfyUI's embedded Python
3.6.6 If using gguf quantization (optional), download the gguf from here and fill in the local path in the 'easyfunction_lite' node's 'select_method'

├── ComfyUI/models/gguf
|         ├── flux1-dev-Q8_0.gguf  #flux1-dev-Q6_K.gguf

3.6.7 If using svdquant (optional), download the svdquant repo from here and fill in the local path in the 'easyfunction_lite' node's 'select_method'

3.7 UNO mode
Download the LoRA dit_lora.safetensors; use fp8 if VRAM < 24G.

├── ComfyUI/models/
|             ├── diffusion_models/flux1-dev.safetensors  #
|             ├── loras/dit_lora.safetensors #
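The fp8 rule for UNO (use fp8 when VRAM is below 24G) can be written as a tiny check. This is only a sketch: `uno_use_fp8` and `detect_vram_gb` are hypothetical helpers, and the VRAM detection assumes PyTorch with CUDA is installed, returning `None` otherwise.

```python
def uno_use_fp8(vram_gb, threshold_gb=24.0):
    """Apply the README rule: enable fp8 when VRAM is below the threshold."""
    return vram_gb < threshold_gb


def detect_vram_gb():
    """Return total memory of GPU 0 in GiB, or None if it cannot be detected."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory / 1024**3


if __name__ == "__main__":
    vram = detect_vram_gb()
    if vram is None:
        print("could not detect VRAM; decide manually")
    else:
        print(f"{vram:.1f} GiB detected, use fp8: {uno_use_fp8(vram)}")
```

Note that a card sold as 24G typically reports slightly less than 24 GiB, so the detected value may still select fp8; pass the nominal size or a lower `threshold_gb` if that is not what you want.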

3.8 RealCustom mode
Download everything from bytedance-research/RealCustom (access to sites outside mainland China may be required)

├── ComfyUI/models/
|             ├── diffusion_models/sdxl-unet.bin  #
|             ├── photomaker/RealCustom_highres.pth  #
|             ├── clip/clip_l # standard model; no need to download it again
|             ├── clip/clip_g # standard model; no need to download it again
|             ├── clipvison/vit_so400m_patch14_siglip_384.bin #vit_so400m_patch14_siglip_384
|             ├── clipvison/vit_large_patch14_reg4_dinov2.bin #vit_large_patch14_reg4_dinov2.lvd142m

3.9 InstantCharacter mode
Download instantcharacter_ip-adapter.bin
Repos: google/siglip-so400m-patch14-384 and facebook/dinov2-giant

├── ComfyUI/models/photomaker/instantcharacter_ip-adapter.bin
├──  anypath/google/siglip-so400m-patch14-384
├──  anypath/facebook/dinov2-giant

3.10 DreamO mode
Download dreamo
FLUX repo: flux
BEN2 pth: BEN2_Base.pth (or let it download automatically)
Turbo LoRA: alimama-creative/FLUX.1-Turbo-Alpha

├── ComfyUI/models/loras/
       ├──dreamo_cfg_distill.safetensors
       ├──dreamo.safetensors
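       ├──dreamo_quality_lora_neg.safetensors # optional, v1.0; works without it; keep it in the same directory as the two LoRAs above
       ├──dreamo_quality_lora_pos.safetensors # optional, v1.0; works without it; keep it in the same directory as the two LoRAs above
       ├──dreamo_dpo_lora.safetensors # optional, v1.1; works without it; keep it in the same directory as the two LoRAs above
       ├──dreamo_sft_lora.safetensors # optional, v1.1; works without it; keep it in the same directory as the two LoRAs above
├── ComfyUI/models/photomaker/
       ├──FLUX.1-Turbo-Alpha.safetensors # the turbo LoRA, renamed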
├──  anypath/black-forest-labs/FLUX.1-dev
├──  ComfyUI/models/BEN2_Base.pth #or any path
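Since several DreamO LoRAs are optional, it can be useful to list which ones are actually present in the loras folder. The sketch below is an illustration under assumptions: `scan_dreamo_loras` is a hypothetical helper, the file names are taken from the tree above, and treating dreamo.safetensors and dreamo_cfg_distill.safetensors as the required pair is inferred from the "optional" comments rather than from the node's code.

```python
from pathlib import Path

# Assumed required: the two LoRAs that carry no "optional" comment in the tree.
REQUIRED = ("dreamo.safetensors", "dreamo_cfg_distill.safetensors")
OPTIONAL_PAIRS = {
    "v1.0 quality pos/neg": (
        "dreamo_quality_lora_pos.safetensors",
        "dreamo_quality_lora_neg.safetensors",
    ),
    "v1.1 sft/dpo": (
        "dreamo_sft_lora.safetensors",
        "dreamo_dpo_lora.safetensors",
    ),
}


def scan_dreamo_loras(lora_dir):
    """Report missing required LoRAs and which optional pairs are complete."""
    folder = Path(lora_dir)

    def present(name):
        return (folder / name).is_file()

    return {
        "missing_required": [name for name in REQUIRED if not present(name)],
        "optional_pairs": {
            label: all(present(name) for name in names)
            for label, names in OPTIONAL_PAIRS.items()
        },
    }


if __name__ == "__main__":
    print(scan_dreamo_loras("ComfyUI/models/loras"))  # adjust to your install
```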

3.11 Bagel mode
download BAGEL-7B-MoT

├── ComfyUI/models/vae/
       ├──ae.safetensors # flux or BAGEL-7B-MoT

├──  Any/path/ByteDance-Seed/BAGEL-7B-MoT/
       ├──all files

3.12 OmniConsistency mode
flux repo: flux
OmniConsistency

├── ComfyUI/models/photomaker/
       ├──OmniConsistency.safetensors # 
├── ComfyUI/models/loras/
       ├── any flux loras

4 Example

4.1 story-diffusion

txt2img example

* img2img example

4.2 ms-diffusion

txt2img, two roles in one image

* img2img, two roles in one image

4.3 story-maker or story-and-maker

story-and-maker

* story-maker

4.4 consistory

Only one role is supported; use example.json

4.5 kolor-face

img2img kolor face

4.6 pulid-flux

Note: the repo mode shown in the example image has been removed; follow the workflow in example.json

4.7 infiniteyou

repo nf4. Note: the nodes have changed; follow the workflow in example.json

* gguf

* svdq, works normally after upgrading to v0.2

4.8 UNO

Dual-role example (two roles in one image)

4.9 RealCustom

4.10 InstantCharacter

4.11 DreamO

nf4

* fp8 unet or int8 and dual roles

* nf4 style

* nf4 id

4.12 Bagel

nf4 image2image

* nf4 txt2image

4.13 OmniConsistency

nf4 image2image

4.14 comfyUI classic (classic ComfyUI mode; it can connect to any ComfyUI-compatible workflow, mainly to make the multi-role clip easy to use)

any mode SD1.5 SDXL SD3.5 FLUX...

5 Citation

StoryDiffusion

@article{zhou2024storydiffusion,
  title={StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation},
  author={Zhou, Yupeng and Zhou, Daquan and Cheng, Ming-Ming and Feng, Jiashi and Hou, Qibin},
  journal={arXiv preprint arXiv:2405.01434},
  year={2024}
}

IP-Adapter

@article{ye2023ip-adapter,
  title={IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models},
  author={Ye, Hu and Zhang, Jun and Liu, Sibo and Han, Xiao and Yang, Wei},
  booktitle={arXiv preprint arxiv:2308.06721},
  year={2023}
}

MS-Diffusion

@misc{wang2024msdiffusion,
  title={MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance}, 
  author={X. Wang and Siming Fu and Qihan Huang and Wanggui He and Hao Jiang},
  year={2024},
  eprint={2406.07209},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

photomaker

@inproceedings{li2023photomaker,
  title={PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding},
  author={Li, Zhen and Cao, Mingdeng and Wang, Xintao and Qi, Zhongang and Cheng, Ming-Ming and Shan, Ying},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024}
}

kolors

@article{kolors,
  title={Kolors: Effective Training of Diffusion Model for Photorealistic Text-to-Image Synthesis},
  author={Kolors Team},
  journal={arXiv preprint},
  year={2024}
}

PuLID

@article{guo2024pulid,
  title={PuLID: Pure and Lightning ID Customization via Contrastive Alignment},
  author={Guo, Zinan and Wu, Yanze and Chen, Zhuowei and Chen, Lang and He, Qian},
  journal={arXiv preprint arXiv:2404.16022},
  year={2024}
}

Consistory

@article{tewel2024training,
  title={Training-free consistent text-to-image generation},
  author={Tewel, Yoad and Kaduri, Omri and Gal, Rinon and Kasten, Yoni and Wolf, Lior and Chechik, Gal and Atzmon, Yuval},
  journal={ACM Transactions on Graphics (TOG)},
  volume={43},
  number={4},
  pages={1--18},
  year={2024},
  publisher={ACM New York, NY, USA}
}

infiniteyou

@article{jiang2025infiniteyou,
  title={{InfiniteYou}: Flexible Photo Recrafting While Preserving Your Identity},
  author={Jiang, Liming and Yan, Qing and Jia, Yumin and Liu, Zichuan and Kang, Hao and Lu, Xin},
  journal={arXiv preprint},
  volume={arXiv:2503.16418},
  year={2025}
}

svdquant

@inproceedings{
  li2024svdquant,
  title={SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models},
  author={Li*, Muyang and Lin*, Yujun and Zhang*, Zhekai and Cai, Tianle and Li, Xiuyu and Guo, Junxian and Xie, Enze and Meng, Chenlin and Zhu, Jun-Yan and Han, Song},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025}
}

GGUF FLUX LICENSE

@article{wu2025less,
  title={Less-to-More Generalization: Unlocking More Controllability by In-Context Generation},
  author={Wu, Shaojin and Huang, Mengqi and Wu, Wenxu and Cheng, Yufeng and Ding, Fei and He, Qian},
  journal={arXiv preprint arXiv:2504.02160},
  year={2025}
}

@inproceedings{huang2024realcustom,
  title={RealCustom: narrowing real text word for real-time open-domain text-to-image customization},
  author={Huang, Mengqi and Mao, Zhendong and Liu, Mingcong and He, Qian and Zhang, Yongdong},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={7476--7485},
  year={2024}
}
@article{mao2024realcustom++,
  title={Realcustom++: Representing images as real-word for real-time customization},
  author={Mao, Zhendong and Huang, Mengqi and Ding, Fei and Liu, Mingcong and He, Qian and Zhang, Yongdong},
  journal={arXiv preprint arXiv:2408.09744},
  year={2024}
}

@article{tao2025instantcharacter,
  title={InstantCharacter: Personalize Any Characters with a Scalable Diffusion Transformer Framework},
  author={Tao, Jiale and Zhang, Yanbing and Wang, Qixun and Cheng, Yiji and Wang, Haofan and Bai, Xu and Zhou, Zhengguang and Li, Ruihuang and Wang, Linqing and Wang, Chunyu and others},
  journal={arXiv preprint arXiv:2504.12395},
  year={2025}
}

DreamO

@article{deng2025bagel,
  title   = {Emerging Properties in Unified Multimodal Pretraining},
  author  = {Deng, Chaorui and Zhu, Deyao and Li, Kunchang and Gou, Chenhui and Li, Feng and Wang, Zeyu and Zhong, Shu and Yu, Weihao and Nie, Xiaonan and Song, Ziang and Shi, Guang and Fan, Haoqi},
  journal = {arXiv preprint arXiv:2505.14683},
  year    = {2025}
}

@inproceedings{Song2025OmniConsistencyLS,
  title={OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data},
  author={Yiren Song and Cheng Liu and Mike Zheng Shou},
  year={2025},
  url={https://api.semanticscholar.org/CorpusID:278905729}
}
