PhotoDoodle

PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data
Shijie Huang, Yiren Song, Yuxuan Zhang, Hailong Guo, Xueyin Wang, Mike Zheng Shou, and Jiaming Liu
Show Lab, National University of Singapore
Tiamat AI

Community Resources

smthemex/ComfyUI_PhotoDoodle: Integrates PhotoDoodle into ComfyUI nodes; inference requires 12 GB of GPU memory.

ameerazam08/PhotoDoodle-Image-Edit-GPU: PhotoDoodle deployed as a Hugging Face Space.

Quick Start

Configuration

1. Environment setup

git clone git@github.com:showlab/PhotoDoodle.git
cd PhotoDoodle

conda create -n doodle python=3.11.10
conda activate doodle

2. Requirements installation

pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
pip install --upgrade -r requirements.txt
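
After installation, it can be worth confirming that the installed versions match the pinned ones. The sketch below uses only the standard library; the `PINNED` mapping mirrors the `pip install` command above, and the helper name `check_pins` is hypothetical, not part of this repository.

```python
from importlib import metadata

# Versions copied from the pip command above. Local builds may carry a
# suffix such as "+cu124", which is ignored when comparing.
PINNED = {"torch": "2.5.1", "torchvision": "0.20.1", "torchaudio": "2.5.1"}


def check_pins(pins=PINNED):
    """Return {package: (expected, installed_or_None, matches)}."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        base = installed.split("+")[0] if installed else None
        report[name] = (expected, installed, base == expected)
    return report


if __name__ == "__main__":
    for name, (expected, installed, ok) in check_pins().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: expected {expected}, found {installed} [{status}]")
```

A package that is missing is reported with `None` as its installed version rather than raising, so the check can run before or after the requirements step.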

2. Inference

We provide a diffusers pipeline integration for our model and have uploaded the model weights to Hugging Face. The model can be used as in the example below:

from src.pipeline_pe_clone import FluxPipeline
import torch
from PIL import Image

pretrained_model_name_or_path = "black-forest-labs/FLUX.1-dev"
pipeline = FluxPipeline.from_pretrained(
    pretrained_model_name_or_path,
    torch_dtype=torch.bfloat16,
).to('cuda')

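# Load the base PhotoDoodle LoRA and fuse it into the model weights,
# then unload the adapter so an effect-specific LoRA can be loaded on top.
pipeline.load_lora_weights("nicolaus-huang/PhotoDoodle", weight_name="pretrain.safetensors")
pipeline.fuse_lora()
pipeline.unload_lora_weights()

# Load the effect-specific LoRA; its name is also the trigger word used in the prompt.
pipeline.load_lora_weights("nicolaus-huang/PhotoDoodle", weight_name="sksmagiceffects.safetensors")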

height=768
width=512

validation_image = "assets/1.png"
validation_prompt = "add a halo and wings for the cat by sksmagiceffects"
# PIL's resize() expects (width, height), the reverse of the pipeline's argument order.
condition_image = Image.open(validation_image).resize((width, height)).convert("RGB")

result = pipeline(prompt=validation_prompt, 
                  condition_image=condition_image,
                  height=height,
                  width=width,
                  guidance_scale=3.5,
                  num_inference_steps=20,
                  max_sequence_length=512).images[0]

result.save("output.png")

Alternatively, run the inference script:

python inference.py

3. Weights

You can download the trained PhotoDoodle checkpoints for inference. The available models are listed below; each checkpoint name is also its trigger word.

The pretrained checkpoint must be loaded and fused before any of the other models is loaded.

| Model | Description | Resolution |
| --- | --- | --- |
| pretrained | PhotoDoodle model trained on `SeedEdit` dataset | 768, 768 |
| sksmonstercalledlulu | PhotoDoodle model trained on `Cartoon monster` dataset | 768, 512 |
| sksmagiceffects | PhotoDoodle model trained on `3D effects` dataset | 768, 512 |
| skspaintingeffects | PhotoDoodle model trained on `Flowing color blocks` dataset | 768, 512 |
| sksedgeeffect | PhotoDoodle model trained on `Hand-drawn outline` dataset | 768, 512 |
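
Because each checkpoint name doubles as its trigger word, it can help to keep the table in code and check a prompt before running inference. The following is a minimal sketch, not part of this repository. Two points are assumptions: the resolutions are read as (height, width), matching the `height=768`, `width=512` values in the inference example, and the `<name>.safetensors` file naming is extrapolated from the single `sksmagiceffects.safetensors` file used there. The names `CHECKPOINTS` and `resolve_checkpoint` are hypothetical.

```python
# Effect checkpoints from the table above, read as (height, width).
CHECKPOINTS = {
    "sksmonstercalledlulu": (768, 512),
    "sksmagiceffects": (768, 512),
    "skspaintingeffects": (768, 512),
    "sksedgeeffect": (768, 512),
}


def resolve_checkpoint(prompt):
    """Find the effect checkpoint whose trigger word appears in the prompt.

    Returns (weight_name, height, width). The "<name>.safetensors" file
    naming is an assumption, not confirmed for every checkpoint.
    """
    matches = [name for name in CHECKPOINTS if name in prompt]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one trigger word, found {matches}")
    name = matches[0]
    height, width = CHECKPOINTS[name]
    return f"{name}.safetensors", height, width


weight_name, height, width = resolve_checkpoint(
    "add a halo and wings for the cat by sksmagiceffects"
)
print(weight_name, height, width)
```

Raising on a missing or ambiguous trigger word catches prompts that would otherwise run without the intended effect LoRA.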

4. Dataset

4.1 Settings for dataset

Training uses a paired dataset stored in a .jsonl file, with one JSON object per line. Each entry gives the source image path, the target (modified) image path, and a caption describing the modification.

Example format:

{"source": "path/to/source.jpg", "target": "path/to/modified.jpg", "caption": "Instruction of modifications"}
{"source": "path/to/source2.jpg", "target": "path/to/modified2.jpg", "caption": "Another instruction"}
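
A file in this format can be read and validated with the standard library alone. The sketch below is one possible loader, not the repository's own data pipeline; the function name `load_pairs` is hypothetical, and the only keys it checks are the three shown in the example format.

```python
import json
from pathlib import Path

REQUIRED_KEYS = ("source", "target", "caption")


def load_pairs(jsonl_path):
    """Read a paired-editing dataset, one JSON object per non-empty line."""
    pairs = []
    with Path(jsonl_path).open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between entries
            entry = json.loads(line)
            missing = [k for k in REQUIRED_KEYS if k not in entry]
            if missing:
                raise ValueError(f"line {line_no}: missing keys {missing}")
            pairs.append(entry)
    return pairs
```

Each returned item is a dictionary with `source`, `target`, and `caption` keys; checking that the image paths exist on disk is left to the caller.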

We have uploaded our datasets to Hugging Face.

5. Results

6. Acknowledgments

Thanks to Yuxuan Zhang and Hailong Guo for providing the code base.
Thanks to Diffusers for the open-source project.
Thanks to AMEERAZAM08 for contributing the Hugging Face Space demo.
Thanks to smthemex for contributing the ComfyUI integration.

Citation

@misc{huang2025photodoodlelearningartisticimage,
      title={PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data}, 
      author={Shijie Huang and Yiren Song and Yuxuan Zhang and Hailong Guo and Xueyin Wang and Mike Zheng Shou and Jiaming Liu},
      year={2025},
      eprint={2502.14397},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2502.14397}, 
}
