B4M: Breaking Low-Rank Adapter for Making Content-Style Customization
Yu Xu1,2, Fan Tang1, Juan Cao1, Yuxin Zhang3, Oliver Deussen4, Weiming Dong3, Jintao Li1, Tong-Yee Lee5
1Institute of Computing Technology, Chinese Academy of Sciences, 2University of Chinese Academy of Sciences, 3Institute of Automation, Chinese Academy of Sciences, 4University of Konstanz, 5National Cheng Kung University
Abstract:
Personalized generation paradigms empower designers to customize visual intellectual properties with the help of textual descriptions by adapting pre-trained text-to-image models on a few images. Recent studies focus on simultaneously customizing content and detailed visual style in images but often struggle to disentangle the two. In this study, we reconsider the customization of content and style concepts from the perspective of parameter space construction. Unlike existing methods that utilize a shared parameter space for content and style learning, we propose a novel framework that separates the parameter space to facilitate individual learning of content and style by introducing "partly learnable projection" (PLP) matrices, which separate the original adapters into divided sub-parameter spaces. A "break-for-make" customization learning pipeline based on PLP is proposed: we first "break" the original adapters into "up projection" and "down projection" matrices for the content and style concepts under an orthogonal prior, and then "make" the entire parameter space by reconstructing the content and style PLP matrices, using Riemannian preconditioning to adaptively balance content and style learning. Experiments on various styles, including textures, materials, and artistic styles, show that our method outperforms state-of-the-art single/multiple concept learning pipelines regarding content-style-prompt alignment.
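The PLP idea can be pictured with a minimal, self-contained sketch (this is not the released implementation, and the class/attribute names are illustrative): each low-rank adapter is split into a content branch and a style branch, and in each branch only one of the two projection matrices is trained while the other is frozen to an orthogonally initialized matrix, so the two concepts occupy separated sub-parameter spaces. Which half is frozen in each branch, and the Riemannian-preconditioned second stage, are simplified away here.

```python
import torch
import torch.nn as nn

class PLPLinear(nn.Module):
    """Illustrative "partly learnable projection" LoRA layer (names are hypothetical).

    The frozen base weight is augmented by two low-rank branches:
      - a content branch with a learnable down projection and a frozen,
        orthogonally initialized up projection;
      - a style branch with a frozen, orthogonally initialized down projection
        and a learnable up projection.
    The assignment of frozen/learnable halves is an assumption for illustration.
    """

    def __init__(self, base: nn.Linear, rank: int = 4, scale: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)
        in_f, out_f = base.in_features, base.out_features
        self.scale = scale

        # Content branch: learnable down projection, frozen orthogonal up projection.
        self.content_down = nn.Parameter(torch.zeros(rank, in_f))
        content_up = torch.empty(out_f, rank)
        nn.init.orthogonal_(content_up)
        self.register_buffer("content_up", content_up)

        # Style branch: frozen orthogonal down projection, learnable up projection.
        style_down = torch.empty(rank, in_f)
        nn.init.orthogonal_(style_down)
        self.register_buffer("style_down", style_down)
        self.style_up = nn.Parameter(torch.zeros(out_f, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta_content = (x @ self.content_down.T) @ self.content_up.T
        delta_style = (x @ self.style_down.T) @ self.style_up.T
        return self.base(x) + self.scale * (delta_content + delta_style)
```

In such a layout, content learning only updates `content_down` and style learning only updates `style_up`, which mirrors the separated sub-parameter spaces described above.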
Our code is built on Hugging Face Diffusers (0.22.0); please follow the sdxl instructions for environment setup.
First clone this repo, and then:

```bash
cd B4M
pip install -e .
```
The training process includes two stages.

- Train the content model

  Run the following script:

  ```bash
  bash code/train_content.sh
  ```

- Train the style model

  Run the following script:

  ```bash
  bash code/train_style.sh
  ```
Note: In both scripts, please replace any dataset paths, output directories, and other file paths with your own.
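For orientation, Diffusers-based LoRA training scripts are usually parameterized along the following lines. The launcher target, flag names, and paths below are hypothetical placeholders and will not match code/train_content.sh exactly, so edit the actual script in place rather than copying this sketch:

```bash
# Hypothetical sketch only; the real script and flag names in code/train_content.sh may differ.
MODEL_NAME="stabilityai/stable-diffusion-xl-base-1.0"   # base SDXL checkpoint
CONTENT_DATA_DIR="/path/to/content/reference/images"    # a few images of the content concept
OUTPUT_DIR="/path/to/output/content_lora"               # where the content LoRA is saved

accelerate launch code/train_content.py \
  --pretrained_model_name_or_path "$MODEL_NAME" \
  --instance_data_dir "$CONTENT_DATA_DIR" \
  --output_dir "$OUTPUT_DIR" \
  --instance_prompt "an image of snq teddybear"
```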
After the first stage is complete (both the content and style models are trained), run the following script to start the second-stage training:

```bash
bash code/train_second_stage.sh
```
As before, make sure to update the paths in the script to fit your environment.
After training is complete, you can run inference using:
```bash
python infer.py
```
Make sure to configure the model path and input settings inside the script as needed.
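If you prefer scripting inference directly with Diffusers rather than editing infer.py, a minimal sketch looks like the following; the base model, checkpoint path, and prompt are placeholders, and the released checkpoints may still require the loading logic in infer.py:

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Load the base SDXL model (assumed base model; adjust to the one used for training).
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Load a trained B4M LoRA checkpoint (placeholder path).
pipe.load_lora_weights("/path/to/lora/checkpoint")

# Prompts follow the content/style token format used in the examples below.
image = pipe("an image of snq teddybear made from paper cutout art style").images[0]
image.save("output.png")
```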
We provide LoRA checkpoints that correspond to the examples presented in the paper. You can download them here:
Download LoRA Checkpoints from Google Drive
The corresponding reference images, prompts, and checkpoints are as follows:
| Content Reference | Style Reference | Prompt | LoRA Checkpoint |
|---|---|---|---|
| teddybear.jpg | paper.jpg | "an image of snq teddybear made from paper cutout art style" | teddybear_paper |
| teddybear.jpg | yarn.jpg | "an image of snq teddybear in w@z yarn art style" | teddybear_yarn |
| dog_1.jpg | sticker.jpg | "a snq dog in w@z sticker style" | dog_1_sticker |
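For example, pairing the last row of the table with the Diffusers sketch above (the local path is a placeholder, and any previously loaded LoRA should be unloaded first):

```python
pipe.unload_lora_weights()  # clear any previously loaded LoRA
pipe.load_lora_weights("/path/to/downloaded/dog_1_sticker")
image = pipe("a snq dog in w@z sticker style").images[0]
image.save("dog_1_sticker.png")
```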
We are continuously uploading additional model checkpoints. Please stay tuned.
If you make use of our work, please cite our paper:
```bibtex
@article{xu2025b4m,
  title={B4M: Breaking Low-Rank Adapter for Making Content-Style Customization},
  author={Xu, Yu and Tang, Fan and Cao, Juan and Zhang, Yuxin and Deussen, Oliver and Dong, Weiming and Li, Jintao and Lee, Tong-Yee},
  journal={ACM Transactions on Graphics},
  volume={44},
  number={2},
  pages={1--17},
  year={2025},
  publisher={ACM New York, NY},
  doi={10.1145/3728461}
}
```