LCM-LoRA: A Universal Stable-Diffusion Acceleration Module

"LCM-LoRA: A Universal Stable-Diffusion Acceleration Module" Arxiv, 2023 Nov paper code pdf Authors: Simian Luo, Yiqin Tan, Suraj Patil, Daniel Gu, Patrick von Platen, Apolinário Passos, Longbo Huang, Jian Li, Hang Zhao

Key-point

  • Task: Diffusion acceleration
  • Existing problem: Diffusion inference is too slow
  • 🏷️ Label:

Like DDIM and other PF-ODE solvers, LCM-LoRA can be viewed as a plug-in neural PF-ODE solver.

Contributions

  1. Proposes a universal training-free acceleration module that can be directly plugged into various fine-tuned Stable-Diffusion (SD) models to accelerate inference.

Related Work

Problem addressed: Diffusion inference is too slow.

Efforts to accelerate LDMs generally fall into two categories: ODE solvers and distillation of LDMs.

  • PF-ODE Solvers

    Methods such as DDIM aim to reduce the number of inference steps, but the computation per step remains heavy, especially when classifier-free guidance is used (see the CFG sketch after this list, which shows the doubled UNet evaluation).

  • Distillation methods

    Guided-Distill is too computationally expensive to train. A method that balances speed and generation quality is still needed.
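
To make the classifier-free guidance cost concrete, below is a schematic sketch of one denoising step with CFG written against a diffusers-style UNet; the helper name `cfg_denoise_step` and the argument values are illustrative, not the paper's code. The point is that CFG requires two UNet evaluations per step.

```python
def cfg_denoise_step(unet, latents, t, cond_emb, uncond_emb, guidance_scale=7.5):
    """One classifier-free-guidance step: two UNet forward passes per step."""
    noise_uncond = unet(latents, t, encoder_hidden_states=uncond_emb).sample
    noise_cond = unet(latents, t, encoder_hidden_states=cond_emb).sample
    # Extrapolate from the unconditional prediction toward the conditional one.
    return noise_uncond + guidance_scale * (noise_cond - noise_uncond)
```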

Consistency Models

TODO

Latent Consistency Models (LCMs)

"Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference" paper website

LCMs treat reverse denoising as an augmented probability flow ODE (PF-ODE) problem and predict the solution directly in latent space, bypassing the need for iterative numerical ODE solvers. This results in remarkably efficient synthesis of high-resolution images, taking only 1 to 4 inference steps. Inference is highly efficient, and distillation takes only about 32 A100 GPU hours. Motivation: fine-tune an LDM so that it can generate in a few steps even on a small dataset.

LCF (Latent Consistency Fine-tuning): fine-tune pre-trained LCMs without starting from the teacher diffusion model; drawback: it still requires fine-tuning on each target dataset.

LCM-LoRA is a universal training-free acceleration module that can be directly plugged into various fine-tuned Stable-Diffusion (SD) models; by analogy with DDIM, it can be viewed as a neural-network-based PF-ODE solver. A usage sketch follows.
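
A minimal usage sketch with the diffusers library, assuming a recent diffusers release that ships `LCMScheduler` and the publicly released `latent-consistency/lcm-lora-sdv1-5` weights; the prompt, step count, and guidance value are illustrative:

```python
import torch
from diffusers import StableDiffusionPipeline, LCMScheduler

# Load a standard SD v1.5 pipeline (any fine-tuned SD checkpoint should work the same way).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Swap in the LCM scheduler and plug in the LCM-LoRA acceleration module.
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

# Few-step sampling; guidance is kept low because LCMs need little or no CFG.
image = pipe(
    "a photo of an astronaut riding a horse",
    num_inference_steps=4,
    guidance_scale=1.0,
).images[0]
image.save("lcm_lora_sample.png")
```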

Parameter-Efficient Fine-Tuning

"LoRA: Low-Rank Adaptation of Large Language Models" paper code blog 苏剑林老师 LoRA 博客

Fine-tuning that updates all parameters is called full fine-tuning.

LoRA represents the part of the model parameters to be fine-tuned with a low-rank matrix. It is a low-resource method for fine-tuning large models: the trainable parameters are roughly one ten-thousandth of the total, GPU memory usage drops by about two thirds, and no extra inference latency is introduced.

Methods

  • LCD (Latent Consistency Distillation, from the LCM paper) can be understood as a fine-tuning process applied to Stable Diffusion.


LoRA Distillation for LCM

Decompose the weight update $\Delta W$ into a low-rank factorization $BA$, which reduces both the number of parameters that must be updated and the GPU memory required.

For SD-v1.5, the number of parameters to fine-tune drops from 980M to 67.5M.

$$ h=W_0x+\Delta Wx=W_0x+BAx $$
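
A minimal sketch of a LoRA-augmented linear layer implementing $h = W_0x + BAx$; the class name, rank, and scaling below are illustrative assumptions (not the paper's training code), and only $A$ and $B$ are trained while $W_0$ stays frozen:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base weight W0 plus a trainable low-rank update BA."""

    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze W0 (and bias)
            p.requires_grad = False
        in_f, out_f = base.in_features, base.out_features
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)  # rank x d_in
        self.B = nn.Parameter(torch.zeros(out_f, rank))        # d_out x rank, zero-init so ΔW = 0 at start
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # h = W0 x + (BA) x
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```

Training only $A$ and $B$ is what brings the trainable parameter count of SD-v1.5 down from 980M to 67.5M.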

Universal Acceleration Module

employing LoRA for the distillation process

After fine-tuning on a specific dataset, synthesis of images in that target style is greatly accelerated.


Result: faster and better quality than applying a style LoRA directly.

R1C5 and R2C2 look very similar. Can denoising be understood as a path in latent space, with LCM-LoRA jumping directly to the key points along that path?

Fusion from a vector viewpoint: a weighted combination of the acceleration vector and the fully fine-tuned style vector yields a parameter vector that trades off quality against speed; see the sketch below.
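
A minimal sketch of this weighted fusion; the helper name `merge_lora_deltas`, the dict layout, and the λ values are illustrative assumptions (in practice, libraries such as diffusers/PEFT ship their own LoRA-merging utilities):

```python
def merge_lora_deltas(accel_lora: dict, style_lora: dict,
                      lambda_accel: float = 1.0, lambda_style: float = 0.8) -> dict:
    """Weighted sum of two per-layer LoRA delta-weight dicts sharing the same keys.

    accel_lora: LCM-LoRA "acceleration vector" deltas (one ΔW tensor per adapted layer).
    style_lora: deltas from a style LoRA fine-tuned on a specific dataset.
    """
    merged = {}
    for name, accel_delta in accel_lora.items():
        merged[name] = lambda_accel * accel_delta + lambda_style * style_lora[name]
    return merged

# The merged deltas are then added onto the frozen pre-trained weights:
#   W = W0 + merged[name]   for each LoRA-adapted layer `name`.
```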

image-20231114025246199

Experiment

Ablation study: check which module is effective and summarize.

Limitations

Summary 🌟

learn what & how to apply to our task