upload binarydm-paddle #1905

Open · wants to merge 1 commit into base: develop
31 changes: 31 additions & 0 deletions example/BinaryDM/README.md
@@ -0,0 +1,31 @@
# BinaryDM in PaddlePaddle

## 1. Introduction

This example demonstrates a training method for diffusion models with binarized weights. A learnable multi-basis binarizer and low-rank representation mimicking strengthen the representational capacity of the binarized diffusion model and improve its optimization, making it practical to deploy diffusion models in extremely resource-constrained scenarios.

For technical details, see the paper [BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models](https://arxiv.org/pdf/2404.05662v4).

![binarydm](./imgs/binarydm.png)
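
The paragraph above summarizes the core technique: each full-precision weight tensor is approximated by a small number of binary bases with learnable scales. The sketch below illustrates a two-basis binarizer with a straight-through estimator in PaddlePaddle; the class name, initial scale values, and STE wiring are illustrative assumptions, not code taken from this PR.

```python
# Minimal sketch of a learnable two-basis weight binarizer in the spirit of
# BinaryDM's multi-basis binarization. Names and initializations are assumptions.
import paddle


def ste_sign(x):
    # Forward: sign(x); backward: identity gradient (straight-through estimator).
    return x + (paddle.sign(x) - x).detach()


class MultiBasisBinarizer(paddle.nn.Layer):
    def __init__(self):
        super().__init__()
        # Learnable scaling factors for the two binary bases.
        self.alpha1 = self.create_parameter(
            shape=[1], default_initializer=paddle.nn.initializer.Constant(1.0))
        self.alpha2 = self.create_parameter(
            shape=[1], default_initializer=paddle.nn.initializer.Constant(0.1))

    def forward(self, weight):
        # First basis approximates the weight, second basis approximates the residual:
        # W ~= alpha1 * sign(W) + alpha2 * sign(W - alpha1 * sign(W))
        basis1 = self.alpha1 * ste_sign(weight)
        basis2 = self.alpha2 * ste_sign(weight - basis1)
        return basis1 + basis2


if __name__ == "__main__":
    binarizer = MultiBasisBinarizer()
    w = paddle.randn([64, 64, 3, 3])
    print(binarizer(w).shape)  # same shape as w; values take only four levels
```

During training, the binarized weights replace the full-precision ones in the forward pass, while gradients flow through the straight-through estimator to update the latent full-precision weights and the scales.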

## 2. Training

### 2.1 Environment Setup

- paddlepaddle >= 2.0.1 (or paddlepaddle-gpu >= 2.0.1 for GPU training)
- visualdl
- lmdb
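
A quick way to confirm the PaddlePaddle installation before training (uses the public `paddle.utils.run_check` helper):

```python
import paddle

paddle.utils.run_check()      # reports whether PaddlePaddle (and the GPU) is set up correctly
print(paddle.__version__)     # should be >= 2.0.1 for this example
```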

### 2.2 Launch Training

```bash
python main_binarydm.py --config {DATASET}.yml --exp {PROJECT_PATH} --doc {MODEL_NAME} --ni
```
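
For example, CIFAR-10 training with the config added in this PR would be launched as `python main_binarydm.py --config cifar10.yml --exp ./exp --doc binarydm_cifar10 --ni`, where `./exp` and `binarydm_cifar10` are placeholder values. Following the ddim code structure acknowledged below, the script presumably converts the YAML config into nested attributes; a minimal sketch of that pattern (the helper name `dict2namespace` is an assumption):

```python
# Sketch of loading a YAML config such as configs/cifar10.yml into nested
# attributes (config.model.ch, config.optim.lr, ...). The actual loader in
# main_binarydm.py may differ.
import argparse

import yaml


def dict2namespace(d):
    ns = argparse.Namespace()
    for key, value in d.items():
        setattr(ns, key, dict2namespace(value) if isinstance(value, dict) else value)
    return ns


with open("configs/cifar10.yml", "r") as f:
    config = dict2namespace(yaml.safe_load(f))

print(config.model.ch_mult, config.optim.lr)  # [1, 2, 2, 2] 0.0002
```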

## Acknowledgements

This implementation builds on the following open-source repositories:

- [https://github.com/Xingyu-Zheng/BinaryDM](https://github.com/Xingyu-Zheng/BinaryDM) (official implementation of BinaryDM).
- [https://openi.pcl.ac.cn/iMon/ddim-paddle](https://openi.pcl.ac.cn/iMon/ddim-paddle) (PaddlePaddle implementation of DDIM).
- [https://github.com/ermongroup/ddim](https://github.com/ermongroup/ddim) (code structure).
50 changes: 50 additions & 0 deletions example/BinaryDM/configs/bedroom.yml
@@ -0,0 +1,50 @@
data:
    dataset: "LSUN"
    category: "bedroom"
    image_size: 256
    channels: 3
    logit_transform: false
    uniform_dequantization: false
    gaussian_dequantization: false
    random_flip: true
    rescaled: true
    num_workers: 32

model:
    type: "simple"
    in_channels: 3
    out_ch: 3
    ch: 128
    ch_mult: [1, 1, 2, 2, 4, 4]
    num_res_blocks: 2
    attn_resolutions: [16, ]
    dropout: 0.0
    var_type: fixedsmall
    ema_rate: 0.999
    ema: True
    resamp_with_conv: True

diffusion:
    beta_schedule: linear
    beta_start: 0.0001
    beta_end: 0.02
    num_diffusion_timesteps: 1000

training:
    batch_size: 64
    n_epochs: 10000
    n_iters: 5000000
    snapshot_freq: 5000
    validation_freq: 2000

sampling:
    batch_size: 32
    last_only: True

optim:
    weight_decay: 0.000
    optimizer: "Adam"
    lr: 0.00002
    beta1: 0.9
    amsgrad: false
    eps: 0.00000001
50 changes: 50 additions & 0 deletions example/BinaryDM/configs/celeba.yml
@@ -0,0 +1,50 @@
data:
    dataset: "CELEBA"
    image_size: 64
    channels: 3
    logit_transform: false
    uniform_dequantization: false
    gaussian_dequantization: false
    random_flip: true
    rescaled: true
    num_workers: 4

model:
    type: "simple"
    in_channels: 3
    out_ch: 3
    ch: 128
    ch_mult: [1, 2, 2, 2, 4]
    num_res_blocks: 2
    attn_resolutions: [16, ]
    dropout: 0.1
    var_type: fixedlarge
    ema_rate: 0.9999
    ema: True
    resamp_with_conv: True

diffusion:
    beta_schedule: linear
    beta_start: 0.0001
    beta_end: 0.02
    num_diffusion_timesteps: 1000

training:
    batch_size: 128
    n_epochs: 10000
    n_iters: 5000000
    snapshot_freq: 5000
    validation_freq: 20000

sampling:
    batch_size: 32
    last_only: True

optim:
    weight_decay: 0.000
    optimizer: "Adam"
    lr: 0.0002
    beta1: 0.9
    amsgrad: false
    eps: 0.00000001
    grad_clip: 1.0
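
These configs enable an exponential moving average of the model weights (`ema: True` with `ema_rate` 0.999 or 0.9999). The sketch below shows how such an EMA is commonly maintained alongside training; the helper is illustrative and not the code added by this PR.

```python
# Illustrative EMA over model parameters at a fixed decay rate (e.g. the
# ema_rate values in these configs).
import paddle


class EMAHelper:
    def __init__(self, model, decay=0.9999):
        self.decay = decay
        # Keep a detached copy of every trainable parameter.
        self.shadow = {name: p.detach().clone()
                       for name, p in model.named_parameters() if not p.stop_gradient}

    @paddle.no_grad()
    def update(self, model):
        # Called after each optimizer step.
        for name, p in model.named_parameters():
            if name in self.shadow:
                self.shadow[name] = self.decay * self.shadow[name] + (1.0 - self.decay) * p

    @paddle.no_grad()
    def copy_to(self, model):
        # Load the averaged weights into (a copy of) the model before sampling.
        for name, p in model.named_parameters():
            if name in self.shadow:
                p.set_value(self.shadow[name])
```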
50 changes: 50 additions & 0 deletions example/BinaryDM/configs/church.yml
@@ -0,0 +1,50 @@
data:
    dataset: "LSUN"
    category: "church_outdoor"
    image_size: 256
    channels: 3
    logit_transform: false
    uniform_dequantization: false
    gaussian_dequantization: false
    random_flip: true
    rescaled: true
    num_workers: 32

model:
    type: "simple"
    in_channels: 3
    out_ch: 3
    ch: 128
    ch_mult: [1, 1, 2, 2, 4, 4]
    num_res_blocks: 2
    attn_resolutions: [16, ]
    dropout: 0.0
    var_type: fixedsmall
    ema_rate: 0.999
    ema: True
    resamp_with_conv: True

diffusion:
    beta_schedule: linear
    beta_start: 0.0001
    beta_end: 0.02
    num_diffusion_timesteps: 1000

training:
    batch_size: 64
    n_epochs: 10000
    n_iters: 5000000
    snapshot_freq: 5000
    validation_freq: 2000

sampling:
    batch_size: 32
    last_only: True

optim:
    weight_decay: 0.000
    optimizer: "Adam"
    lr: 0.00002
    beta1: 0.9
    amsgrad: false
    eps: 0.00000001
50 changes: 50 additions & 0 deletions example/BinaryDM/configs/cifar10.yml
@@ -0,0 +1,50 @@
data:
    dataset: "CIFAR10"
    image_size: 32
    channels: 3
    logit_transform: false
    uniform_dequantization: false
    gaussian_dequantization: false
    random_flip: true
    rescaled: true
    num_workers: 4

model:
    type: "simple"
    in_channels: 3
    out_ch: 3
    ch: 128
    ch_mult: [1, 2, 2, 2]
    num_res_blocks: 2
    attn_resolutions: [16, ]
    dropout: 0.1
    var_type: fixedlarge
    ema_rate: 0.9999
    ema: True
    resamp_with_conv: True

diffusion:
    beta_schedule: linear
    beta_start: 0.0001
    beta_end: 0.02
    num_diffusion_timesteps: 1000

training:
    batch_size: 128
    n_epochs: 10000
    n_iters: 5000000
    snapshot_freq: 5000
    validation_freq: 2000

sampling:
    batch_size: 64
    last_only: True

optim:
    weight_decay: 0.000
    optimizer: "Adam"
    lr: 0.0002
    beta1: 0.9
    amsgrad: false
    eps: 0.00000001
    grad_clip: 1.0
51 changes: 51 additions & 0 deletions example/BinaryDM/configs/cifar10_improved.yml
@@ -0,0 +1,51 @@
data:
    dataset: "CIFAR10"
    image_size: 32
    channels: 3
    logit_transform: false
    uniform_dequantization: false
    gaussian_dequantization: false
    random_flip: true
    rescaled: true
    num_workers: 4

model:
    type: "simple"
    in_channels: 3
    out_ch: 3
    ch: 128
    ch_mult: [1, 2, 2, 2]
    num_res_blocks: 2
    attn_resolutions: [16, ]
    dropout: 0.1
    var_type: fixedlarge
    ema_rate: 0.9999
    ema: True
    resamp_with_conv: True
    use_scale_shift_norm: True

diffusion:
    beta_schedule: cosine
    beta_start: null
    beta_end: null
    num_diffusion_timesteps: 1000

training:
    batch_size: 128
    n_epochs: 10000
    n_iters: 5000000
    snapshot_freq: 5000
    validation_freq: 2000

sampling:
    batch_size: 64
    last_only: True

optim:
    weight_decay: 0.000
    optimizer: "Adam"
    lr: 0.0002
    beta1: 0.9
    amsgrad: false
    eps: 0.00000001
    grad_clip: 1.0
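
The `diffusion` section fully determines the noise schedule: the other configs use a linear schedule from `beta_start` to `beta_end` over `num_diffusion_timesteps`, while this improved config selects `beta_schedule: cosine` and leaves the endpoints as `null`. A minimal sketch of both schedules, assuming the cosine option refers to the improved-DDPM formulation (the actual code in `main_binarydm.py` may differ):

```python
# Illustrative beta schedules matching the diffusion sections of these configs.
import math

import numpy as np


def get_beta_schedule(schedule, num_timesteps, beta_start=None, beta_end=None):
    if schedule == "linear":
        # e.g. beta_start=0.0001, beta_end=0.02, num_timesteps=1000
        return np.linspace(beta_start, beta_end, num_timesteps, dtype=np.float64)
    if schedule == "cosine":
        s = 0.008  # small offset so the first betas are not vanishingly small
        steps = np.arange(num_timesteps + 1, dtype=np.float64)
        alphas_bar = np.cos((steps / num_timesteps + s) / (1 + s) * math.pi / 2) ** 2
        betas = 1.0 - alphas_bar[1:] / alphas_bar[:-1]
        return np.clip(betas, 0.0, 0.999)
    raise NotImplementedError(schedule)


linear = get_beta_schedule("linear", 1000, 0.0001, 0.02)
cosine = get_beta_schedule("cosine", 1000)
print(linear[0], linear[-1], cosine[0], cosine[-1])
```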