Low-precision quantization training (LPT, QAT) #1903

Open
wants to merge 1 commit into base: develop
112 changes: 112 additions & 0 deletions example/low_precision_training/README.md
@@ -0,0 +1,112 @@
# Low-precision Training & Quantization-aware Training With PaddlePaddle

## Introduction
**Quantization-aware Training (QAT):** network quantization for inference acceleration; the weights and activations are quantized to low-bit integers so that the forward pass can be executed as integer matrix multiplications.

**Low-precision Training (LPT):** network quantization for training acceleration; the weights, activations, and backpropagated errors are all quantized to low-bit integers so that both the forward and backward passes can be executed as integer matrix multiplications.
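
Conceptually, both settings rely on mapping floating-point tensors onto a low-bit integer grid and scaling back ("fake quantization"). The snippet below is a minimal NumPy sketch written for this README, not code from this example: it shows symmetric 4-bit quantization with a per-tensor min-max observer and optional stochastic rounding, applied to weights and activations (QAT) and additionally to the backpropagated error (low-precision training).

```py
import numpy as np

def fake_quant(x, bits=4, stochastic=False):
    """Symmetric fake quantization with a per-tensor min-max observer."""
    qmax = 2 ** (bits - 1) - 1                # 7 for int4
    scale = np.abs(x).max() / qmax + 1e-12    # min-max observer
    q = x / scale
    if stochastic:
        # stochastic rounding: round up with probability equal to the fraction
        q = np.floor(q + np.random.rand(*q.shape))
    else:
        q = np.round(q)                       # deterministic rounding
    q = np.clip(q, -qmax - 1, qmax)           # int4 range [-8, 7]
    return q * scale                          # dequantize back to float

# QAT: only the forward pass is simulated with quantized weights/activations.
w, a = np.random.randn(64, 128), np.random.randn(128, 32)
y = fake_quant(w) @ fake_quant(a)

# Low-precision training: the backpropagated error is quantized as well,
# so the backward matmuls can also run in low-bit integer arithmetic.
grad_y = np.random.randn(64, 32)
grad_w = fake_quant(grad_y, stochastic=True) @ fake_quant(a).T
```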


## Environment Configuration
- Python 3.10
- paddlepaddle-gpu 2.6
```sh
cd classification/
pip install -r requirements.txt
cd quantization
python setup.py build
# install the custom cuDNN convolution operator built above
python install_paddle_so.py build/cudnn_convolution_custom/lib.linux-x86_64-cpython-310/cudnn_convolution_custom.so
```
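
As an optional sanity check (not part of the repository's scripts), you can verify that the GPU build of PaddlePaddle is active before building the custom operator:
```py
import paddle

print(paddle.__version__)                     # expect a 2.6.x build
print(paddle.device.is_compiled_with_cuda())  # should print True for a GPU build
```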

## Train
Low-precision Training:
```sh
export CUDA_VISIBLE_DEVICES=4,5,6,7
python -m paddle.distributed.launch --gpus="4,5,6,7" train_qt.py -c ../ppcls/configs/ImageNet/ResNet/ResNet18_custom.yaml
```

The quantization configuration is selected through `config.qconfigdir`, for example:
```py
config.qconfigdir = '../qt_config/qt_W4_mse_det__A4_mse_det__G4_minmax_sto.yaml'
```
Quantization-Aware Training:
```sh
export CUDA_VISIBLE_DEVICES=4,5,6,7
python -m paddle.distributed.launch --gpus="4,5,6,7" train_qat.py -c ../ppcls/configs/ImageNet/ResNet/ResNet18_custom.yaml
```

Again, select the quantization configuration through `config.qconfigdir`, for example:
```py
config.qconfigdir = '../qat_config/qat_w4_mse_a4_mse.yaml'
```
## Evaluate
```sh
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --gpus="0,1,2,3" eval_qt.py -c ../ppcls/configs/ImageNet/ResNet/ResNet18_custom.yaml
```

## Results
Dataset: ImageNet ILSVRC-2012 \
Model: ResNet18
| config | top-1 (%) | top-5 (%) |
| ---- | ---- | ---- |
| original training | 70.852 | 89.700 |
| low-precision training (evaluated with W4A4) | 66.498 | 86.858 |
| low-precision training (evaluated with W4) | 67.200 | 87.196 |
| low-precision training (evaluated in FP) | 66.330 | 86.790 |

## Directory Structure
```
/classification
├── README.md            # project documentation
├── requirements.txt     # project dependencies
├── tool/                # source code directory
│   ├── train_py.py      # entry point for quantized training
│   ├── train.py         # entry point for full-precision training
│   ├── eval_qt.py       # entry point for evaluation
├── ppcls/               # classification network implementation
├── qt_config/           # low-precision training configs (edit quantization settings here)
├── qat_config/          # quantization-aware training configs
└── quantization/        # quantized-training implementation module
```

## Quantization Parameter Details
Take `qt_W4_minmax_det__A4_mse_det__G4_minmax_sto.yaml` as an example:

`quantization_manager`: `'qt_quantization_manager'` selects Low-precision Training; `'qat_quantization_manager'` selects Quantization-Aware Training.

#### weight_forward_config (forward-pass weight quantization)
`qtype`: "int4"
Quantization type. The weights are quantized to 4-bit integers (int4).

`granularity`: "N"
Quantization granularity. "N" means quantization is applied per layer of the network.

`scaling`: "dynamic"
Scaling method. "dynamic" means the scaling factor is recomputed from the data during quantization.

`observer`: "minmax"
Observer (quantization loss) type. "minmax" means the quantization parameters are derived from the minimum and maximum values.

`stochastic`: False
Whether stochastic quantization (random rounding) is used. False disables it.

`filters`:
Specifies which layers or parts of the network use a different quantization setting.

`id: 0, qtype: "full"`: the layer with id 0 (typically the first layer) uses the "full" qtype.
`name: "fc", qtype: "full"`: the layer named `fc` (typically the fully connected layer) uses the "full" qtype.

#### weight_backward_config (backward-pass weight quantization)
`granularity`: "C"
Quantization granularity. "C" means per-channel quantization.

#### input_config (input quantization)
`observer`: "mse"
Observer (quantization loss) type. "mse" means the quantization parameters are optimized by minimizing the mean squared error (MSE), as sketched below.
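
For reference, a common way an MSE observer picks the scale is to search a small set of candidate clipping ranges and keep the one with the lowest reconstruction error. The sketch below is an illustrative NumPy assumption about how such an observer typically works, not the exact implementation used under `quantization/`:
```py
import numpy as np

def mse_scale(x, bits=4, num_candidates=100):
    """Pick a per-tensor scale by minimizing the quantization MSE (illustrative)."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.abs(x).max() + 1e-12
    best_scale, best_err = max_abs / qmax, np.inf
    for ratio in np.linspace(0.01, 1.0, num_candidates):
        scale = ratio * max_abs / qmax
        x_q = np.clip(np.round(x / scale), -qmax - 1, qmax) * scale
        err = np.mean((x - x_q) ** 2)          # reconstruction error
        if err < best_err:
            best_scale, best_err = scale, err
    return best_scale
```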

#### grad_input_config (input-gradient quantization)
`granularity`: "NHW"
Quantization granularity. "NHW" means quantization is applied over the batch, height, and width dimensions of the input.

#### grad_weight_config (weight-gradient quantization)
`granularity`: "NC"
Quantization granularity. "NC" means quantization is applied over the output and channel dimensions.
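
Putting these keys together, an assembled configuration could look roughly like the following. This is a hypothetical sketch built only from the descriptions above (parsed with PyYAML so the nesting is explicit); the actual schema of the files under `qt_config/` may differ in its details:
```py
import yaml

example_qt_config = yaml.safe_load("""
quantization_manager: qt_quantization_manager

weight_forward_config:
  qtype: int4
  granularity: N
  scaling: dynamic
  observer: minmax
  stochastic: False
  filters:
    - {id: 0, qtype: full}
    - {name: fc, qtype: full}

weight_backward_config:
  granularity: C

input_config:
  observer: mse

grad_input_config:
  granularity: NHW

grad_weight_config:
  granularity: NC
""")

print(example_qt_config["weight_forward_config"]["observer"])  # -> minmax
```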

20 changes: 20 additions & 0 deletions example/low_precision_training/ppcls/__init__.py
@@ -0,0 +1,20 @@
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from . import optimizer

from .arch import *
from .optimizer import *
from .data import *
from .utils import *
177 changes: 177 additions & 0 deletions example/low_precision_training/ppcls/arch/__init__.py
@@ -0,0 +1,177 @@
#copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.

import copy
import importlib
import paddle.nn as nn
from paddle.jit import to_static
from paddle.static import InputSpec

from . import backbone, gears
from .backbone import *
from .gears import build_gear, add_ml_decoder_head
from .utils import *
from .backbone.base.theseus_layer import TheseusLayer
from ..utils import logger
from ..utils.save_load import load_dygraph_pretrain
from .slim import prune_model, quantize_model
from .distill.afd_attention import LinearTransformStudent, LinearTransformTeacher

__all__ = ["build_model", "RecModel", "DistillationModel", "AttentionModel"]


def build_model(config, mode="train"):
    arch_config = copy.deepcopy(config["Arch"])
    model_type = arch_config.pop("name")
    use_sync_bn = arch_config.pop("use_sync_bn", False)
    use_ml_decoder = arch_config.pop("use_ml_decoder", False)
    mod = importlib.import_module(__name__)
    arch = getattr(mod, model_type)(**arch_config)
    if use_sync_bn:
        if config["Global"]["device"] == "gpu":
            arch = nn.SyncBatchNorm.convert_sync_batchnorm(arch)
        else:
            msg = "SyncBatchNorm can only be used on GPU device. The related setting has been ignored."
            logger.warning(msg)

    if use_ml_decoder:
        add_ml_decoder_head(arch, config.get("MLDecoder", {}))

    if isinstance(arch, TheseusLayer):
        prune_model(config, arch)
        quantize_model(config, arch, mode)

    return arch


def apply_to_static(config, model, is_rec):
    support_to_static = config['Global'].get('to_static', False)

    if support_to_static:
        specs = None
        if 'image_shape' in config['Global']:
            specs = [InputSpec([None] + config['Global']['image_shape'])]
            specs[0].stop_gradient = True
        if is_rec:
            specs.append(InputSpec([None, 1], 'int64', stop_gradient=True))
        model = to_static(model, input_spec=specs)
        logger.info("Successfully applied @to_static with specs: {}".format(
            specs))
    return model


class RecModel(TheseusLayer):
    def __init__(self, **config):
        super().__init__()
        backbone_config = config["Backbone"]
        backbone_name = backbone_config.pop("name")
        self.backbone = eval(backbone_name)(**backbone_config)
        self.head_feature_from = config.get('head_feature_from', 'neck')

        if "BackboneStopLayer" in config:
            backbone_stop_layer = config["BackboneStopLayer"]["name"]
            self.backbone.stop_after(backbone_stop_layer)

        if "Neck" in config:
            self.neck = build_gear(config["Neck"])
        else:
            self.neck = None

        if "Head" in config:
            self.head = build_gear(config["Head"])
        else:
            self.head = None

    def forward(self, x, label=None):

        out = dict()
        x = self.backbone(x)
        out["backbone"] = x
        if self.neck is not None:
            feat = self.neck(x)
            out["neck"] = feat
        out["features"] = out['neck'] if self.neck else x
        if self.head is not None:
            if self.head_feature_from == 'backbone':
                y = self.head(out['backbone'], label)
            elif self.head_feature_from == 'neck':
                y = self.head(out['features'], label)
            out["logits"] = y
        return out


class DistillationModel(nn.Layer):
    def __init__(self,
                 models=None,
                 pretrained_list=None,
                 freeze_params_list=None,
                 **kargs):
        super().__init__()
        assert isinstance(models, list)
        self.model_list = []
        self.model_name_list = []
        if pretrained_list is not None:
            assert len(pretrained_list) == len(models)

        if freeze_params_list is None:
            freeze_params_list = [False] * len(models)
        assert len(freeze_params_list) == len(models)
        for idx, model_config in enumerate(models):
            assert len(model_config) == 1
            key = list(model_config.keys())[0]
            model_config = model_config[key]
            model_name = model_config.pop("name")
            model = eval(model_name)(**model_config)

            if freeze_params_list[idx]:
                for param in model.parameters():
                    param.trainable = False
            self.model_list.append(self.add_sublayer(key, model))
            self.model_name_list.append(key)

        if pretrained_list is not None:
            for idx, pretrained in enumerate(pretrained_list):
                if pretrained is not None:
                    load_dygraph_pretrain(
                        self.model_name_list[idx], path=pretrained)

    def forward(self, x, label=None):
        result_dict = dict()
        for idx, model_name in enumerate(self.model_name_list):
            if label is None:
                result_dict[model_name] = self.model_list[idx](x)
            else:
                result_dict[model_name] = self.model_list[idx](x, label)
        return result_dict


class AttentionModel(DistillationModel):
    def __init__(self,
                 models=None,
                 pretrained_list=None,
                 freeze_params_list=None,
                 **kargs):
        super().__init__(models, pretrained_list, freeze_params_list, **kargs)

    def forward(self, x, label=None):
        result_dict = dict()
        out = x
        for idx, model_name in enumerate(self.model_name_list):
            if label is None:
                out = self.model_list[idx](out)
                result_dict.update(out)
            else:
                out = self.model_list[idx](out, label)
                result_dict.update(out)
        return result_dict