Low-precision quantization training (LPT, QAT) #1903

Open
wants to merge 1 commit into base: develop
112 changes: 112 additions & 0 deletions example/low_precision_training/README.md
@@ -0,0 +1,112 @@
# Low-precision Training & Quantization-aware Training With PaddlePaddle

## Introduction
**Quantization-aware Training (QAT):** network quantization for inference acceleration; the weights and activations are quantized to low-bit integers so that the forward pass can be executed as integer matrix multiplications.

**Low-precision Training (LPT):** network quantization for training acceleration; the weights, activations, and backpropagated errors are all quantized to low-bit integers so that both the forward and backward passes can be executed as integer matrix multiplications.
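
Conceptually, both settings rely on mapping floating-point tensors onto a low-bit integer grid and scaling back ("fake quantization"). The snippet below is a minimal NumPy sketch written for this README, not code from this example: it shows symmetric 4-bit quantization with a per-tensor min-max observer and optional stochastic rounding, applied to weights and activations (QAT) and additionally to the backpropagated error (low-precision training).

```py
import numpy as np

def fake_quant(x, bits=4, stochastic=False):
    """Symmetric fake quantization with a per-tensor min-max observer."""
    qmax = 2 ** (bits - 1) - 1                # 7 for int4
    scale = np.abs(x).max() / qmax + 1e-12    # min-max observer
    q = x / scale
    if stochastic:
        # stochastic rounding: round up with probability equal to the fraction
        q = np.floor(q + np.random.rand(*q.shape))
    else:
        q = np.round(q)                       # deterministic rounding
    q = np.clip(q, -qmax - 1, qmax)           # int4 range [-8, 7]
    return q * scale                          # dequantize back to float

# QAT: only the forward pass is simulated with quantized weights/activations.
w, a = np.random.randn(64, 128), np.random.randn(128, 32)
y = fake_quant(w) @ fake_quant(a)

# Low-precision training: the backpropagated error is quantized as well,
# so the backward matmuls can also run in low-bit integer arithmetic.
grad_y = np.random.randn(64, 32)
grad_w = fake_quant(grad_y, stochastic=True) @ fake_quant(a).T
```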


## Environment Configuration
- Python 3.10
- paddlepaddle-gpu 2.6
```sh
cd classification/
pip install -r requirements.txt
cd quantization
python setup.py build
# install the custom cuDNN convolution operator built above
python install_paddle_so.py build/cudnn_convolution_custom/lib.linux-x86_64-cpython-310/cudnn_convolution_custom.so
```
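
As an optional sanity check (not part of the repository's scripts), you can verify that the GPU build of PaddlePaddle is active before building the custom operator:
```py
import paddle

print(paddle.__version__)                     # expect a 2.6.x build
print(paddle.device.is_compiled_with_cuda())  # should print True for a GPU build
```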

## Train
Low-precision Training:
```sh
export CUDA_VISIBLE_DEVICES=4,5,6,7
python -m paddle.distributed.launch --gpus="4,5,6,7" train_qt.py -c ../ppcls/configs/ImageNet/ResNet/ResNet18_custom.yaml
```

The quantization configuration is selected through `config.qconfigdir`, for example:
```py
config.qconfigdir = '../qt_config/qt_W4_mse_det__A4_mse_det__G4_minmax_sto.yaml'
```
Quantization-Aware Training:
```sh
export CUDA_VISIBLE_DEVICES=4,5,6,7
python -m paddle.distributed.launch --gpus="4,5,6,7" train_qat.py -c ../ppcls/configs/ImageNet/ResNet/ResNet18_custom.yaml
```

Again, select the quantization configuration through `config.qconfigdir`, for example:
```py
config.qconfigdir = '../qat_config/qat_w4_mse_a4_mse.yaml'
```
## Evaluate
```sh
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --gpus="0,1,2,3" eval_qt.py -c ../ppcls/configs/ImageNet/ResNet/ResNet18_custom.yaml
```

## Results
Dataset: ImageNet ILSVRC-2012 \
Model: ResNet18
| config | top-1 (%) | top-5 (%) |
| ---- | ---- | ---- |
| original training | 70.852 | 89.700 |
| low-precision training (evaluated with W4A4) | 66.498 | 86.858 |
| low-precision training (evaluated with W4) | 67.200 | 87.196 |
| low-precision training (evaluated in FP) | 66.330 | 86.790 |

## Directory Structure
```
/classification
├── README.md            # project documentation
├── requirements.txt     # project dependencies
├── tool/                # source code directory
│   ├── train_py.py      # entry point for quantized training
│   ├── train.py         # entry point for full-precision training
│   ├── eval_qt.py       # entry point for evaluation
├── ppcls/               # classification network implementation
├── qt_config/           # low-precision training configs (edit quantization settings here)
├── qat_config/          # quantization-aware training configs
└── quantization/        # quantized-training implementation module
```

## Quantization Parameter Details
Take `qt_W4_minmax_det__A4_mse_det__G4_minmax_sto.yaml` as an example:

`quantization_manager`: `'qt_quantization_manager'` selects Low-precision Training; `'qat_quantization_manager'` selects Quantization-Aware Training.

#### weight_forward_config (forward-pass weight quantization)
`qtype`: "int4"
Quantization type. The weights are quantized to 4-bit integers (int4).

`granularity`: "N"
Quantization granularity. "N" means quantization is applied per layer of the network.

`scaling`: "dynamic"
Scaling method. "dynamic" means the scaling factor is recomputed from the data during quantization.

`observer`: "minmax"
Observer (quantization loss) type. "minmax" means the quantization parameters are derived from the minimum and maximum values.

`stochastic`: False
Whether stochastic quantization (random rounding) is used. False disables it.

`filters`:
Specifies which layers or parts of the network use a different quantization setting.

`id: 0, qtype: "full"`: the layer with id 0 (typically the first layer) uses the "full" qtype.
`name: "fc", qtype: "full"`: the layer named `fc` (typically the fully connected layer) uses the "full" qtype.

#### weight_backward_config (backward-pass weight quantization)
`granularity`: "C"
Quantization granularity. "C" means per-channel quantization.

#### input_config (input quantization)
`observer`: "mse"
Observer (quantization loss) type. "mse" means the quantization parameters are optimized by minimizing the mean squared error (MSE), as sketched below.
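
For reference, a common way an MSE observer picks the scale is to search a small set of candidate clipping ranges and keep the one with the lowest reconstruction error. The sketch below is an illustrative NumPy assumption about how such an observer typically works, not the exact implementation used under `quantization/`:
```py
import numpy as np

def mse_scale(x, bits=4, num_candidates=100):
    """Pick a per-tensor scale by minimizing the quantization MSE (illustrative)."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.abs(x).max() + 1e-12
    best_scale, best_err = max_abs / qmax, np.inf
    for ratio in np.linspace(0.01, 1.0, num_candidates):
        scale = ratio * max_abs / qmax
        x_q = np.clip(np.round(x / scale), -qmax - 1, qmax) * scale
        err = np.mean((x - x_q) ** 2)          # reconstruction error
        if err < best_err:
            best_scale, best_err = scale, err
    return best_scale
```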

#### grad_input_config (input-gradient quantization)
`granularity`: "NHW"
Quantization granularity. "NHW" means quantization is applied over the batch, height, and width dimensions of the input.

#### grad_weight_config (weight-gradient quantization)
`granularity`: "NC"
Quantization granularity. "NC" means quantization is applied over the output and channel dimensions.
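
Putting these keys together, an assembled configuration could look roughly like the following. This is a hypothetical sketch built only from the descriptions above (parsed with PyYAML so the nesting is explicit); the actual schema of the files under `qt_config/` may differ in its details:
```py
import yaml

example_qt_config = yaml.safe_load("""
quantization_manager: qt_quantization_manager

weight_forward_config:
  qtype: int4
  granularity: N
  scaling: dynamic
  observer: minmax
  stochastic: False
  filters:
    - {id: 0, qtype: full}
    - {name: fc, qtype: full}

weight_backward_config:
  granularity: C

input_config:
  observer: mse

grad_input_config:
  granularity: NHW

grad_weight_config:
  granularity: NC
""")

print(example_qt_config["weight_forward_config"]["observer"])  # -> minmax
```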

20 changes: 20 additions & 0 deletions example/low_precision_training/ppcls/__init__.py
@@ -0,0 +1,20 @@
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from . import optimizer

from .arch import *
from .optimizer import *
from .data import *
from .utils import *
177 changes: 177 additions & 0 deletions example/low_precision_training/ppcls/arch/__init__.py
@@ -0,0 +1,177 @@
#copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve.
#
#Licensed under the Apache License, Version 2.0 (the "License");
#you may not use this file except in compliance with the License.
#You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
#Unless required by applicable law or agreed to in writing, software
#distributed under the License is distributed on an "AS IS" BASIS,
#WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#See the License for the specific language governing permissions and
#limitations under the License.

import copy
import importlib
import paddle.nn as nn
from paddle.jit import to_static
from paddle.static import InputSpec

from . import backbone, gears
from .backbone import *
from .gears import build_gear, add_ml_decoder_head
from .utils import *
from .backbone.base.theseus_layer import TheseusLayer
from ..utils import logger
from ..utils.save_load import load_dygraph_pretrain
from .slim import prune_model, quantize_model
from .distill.afd_attention import LinearTransformStudent, LinearTransformTeacher

__all__ = ["build_model", "RecModel", "DistillationModel", "AttentionModel"]


def build_model(config, mode="train"):
    arch_config = copy.deepcopy(config["Arch"])
    model_type = arch_config.pop("name")
    use_sync_bn = arch_config.pop("use_sync_bn", False)
    use_ml_decoder = arch_config.pop("use_ml_decoder", False)
    mod = importlib.import_module(__name__)
    arch = getattr(mod, model_type)(**arch_config)
    if use_sync_bn:
        if config["Global"]["device"] == "gpu":
            arch = nn.SyncBatchNorm.convert_sync_batchnorm(arch)
        else:
            msg = "SyncBatchNorm can only be used on GPU device. The related setting has been ignored."
            logger.warning(msg)

    if use_ml_decoder:
        add_ml_decoder_head(arch, config.get("MLDecoder", {}))

    if isinstance(arch, TheseusLayer):
        prune_model(config, arch)
        quantize_model(config, arch, mode)

    return arch


def apply_to_static(config, model, is_rec):
    support_to_static = config['Global'].get('to_static', False)

    if support_to_static:
        specs = None
        if 'image_shape' in config['Global']:
            specs = [InputSpec([None] + config['Global']['image_shape'])]
            specs[0].stop_gradient = True
        if is_rec:
            specs.append(InputSpec([None, 1], 'int64', stop_gradient=True))
        model = to_static(model, input_spec=specs)
        logger.info("Successfully applied @to_static with specs: {}".format(
            specs))
    return model


class RecModel(TheseusLayer):
    def __init__(self, **config):
        super().__init__()
        backbone_config = config["Backbone"]
        backbone_name = backbone_config.pop("name")
        self.backbone = eval(backbone_name)(**backbone_config)
        self.head_feature_from = config.get('head_feature_from', 'neck')

        if "BackboneStopLayer" in config:
            backbone_stop_layer = config["BackboneStopLayer"]["name"]
            self.backbone.stop_after(backbone_stop_layer)

        if "Neck" in config:
            self.neck = build_gear(config["Neck"])
        else:
            self.neck = None

        if "Head" in config:
            self.head = build_gear(config["Head"])
        else:
            self.head = None

    def forward(self, x, label=None):

        out = dict()
        x = self.backbone(x)
        out["backbone"] = x
        if self.neck is not None:
            feat = self.neck(x)
            out["neck"] = feat
        out["features"] = out['neck'] if self.neck else x
        if self.head is not None:
            if self.head_feature_from == 'backbone':
                y = self.head(out['backbone'], label)
            elif self.head_feature_from == 'neck':
                y = self.head(out['features'], label)
            out["logits"] = y
        return out


class DistillationModel(nn.Layer):
    def __init__(self,
                 models=None,
                 pretrained_list=None,
                 freeze_params_list=None,
                 **kargs):
        super().__init__()
        assert isinstance(models, list)
        self.model_list = []
        self.model_name_list = []
        if pretrained_list is not None:
            assert len(pretrained_list) == len(models)

        if freeze_params_list is None:
            freeze_params_list = [False] * len(models)
        assert len(freeze_params_list) == len(models)
        for idx, model_config in enumerate(models):
            assert len(model_config) == 1
            key = list(model_config.keys())[0]
            model_config = model_config[key]
            model_name = model_config.pop("name")
            model = eval(model_name)(**model_config)

            if freeze_params_list[idx]:
                for param in model.parameters():
                    param.trainable = False
            self.model_list.append(self.add_sublayer(key, model))
            self.model_name_list.append(key)

        if pretrained_list is not None:
            for idx, pretrained in enumerate(pretrained_list):
                if pretrained is not None:
                    load_dygraph_pretrain(
                        self.model_name_list[idx], path=pretrained)

    def forward(self, x, label=None):
        result_dict = dict()
        for idx, model_name in enumerate(self.model_name_list):
            if label is None:
                result_dict[model_name] = self.model_list[idx](x)
            else:
                result_dict[model_name] = self.model_list[idx](x, label)
        return result_dict


class AttentionModel(DistillationModel):
    def __init__(self,
                 models=None,
                 pretrained_list=None,
                 freeze_params_list=None,
                 **kargs):
        super().__init__(models, pretrained_list, freeze_params_list, **kargs)

    def forward(self, x, label=None):
        result_dict = dict()
        out = x
        for idx, model_name in enumerate(self.model_name_list):
            if label is None:
                out = self.model_list[idx](out)
                result_dict.update(out)
            else:
                out = self.model_list[idx](out, label)
                result_dict.update(out)
        return result_dict