nnieqat-pytorch

This is a quantization-aware training package for the Neural Network Inference Engine (NNIE) on PyTorch. It uses the HiSilicon quantization library to quantize each module's weights and input data into a fake fp32 format. To train a model that is friendlier to NNIE, import nnieqat and replace the default torch.nn modules with the corresponding ones.

Note: import nnieqat before importing torch modules. Multi-GPU training is not supported.

Installation

Supported Platforms: Linux
Accelerators and GPUs: NVIDIA GPUs via CUDA driver 10.1 or 10.2.
Dependencies:
- python >= 3.5, < 4
- llvmlite >= 0.31.0
- pytorch >= 1.5
- numba >= 0.42.0
- numpy >= 1.18.1
Install nnieqat from PyPI:
```
$ pip install nnieqat
```
Install nnieqat in Docker (an easy way to avoid environment problems):
```
$ cd docker
$ docker build -t nnieqat-image .
```

Usage

Add quantization hook

Quantize and dequantize weights and data with the HiSVP GFPQ library during the forward() pass.

from nnieqat import quant_dequant_weight, unquant_weight, merge_freeze_bn, register_quantization_hook
...
...
  register_quantization_hook(model)
...
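
To illustrate what "quantize and dequantize" means for a weight tensor, here is a minimal sketch in plain Python. It assumes a uniform symmetric 8-bit grid, which is only a stand-in: the actual HiSVP GFPQ scheme is not described in this README and may differ. The function name `quant_dequant` and its parameters are hypothetical and are not part of nnieqat.

```python
def quant_dequant(values, num_bits=8):
    """Snap floats to a symmetric integer grid, then map them back to floats.

    Assumption: uniform symmetric quantization, used here for illustration
    only. It is not the GFPQ algorithm used by the HiSVP library.
    """
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)                   # all zeros, nothing to quantize
    scale = max_abs / qmax                    # float distance between grid points
    out = []
    for v in values:
        q = max(-qmax, min(qmax, round(v / scale)))  # clamped integer code
        out.append(q * scale)                 # still a float: "fake" quantized
    return out


weights = [0.5, -1.27, 0.003, 1.0]
fake_quantized = quant_dequant(weights)
```

The result keeps the fp32 type but only takes values on the integer grid, so small entries such as 0.003 collapse to zero. Training against such values is what lets the model adapt to the precision loss it will see at inference time.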

Merge BN weights into conv and freeze BN

Fine-tuning from a well-trained model is suggested; in that case, call merge_freeze_bn at the beginning. Otherwise, call it after a few epochs of training.

from nnieqat import quant_dequant_weight, unquant_weight, merge_freeze_bn, register_quantization_hook
...
...
    model.train()
    model = merge_freeze_bn(model)  #it will change bn to eval() mode during training
...
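
The arithmetic behind merging BN into a preceding conv can be sketched for a single output channel as follows. This is the standard folding identity, written in plain Python; whether merge_freeze_bn uses exactly this form internally is an assumption, and the helper names `fold_bn`, `conv`, and `bn` are hypothetical.

```python
import math


def fold_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BN statistics of one output channel into the conv weight and bias."""
    factor = gamma / math.sqrt(var + eps)
    folded_w = [wi * factor for wi in w]
    folded_b = (b - mean) * factor + beta
    return folded_w, folded_b


def conv(x, w, b):
    """A single-output 'convolution' reduced to a dot product plus bias."""
    return sum(xi * wi for xi, wi in zip(x, w)) + b


def bn(y, gamma, beta, mean, var, eps=1e-5):
    """Batch norm in inference mode, using fixed running statistics."""
    return gamma * (y - mean) / math.sqrt(var + eps) + beta


x = [1.0, -2.0, 0.5]
w, b = [0.2, 0.4, -0.1], 0.3
gamma, beta, mean, var = 1.5, -0.2, 0.1, 4.0

reference = bn(conv(x, w, b), gamma, beta, mean, var)   # conv followed by BN
w_folded, b_folded = fold_bn(w, b, gamma, beta, mean, var)
folded = conv(x, w_folded, b_folded)                    # one conv, no BN
```

Both paths give the same output only while the BN statistics stay fixed, which is why the BN layers are frozen after merging.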

Unquantize weights before updating them

from nnieqat import quant_dequant_weight, unquant_weight, merge_freeze_bn, register_quantization_hook
...
...
    model.apply(unquant_weight)  # using original weight while updating
    optimizer.step()
...
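
Why the original weights are restored before optimizer.step() can be shown with a one-weight toy example. The sketch below is plain Python and does not use nnieqat or torch; `ToyLayer`, `toy_quant_dequant`, and `toy_unquant` are hypothetical stand-ins, and the fixed rounding step of 0.1 is an assumption chosen to make the effect visible.

```python
class ToyLayer:
    def __init__(self, weight):
        self.weight = weight
        self._saved = None      # full-precision copy held while quantized


def toy_quant_dequant(layer, step=0.1):
    """Stash the full-precision weight and replace it with a rounded one."""
    layer._saved = layer.weight
    layer.weight = round(layer.weight / step) * step


def toy_unquant(layer):
    """Put the full-precision weight back, if one was stashed."""
    if layer._saved is not None:
        layer.weight = layer._saved
        layer._saved = None


layer = ToyLayer(weight=0.234)
x, target, lr = 2.0, 1.0, 0.01

toy_quant_dequant(layer)                 # forward pass sees the rounded weight
error = layer.weight * x - target
grad = 2.0 * error * x                   # gradient of (w * x - target) ** 2
toy_unquant(layer)                       # restore the full-precision weight
layer.weight -= lr * grad                # SGD step on the original weight
```

The update here is smaller than half a rounding step, so applying it to the rounded weight and rounding again would discard it. Accumulating updates in the full-precision copy lets many small steps eventually move the quantized value.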

Dump the weight-optimized model

from nnieqat import quant_dequant_weight, unquant_weight, merge_freeze_bn, register_quantization_hook
...
...
    model.apply(quant_dequant_weight)
    save_checkpoint(...)
    model.apply(unquant_weight)
...

Code Examples

CIFAR-10 quantization-aware training example (adds nnieqat to pytorch_cifar10_tutorial)

python test/test_cifar10.py

ImageNet quantization fine-tuning example (adds nnieqat to pytorh_imagenet_main.py)

python test/test_imagenet.py --pretrained path_to_imagenet_dataset

Results

ImageNet

python test/test_imagenet.py /data/imgnet/ --arch squeezenet1_1  --lr 0.001 --pretrained --epoch 10   # nnie_lr_e-3_ft
python pytorh_imagenet_main.py /data/imgnet/ --arch squeezenet1_1  --lr 0.0001 --pretrained --epoch 10  # lr_e-4_ft
python test/test_imagenet.py /data/imgnet/ --arch squeezenet1_1  --lr 0.0001 --pretrained --epoch 10  # nnie_lr_e-4_ft

Fine-tuning results:

| | trt_fp32 | trt_int8 | nnie |
| --- | --- | --- | --- |
| torchvision | 0.56992 | 0.56424 | 0.56026 |
| nnie_lr_e-3_ft | 0.56600 | 0.56328 | 0.56612 |
| lr_e-4_ft | 0.57884 | 0.57502 | 0.57542 |
| nnie_lr_e-4_ft | 0.57834 | 0.57524 | 0.57730 |
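
One way to read these numbers is to compare each model's nnie score with its trt_fp32 score. The short Python sketch below computes that gap from the values reported above. The README does not name the metric; the sketch only assumes that higher is better.

```python
# Scores copied from the results above (metric not named in the README).
results = {
    "torchvision":    {"trt_fp32": 0.56992, "trt_int8": 0.56424, "nnie": 0.56026},
    "nnie_lr_e-3_ft": {"trt_fp32": 0.56600, "trt_int8": 0.56328, "nnie": 0.56612},
    "lr_e-4_ft":      {"trt_fp32": 0.57884, "trt_int8": 0.57502, "nnie": 0.57542},
    "nnie_lr_e-4_ft": {"trt_fp32": 0.57834, "trt_int8": 0.57524, "nnie": 0.57730},
}

# Gap between the nnie score and the trt_fp32 score of the same model.
gaps = {
    name: round(scores["nnie"] - scores["trt_fp32"], 5)
    for name, scores in results.items()
}

for name, gap in gaps.items():
    print(f"{name:16s} nnie - trt_fp32 = {gap:+.5f}")
```

The gap is largest for the unmodified torchvision model and smallest in magnitude for the two runs fine-tuned with nnieqat.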

Todo

Multiple GPU training support.
Generate quantized model directly.

Reference

HiSVP Quantization Library User Guide (HiSVP 量化库使用指南)

Quantizing deep convolutional networks for efficient inference: A whitepaper

8-bit Inference with TensorRT

Distilling the Knowledge in a Neural Network

