Pytorch Quantization

Introduction

This repository implements a set of quantization strategies that can be applied to supported layer types.

The code originally started from the Pytorch and ATen implementation of a fused GRU/LSTM, which was extracted as a CFFI extension and expanded from there.

Requirements

Building currently requires an appropriate CUDA environment, but execution is supported on CPU as well.

  • Nvidia CUDA Toolkit (tested with CUDA 9.0)
  • Pytorch (tested with version 0.3.1)

Installation

  1. Run python build.py
  2. Add the repository to your Python path: export PYTHONPATH=/path/to/pytorch-quantization:$PYTHONPATH

Usage

The following quantization modes are implemented for weights:

  • FP: full-precision, no quantization performed.
  • SIGNED_FIXED_UNIT: fixed-point quantization between [-1,1).

The following quantization modes are implemented for activations:

  • FP: full-precision, no quantization performed.
  • SIGNED_FIXED_UNIT: fixed-point quantization between [-1,1) (sketched after this list).
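
A minimal sketch of the SIGNED_FIXED_UNIT mode, assuming a configurable number of fractional bits; the bit width, rounding mode, and gradient handling are assumptions, as the README does not document the actual kernel:

```python
import torch

def signed_fixed_unit(x: torch.Tensor, bits: int = 8) -> torch.Tensor:
    # Quantize to the signed fixed-point grid k / 2^(bits-1),
    # with k in [-2^(bits-1), 2^(bits-1) - 1], i.e. values in [-1, 1).
    scale = 2.0 ** (bits - 1)
    q = torch.round(x * scale) / scale
    return torch.clamp(q, -1.0, (scale - 1.0) / scale)
```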

The following quantized layers are implemented:

  • QuantizedLinear
  • QuantizedLSTM
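
As an illustration, a forward pass through QuantizedLinear might look like the sketch below. The import path, constructor parameters, and mode-selection keywords are assumptions for illustration only; the README does not document the actual API.

```python
# Hypothetical usage sketch -- the module path and constructor signature
# below are assumptions, not documented in this README.
import torch
from pytorch_quantization import QuantizedLinear  # assumed import path

layer = QuantizedLinear(
    in_features=128,
    out_features=64,
    weight_quantization="SIGNED_FIXED_UNIT",   # assumed parameter name
    activation_quantization="FP",              # assumed parameter name
)
x = torch.randn(32, 128)  # batch of 32 inputs with 128 features
y = layer(x)              # quantization is applied in the forward pass
```

Note that with Pytorch 0.3.1 (the tested version), inputs would be wrapped in torch.autograd.Variable; the sketch uses the plain tensor API for brevity.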
