v2.15.0

@nikita-malininn released this on 06 Feb 10:08

Post-training Quantization:

Features:

  • (TensorFlow) The nncf.quantize() method is now the recommended API for Quantization-Aware Training. Please refer to the example for details on how to use the new approach; a sketch also follows this list.
  • (TensorFlow) The placement of compression layers in a model can now be serialized and restored with the new API functions nncf.tensorflow.get_config() and nncf.tensorflow.load_from_config(). Please see the documentation on saving/loading a quantized model for more details; a sketch follows this list.
  • (OpenVINO) Added an example of LLM quantization to FP8 precision.
  • (TorchFX, Experimental) Introduced preview support for the new quantize_pt2e API, enabling quantization of torch.fx.GraphModule models with the OpenVINOQuantizer and X86InductorQuantizer quantizers. The quantize_pt2e API utilizes the MinMax algorithm's statistic collectors, as well as the SmoothQuant, BiasCorrection, and FastBiasCorrection Post-Training Quantization algorithms; a sketch follows this list.
  • Added unification of scales for the ScaledDotProductAttention operation.
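
As a quick illustration of the new TensorFlow QAT flow, here is a minimal sketch. The model, calibration data, and transform function are placeholders chosen for illustration, not part of this release; substitute your own pipeline.

```python
import nncf
import tensorflow as tf

# Placeholder model and calibration data (assumptions for illustration only).
model = tf.keras.applications.MobileNetV2(weights=None)
calibration_data = tf.data.Dataset.from_tensor_slices(
    tf.random.uniform([32, 224, 224, 3])
).batch(1)

# nncf.Dataset wraps a data source; the transform function maps an item to model input.
calibration_dataset = nncf.Dataset(calibration_data, lambda item: item)

# Insert quantizers with the unified API, then fine-tune as usual (the QAT step).
quantized_model = nncf.quantize(model, calibration_dataset)
quantized_model.compile(optimizer="adam", loss="categorical_crossentropy")
# quantized_model.fit(train_data, epochs=1)  # fine-tune on your training data
```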
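
The save/load round trip below is a sketch building on the previous snippet; persisting the weights via Keras checkpointing is an assumption on our part, not something this release prescribes.

```python
import nncf.tensorflow

# Serialize the placement of compression layers in the quantized model.
config = nncf.tensorflow.get_config(quantized_model)
quantized_model.save_weights("quantized.weights.h5")  # weight storage is up to you

# Later: rebuild the compression layers on a fresh float model, then restore weights.
restored_model = nncf.tensorflow.load_from_config(model, config)
restored_model.load_weights("quantized.weights.h5")
```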
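
For the experimental TorchFX path, the sketch below shows the intended shape of the API. The import path for quantize_pt2e and OpenVINOQuantizer is an assumption based on the experimental namespace (check the NNCF documentation for the exact location), and the model and calibration data are placeholders.

```python
import torch
import torchvision.models as models
import nncf
# Assumed import path (experimental); verify against the NNCF documentation.
from nncf.experimental.torch.fx import OpenVINOQuantizer, quantize_pt2e

model = models.resnet18().eval()  # placeholder model
example_inputs = (torch.randn(1, 3, 224, 224),)

# Capture the model as a torch.fx.GraphModule via the PyTorch 2 export path.
fx_model = torch.export.export(model, example_inputs).module()

# Placeholder calibration items; each is passed to the model as-is.
calibration_dataset = nncf.Dataset([torch.randn(1, 3, 224, 224) for _ in range(4)])

# Quantize with OpenVINOQuantizer; X86InductorQuantizer can be used analogously.
quantized_model = quantize_pt2e(fx_model, OpenVINOQuantizer(), calibration_dataset)
```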

Fixes:

  • (ONNX) Fixed sporadic accuracy issues with the BiasCorrection algorithm.
  • (ONNX) Fixed GroupConvolution operation weight quantization, which also improves performance for a number of models.
  • Fixed the AccuracyAwareQuantization algorithm to resolve issue #3118.
  • Fixed an issue with using NNCF when a backend framework installation is potentially corrupted.

Improvements:

  • (TorchFX, Experimental) Added YoloV11 support.
  • (OpenVINO) Improved the performance of the FastBiasCorrection algorithm.
  • Significantly faster data-free weight compression for OpenVINO models: INT4 compression is now up to 10x faster, while INT8 compression is up to 3x faster. The larger the model, the greater the time reduction.
  • AWQ weight compression is now up to 2x faster, improving overall runtime efficiency.
  • Peak memory usage during INT4 data-free weight compression in the OpenVINO backend is reduced by up to 50% for certain models.

Deprecations/Removals:

  • (TensorFlow) The nncf.tensorflow.create_compressed_model() method is now deprecated. Please use nncf.quantize() for quantization initialization instead; a migration sketch follows.
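
A rough migration sketch follows; the placeholder model and data mirror the QAT example above, and the deprecated call is shown commented out since it requires a legacy NncfConfig.

```python
import nncf
import tensorflow as tf

model = tf.keras.applications.MobileNetV2(weights=None)  # placeholder
data = tf.data.Dataset.from_tensor_slices(tf.random.uniform([8, 224, 224, 3])).batch(1)
calibration_dataset = nncf.Dataset(data)

# Before (deprecated), config-driven initialization:
#   from nncf.tensorflow import create_compressed_model
#   compression_ctrl, compressed_model = create_compressed_model(model, nncf_config)

# After: a single call with a calibration dataset.
quantized_model = nncf.quantize(model, calibration_dataset)
```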

Requirements:

  • Updated the minimum supported numpy version (>=1.24.0).
  • Removed the tqdm dependency.

Acknowledgements

Thanks for contributions from the OpenVINO developer community:
@rk119
@devesh-2002