Developer Documentation
#######################

Read the following material as you learn how to use Neural Compressor.

Get Started
===========

* Transform introduces how to utilize Neural Compressor's built-in data processing and how to develop a custom data processing method.
* Dataset introduces how to utilize Neural Compressor's built-in dataset and how to develop a custom dataset.
* Metrics introduces how to utilize Neural Compressor's built-in metrics and how to develop a custom metric.
* UX is a web-based system used to simplify Neural Compressor usage.
* Intel oneAPI AI Analytics Toolkit Get Started Guide explains the AI Kit components, installation and configuration guides, and instructions for building and running sample apps.
* AI and Analytics Samples includes code samples for Intel oneAPI libraries.

.. toctree::
    :maxdepth: 1
    :hidden:

    transform.md
    dataset.md
    metric.md
    ux.md
    Intel oneAPI AI Analytics Toolkit Get Started Guide <https://software.intel.com/content/www/us/en/develop/documentation/get-started-with-ai-linux/top.html>
    AI and Analytics Samples <https://github.com/oneapi-src/oneAPI-samples/tree/master/AI-and-Analytics>

Deep Dive
=========

* Quantization refers to processes that enable inference and training by performing computations in low-precision data types, such as fixed-point integers. Neural Compressor supports Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT). Note that Dynamic Quantization currently has limited support.
* Pruning provides a common method for introducing sparsity in weights and activations.
* Benchmarking introduces how to use the benchmark interface of Neural Compressor.
* Mixed precision introduces how to enable mixed precision, including BF16, INT8, and FP32, on Intel platforms during tuning.
* Graph Optimization introduces how to enable graph optimization for FP32 and auto-mixed precision.
* `Model Conversion <model_conversion.md>`_ introduces how to convert a TensorFlow QAT model to a quantized model that runs on Intel platforms.
* TensorBoard provides tensor histograms and execution graphs for debugging during tuning.

.. toctree::
    :maxdepth: 1
    :hidden:

    Quantization.md
    PTQ.md
    QAT.md
    dynamic_quantization.md
    pruning.md
    benchmark.md
    mixed_precision.md
    graph_optimization.md
    model_conversion.md
    tensorboard.md

Advanced Topics
===============

* Adaptor is the interface between Neural Compressor and a deep learning framework. It also introduces how to develop an adaptor extension, using ONNX Runtime as an example.
* Tuning strategies automatically optimize low-precision recipes for deep learning models to meet objectives such as inference performance and memory usage while satisfying the expected accuracy criteria. It also introduces how to develop a new strategy.

.. toctree::
    :maxdepth: 1
    :hidden:

    adaptor.md
    tuning_strategies.md
