This example demonstrates how to quantize models in the TensorFlow SavedModel format with Intel® Neural Compressor, for performance only. It can run on Intel CPUs and GPUs.
# Install Intel® Neural Compressor
pip install neural-compressor

# Install Intel® TensorFlow
pip install intel-tensorflow
Note: supported TensorFlow versions are >= 2.4.0.
# Install Intel® Extension for TensorFlow

Intel® Extension for TensorFlow must be installed to quantize the model on Intel GPUs:
pip install --upgrade intel-extension-for-tensorflow[gpu]
For more details, please follow the procedure in install-gpu-drivers.

Intel® Extension for TensorFlow for Intel CPUs is currently experimental and is not required for quantizing the model on Intel CPUs:
pip install --upgrade intel-extension-for-tensorflow[cpu]
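After installation, a quick sanity check can confirm that both packages are importable. This is a minimal sketch; it assumes only the package's documented import name intel_extension_for_tensorflow.

```python
import tensorflow as tf
import intel_extension_for_tensorflow as itex

# Print versions to confirm TensorFlow and the extension are installed and importable.
print("TensorFlow version:", tf.__version__)
print("Intel Extension for TensorFlow version:", itex.__version__)
```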
# Prepare Pretrained Model

Download the model from TensorFlow Hub (e.g. an image recognition model) and save it locally in the SavedModel format, as in the sketch below.
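A minimal sketch of preparing ./SavedModel with the tensorflow_hub package (assumed to be installed); the MobileNet V2 hub handle below is only an example and can be swapped for any other image recognition model.

```python
import tensorflow as tf
import tensorflow_hub as hub

# Example TF-Hub handle (assumption); replace it with the image recognition
# model you actually want to quantize.
HUB_URL = "https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/classification/5"

# Wrap the hub module in a Keras model and export it as a SavedModel directory.
model = tf.keras.Sequential([hub.KerasLayer(HUB_URL)])
model.build([None, 224, 224, 3])  # NHWC input shape expected by MobileNet V2
model.save("./SavedModel")        # with TF 2.x Keras, a directory path is saved in SavedModel format
```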
In the examples directory, mobilenet_v1.yaml, mobilenet_v2.yaml and efficientnet_v2_b0.yaml are provided for tuning the model on Intel CPUs; in these files 'framework' is set to 'tensorflow'. To run this example on Intel GPUs, 'framework' should be set to 'tensorflow_itex' and 'device' in the yaml file should be set to 'gpu'; mobilenet_v1_itex.yaml, mobilenet_v2_itex.yaml and efficientnet_v2_b0_itex.yaml are prepared for the GPU case. Most items can be removed from the yaml, keeping only the mandatory ones for tuning. A calibration dataloader is implemented, and the evaluation field is set so that the evaluation function is created inside neural_compressor. The sketch below shows how these pieces are consumed.
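For illustration only, here is a minimal sketch of the yaml-driven flow that run_tuning.sh drives, assuming the 1.x-style neural_compressor.experimental API; the yaml and model paths are the ones used in this example.

```python
from neural_compressor.experimental import Quantization, common

# The yaml carries framework/device, the calibration dataloader and the evaluation field.
quantizer = Quantization("./mobilenet_v2.yaml")
quantizer.model = common.Model("./SavedModel")  # the SavedModel downloaded above

# Calibrate, tune and return the quantized model, then save it for later use.
q_model = quantizer.fit()
q_model.save("./nc_SavedModel")
```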
# Run Command

Run tuning to produce the quantized SavedModel:
bash run_tuning.sh --config=./config.yaml --input_model=./SavedModel --output_model=./nc_SavedModel

Run benchmarking in performance mode:
bash run_benchmark.sh --config=./config.yaml --input_model=./SavedModel --mode=performance
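Similarly, a minimal sketch of what run_benchmark.sh drives, again assuming the 1.x-style neural_compressor.experimental API:

```python
from neural_compressor.experimental import Benchmark, common

# The same yaml supplies the dataloader and the benchmark settings.
evaluator = Benchmark("./mobilenet_v2.yaml")
evaluator.model = common.Model("./SavedModel")  # the model passed as --input_model
evaluator("performance")  # measure performance only, matching --mode=performance
```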