This module is an extension for the Hugging Face Optimum library and brings an OpenVINO™ backend for Hugging Face Transformers 🤗.
This project provides APIs to enable the following tools for use with Hugging Face models:
- OpenVINO Runtime
- Neural Network Compression Framework (NNCF)
Supported Python versions: 3.7, 3.8, 3.9.
Install openvino-optimum runtime:
pip install openvino-optimum
or with all dependencies (nncf and openvino-dev):
pip install openvino-optimum[all]
Since this module is under active development, we recommend installing the latest version from GitHub:
pip install --upgrade "git+https://github.com/openvinotoolkit/openvino_contrib.git#egg=openvino-optimum&subdirectory=modules/optimum"
or with all dependencies:
pip install --upgrade "git+https://github.com/openvinotoolkit/openvino_contrib.git#egg=openvino-optimum[all]&subdirectory=modules/optimum"
To use PyTorch or TensorFlow models, these frameworks must be installed as well, for example with pip install torch==1.9.* or pip install tensorflow==2.5.*. TensorFlow models additionally require openvino-dev, which can be installed separately with pip install openvino-dev or by choosing the all dependencies option.
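Before loading a model it can help to confirm which frameworks are present in the current environment. The following is a minimal sketch using only the standard library; the helper name installed_frameworks is hypothetical and is not part of openvino-optimum.

```python
import importlib.util


def installed_frameworks(names=("torch", "tensorflow")):
    """Map each top-level package name to True if it can be found.

    find_spec only locates the package; it does not import it, so the
    check stays cheap even for large frameworks.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Report which source frameworks are usable in this environment.
for name, present in installed_frameworks().items():
    print(f"{name}: {'installed' if present else 'missing'}")
```

If torch is reported missing, PyTorch checkpoints cannot be converted; the same holds for tensorflow and TensorFlow checkpoints.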
This module provides an inference API for Hugging Face models. It is possible to use PyTorch or TensorFlow models, or to use the native OpenVINO IR format (a pair of files, ov_model.xml and ov_model.bin). When using PyTorch or TensorFlow models, openvino-optimum converts the model in the background for use with OpenVINO Runtime.
To use the OpenVINO backend, import one of the AutoModel classes with an OV prefix. Specify a model name or local path in the from_pretrained method. When specifying a model name from Hugging Face's Model Hub, for example bert-base-uncased, the model will be downloaded and converted in the background. To load an OpenVINO IR file, <name_or_path> should be the directory that contains ov_model.xml and ov_model.bin. If this directory does not contain a configuration file, a config parameter should also be specified.
from optimum.intel.openvino import OVAutoModel
# PyTorch trained model with OpenVINO backend
model = OVAutoModel.from_pretrained(<name_or_path>, from_pt=True)
# TensorFlow trained model with OpenVINO backend
model = OVAutoModel.from_pretrained(<name_or_path>, from_tf=True)
# Initialize a model from OpenVINO IR
from transformers import AutoConfig
config = AutoConfig.from_pretrained(model_name)
model = OVAutoModel.from_pretrained(<name_or_path>, config=config)
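Whether the config parameter is needed depends on what the IR directory contains. The sketch below checks a directory before it is passed to from_pretrained. It uses only the standard library; the helper name inspect_ir_directory is hypothetical, and config.json is assumed to be the configuration file name, following the usual Transformers convention.

```python
from pathlib import Path

# File names of the OpenVINO IR pair, as described above.
IR_FILES = ("ov_model.xml", "ov_model.bin")
# Assumption: the configuration file uses the standard Transformers name.
CONFIG_FILE = "config.json"


def inspect_ir_directory(path):
    """Report whether `path` holds a complete IR pair and a config file."""
    directory = Path(path)
    missing = [name for name in IR_FILES if not (directory / name).is_file()]
    return {
        "is_ir_directory": not missing,
        "missing_files": missing,
        "needs_config": not (directory / CONFIG_FILE).is_file(),
    }
```

When needs_config is True, pass a config object explicitly, as in the last example above.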
To save a model that was loaded from PyTorch or TensorFlow to OpenVINO's IR format, use the save_pretrained() method:
model.save_pretrained(<model_directory>)
ov_model.xml and ov_model.bin will be saved in model_directory.
For a complete example of running inference on a Hugging Face model with openvino-optimum, see the Fill-Mask demo.
To use the NNCF component, install openvino-optimum with the [nncf] or [all] extras:
pip install openvino-optimum[nncf]
NNCF applies compression features such as quantization and pruning during model training. To enable NNCF in your training pipeline, follow these steps:
- Import NNCFAutoConfig:
from optimum.intel.nncf import NNCFAutoConfig
NOTE: NNCFAutoConfig must be imported before transformers for the integration to take effect.
- Initialize an NNCF configuration object from a .json file:
nncf_config = NNCFAutoConfig.from_json(training_args.nncf_config)
- Pass the NNCF configuration to the Trainer object. For example:
model = AutoModelForQuestionAnswering.from_pretrained(<name_or_path>)
...
trainer = QuestionAnsweringTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset if training_args.do_train else None,
    eval_dataset=eval_dataset if training_args.do_eval else None,
    eval_examples=eval_examples if training_args.do_eval else None,
    tokenizer=tokenizer,
    data_collator=data_collator,
    post_process_function=post_processing_function,
    compute_metrics=compute_metrics,
    nncf_config=nncf_config,
)
NOTE: The NNCF module is independent of the Runtime module. The model class for NNCF should be a regular Transformers model, not an OVAutoModel.
Example config files can be found in the nncf/configs directory in this repository.
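For orientation, the sketch below writes a small quantization config to a .json file. The field names (input_info, sample_size, type, compression, algorithm) follow the NNCF JSON config layout as commonly used for BERT-like models, but they are assumptions here; the example files in nncf/configs remain the authoritative reference. The helper name write_quantization_config and its default values are hypothetical.

```python
import json
from pathlib import Path


def write_quantization_config(path, sequence_length=128, num_inputs=3):
    """Write a minimal NNCF-style quantization config and return it as a dict.

    Assumption: the model takes `num_inputs` integer tensors of shape
    [1, sequence_length] (for example input ids, attention mask and
    token type ids).
    """
    config = {
        "input_info": [
            {"sample_size": [1, sequence_length], "type": "long"}
            for _ in range(num_inputs)
        ],
        "compression": {"algorithm": "quantization"},
    }
    Path(path).write_text(json.dumps(config, indent=4))
    return config
```

The resulting file path is what NNCFAutoConfig.from_json expects as its argument.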
Training examples can be found in the Transformers library. To use them with NNCF, modify the code to add nncf_config as outlined above, and add --nncf_config with the path to the NNCF config file when training your model. For example:
python examples/pytorch/token-classification/run_ner.py --model_name_or_path bert-base-cased --dataset_name conll2003 --output_dir bert_base_cased_conll_int8 --do_train --do_eval --save_strategy epoch --evaluation_strategy epoch --nncf_config nncf_bert_config_conll.json
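When several training runs are launched from a script, the same command can be assembled programmatically. This is a minimal sketch using only the standard library; the helper name build_training_command is hypothetical, and the option names are copied from the command above.

```python
import shlex


def build_training_command(script, nncf_config, flags=(), **options):
    """Build an argument list for a Transformers example script.

    `options` become "--name value" pairs, `flags` become bare "--name"
    switches, and "--nncf_config" is always appended last.
    """
    command = ["python", script]
    for name, value in options.items():
        command += [f"--{name}", str(value)]
    command += [f"--{flag}" for flag in flags]
    command += ["--nncf_config", str(nncf_config)]
    return command


command = build_training_command(
    "examples/pytorch/token-classification/run_ner.py",
    "nncf_bert_config_conll.json",
    flags=("do_train", "do_eval"),
    model_name_or_path="bert-base-cased",
    dataset_name="conll2003",
    output_dir="bert_base_cased_conll_int8",
    save_strategy="epoch",
    evaluation_strategy="epoch",
)
# Quote each argument so the printed line can be pasted into a shell.
print(" ".join(shlex.quote(part) for part in command))
```

The list form can be passed directly to subprocess.run without going through a shell.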
More command line examples with Hugging Face demos can be found in the NNCF repository. Note that the installation steps and patching the repository are not necessary when using the NNCF integration in openvino-optimum.
See the Changelog page for details about module development.
*Other names and brands may be claimed as the property of others.