Olive-ai 0.5.0
Examples
The following examples have been added:
- Audio Spectrogram Transformer optimization #762
- Bert SNPE #925
- Llama2 GenAI #940
- Llama2 notebook tutorial #798
- MobileNet optimization with QDQ Quantization on Qualcomm NPU #874
- Phi2 Generation #979
- Phi2 optimization with different precision #938
- Stable Diffusion OpenVINO example #853
Passes (optimization techniques)
New Passes
- PyTorch
  - Introduce `GenAIModelExporter` pass to export a PyTorch model using the GenAI exporter.
  - Introduce `LoftQ` pass, which performs model fine-tuning using the LoftQ initialization proposed in https://arxiv.org/abs/2310.08659.
- ONNXRuntime
  - Introduce `DynamicToFixedShape` pass to convert dynamic shapes to fixed shapes in an ONNX model.
  - Introduce `OnnxOpVersionConversion` pass to convert an existing ONNX model to another target opset (see the sketch after this list).
  - [QNN-EP] Add the `prepare_qnn_config: bool` option for quantization under QNN-EP, where int16/uint16 are supported for both weights and activations.
  - [QNN-EP] Introduce `QNNPreprocess` pass to preprocess the model before quantization.
- QNN
  - Introduce `QNNConversion` pass to convert a model to a QNN C++ model.
  - Introduce `QNNContextBinaryGenerator` pass to generate a context binary from a compiled model library using a specific backend.
  - Introduce `QNNModelLibGenerator` pass to compile the C++ model into a model library for the desired target.
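
New passes are enabled the same way as existing ones: by adding an entry to the `passes` section of a workflow config. The sketch below wires up `OnnxOpVersionConversion`; it is a minimal illustration that assumes the standard workflow layout, and the `target_opset` option name is an assumption rather than a documented parameter.

```python
# Minimal sketch: running the new OnnxOpVersionConversion pass on an ONNX
# model. The "target_opset" option name is an assumption for illustration;
# consult the pass documentation for the exact schema.
from olive.workflows import run as olive_run

config = {
    "input_model": {
        "type": "ONNXModel",
        "config": {"model_path": "model.onnx"},
    },
    "passes": {
        "opset_conversion": {
            "type": "OnnxOpVersionConversion",
            "config": {"target_opset": 17},  # assumed option name
        }
    },
    "engine": {"output_dir": "outputs"},
}

olive_run(config)
```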
Updates
- OnnxConversion
  - Support both `past_key_values.index.key/value` and `past_key_value.index`.
- OptimumConversion
  - Provide the `components` parameter for users who want to export only some models, such as `decoder_model` and `decoder_with_past_model` (see the sketch after this list).
  - Use the default exporter args and behavior of the underlying optimum version. For versions 1.14.0+, this means `legacy=False` and `no_post_process=False`; users must provide them via `extra_args` if legacy behavior is desired.
- OpenVINO
  - Upgrade the OpenVINO API to 2023.2.0.
- OrtPerfTuning
  - Add `tunable_op_enable` and `tunable_op_tuning_enable` for the ROCm EP to speed up performance.
- LoRA/QLoRA
  - Support bfloat16 with ort-training.
  - Support resuming training from a checkpoint via the `resume_from_checkpoint` and `overwrite_output_dir` options.
- MoEExpertsDistributor
  - Add an option to configure the number of parallel jobs.
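
For the OptimumConversion updates above, a pass entry might look like the following. The `components` and `extra_args` names come from these notes; the values and surrounding layout are illustrative assumptions.

```python
# Sketch of an OptimumConversion pass entry using the new "components"
# parameter and "extra_args". Option names are from the release notes;
# the values and surrounding layout are illustrative assumptions.
optimum_conversion = {
    "type": "OptimumConversion",
    "config": {
        # export only the listed models
        "components": ["decoder_model", "decoder_with_past_model"],
        # restore the pre-1.14.0 exporter behavior if desired
        "extra_args": {"legacy": True, "no_post_process": True},
    },
}
```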
Engine
- For Zipfile packaging, add a model rank JSON file. This file ranks all output models from the different EPs and includes each model's `model_config` and metrics.
- Add the Auto Optimizer, a tool that automatically searches for a combination of Olive passes (see the sketch below).
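
These notes do not spell out the Auto Optimizer's configuration surface. As a rough, hypothetical sketch, one might omit the explicit `passes` section and let the optimizer search; the `auto_optimizer_config` key and its `precisions` field below are assumptions, not confirmed API.

```python
# Hypothetical sketch: no explicit "passes" section, so the Auto Optimizer
# searches for a pass combination. "auto_optimizer_config" and "precisions"
# are assumed names, not confirmed from these notes.
config = {
    "input_model": {
        "type": "PyTorchModel",
        "config": {"hf_config": {"model_name": "gpt2", "task": "text-generation"}},
    },
    "auto_optimizer_config": {"precisions": ["fp16"]},  # assumed field
    "engine": {"output_dir": "outputs"},
}
```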
System
- Add `hf_token` support for Olive systems.
- AzureMLSystem
  - The Olive config file is now uploaded to AML jobs under the codes folder.
  - Support adding tags to the AML jobs.
  - Support using an existing AML workspace Environment for AzureMLSystem.
- DockerSystem
  - Support running Olive passes.
- `PythonEnvironmentSystem` now requires Olive to be installed in the environment. It can run passes and evaluate models.
- New `IsolatedORTSystem` introduced that only supports evaluation of ONNX models. It requires onnxruntime to be installed in the environment and can be used for packages like onnxruntime-qnn, which can only be run in a Windows ARM64 Python environment (see the sketch after this list).
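
As a rough illustration, the two environment-based systems above might be declared in the `systems` section of a workflow config as follows. The `type` strings and the `python_environment_path` field are assumptions for illustration; `hf_token` is the new option named above.

```python
# Sketch of "systems" entries reflecting the changes above. The "type"
# strings and "python_environment_path" field are assumptions for
# illustration; "hf_token" is the new option from these notes.
systems = {
    "python_env": {
        "type": "PythonEnvironment",
        "config": {
            "python_environment_path": "/path/to/venv/bin",  # Olive must be installed here
            "hf_token": True,  # new hf_token support
        },
    },
    "isolated_ort": {
        # evaluation-only, e.g. onnxruntime-qnn on Windows ARM64
        "type": "IsolatedORT",
        "config": {"python_environment_path": "C:/qnn-venv/Scripts"},
    },
}
```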
Data
- Add AML resource support for data configs.
- Add audio classification data preprocess function.
Model
- Rename `model_loading_args` to `from_pretrained_args` in `hf_config` (see the sketch below).
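
A minimal sketch of the renamed field in an input model's `hf_config`; the model name and `torch_dtype` value are placeholders.

```python
# Sketch: "from_pretrained_args" replaces the old "model_loading_args" in
# hf_config. The model name and torch_dtype value are placeholders.
input_model = {
    "type": "PyTorchModel",
    "config": {
        "hf_config": {
            "model_name": "meta-llama/Llama-2-7b-hf",
            "task": "text-generation",
            "from_pretrained_args": {"torch_dtype": "float16"},  # was model_loading_args
        }
    },
}
```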
Metrics
- Add `throughput` metric support (see the sketch below).
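
A hedged sketch of what a throughput metric entry might look like in an evaluator config; the sub-type name `avg` mirrors the latency metric and is an assumption.

```python
# Sketch of an evaluator metric entry using the new "throughput" type.
# The sub-type name "avg" mirrors the latency metric and is an assumption.
throughput_metric = {
    "name": "throughput",
    "type": "throughput",
    "sub_types": [{"name": "avg", "priority": 1}],
}
```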
Dependencies:
Support onnxruntime 1.17.1.