Release 2024.3.0 · openvinotoolkit/openvino

Summary of major features and improvements  

More Gen AI coverage and framework integrations to minimize code changes
- OpenVINO pre-optimized models are now available in Hugging Face making it easier for developers to get started with these models.
Broader Large Language Model (LLM) support and more model compression techniques.
- Significant improvement in LLM performance on Intel discrete GPUs with the addition of Multi-Head Attention (MHA) and OneDNN enhancements.
More portability and performance to run AI at the edge, in the cloud, or locally.
- Improved CPU performance when serving LLMs with the inclusion of vLLM and continuous batching in the OpenVINO Model Server (OVMS). vLLM is an easy-to-use open-source library that supports efficient LLM inferencing and model serving.
- Ubuntu 24.04 long-term support (LTS), 64-bit (Kernel 6.8+) (preview support)

Support Change and Deprecation Notices

Using deprecated features and components is not advised. They are available to enable a smooth transition to new solutions and will be discontinued in the future. To keep using discontinued features, you will have to revert to the last LTS OpenVINO version supporting them. For more details, refer to the OpenVINO Legacy Features and Components page.
Discontinued in 2024.0:
- Runtime components:
  - Intel® Gaussian & Neural Accelerator (Intel® GNA)..Consider using the Neural Processing Unit (NPU) for low-powered systems like Intel® Core™ Ultra or 14th generation and beyond.
  - OpenVINO C++/C/Python 1.0 APIs (see 2023.3 API transition guide for reference).
  - All ONNX Frontend legacy API (known as ONNX_IMPORTER_API)
  - 'PerfomanceMode.UNDEFINED' property as part of the OpenVINO Python API
- Tools:
  - Deployment Manager. See installation and deployment guides for current distribution options.
  - Accuracy Checker.
  - Post-Training Optimization Tool (POT). Neural Network Compression Framework (NNCF) should be used instead.
  - A Git patch for NNCF integration with huggingface/transformers. The recommended approach is to use huggingface/optimum-intel for applying NNCF optimization on top of models from Hugging Face.
  - Support for Apache MXNet, Caffe, and Kaldi model formats. Conversion to ONNX may be used as a solution.
Deprecated and to be removed in the future:
- The OpenVINO™ Development Tools package (pip install openvino-dev) will be removed from installation options and distribution channels beginning with OpenVINO 2025.0.
- Model Optimizer will be discontinued with OpenVINO 2025.0. Consider using the new conversion methods instead. For more details, see the model conversion transition guide.
- OpenVINO property Affinity API will be discontinued with OpenVINO 2025.0. It will be replaced with CPU binding configurations (ov::hint::enable_cpu_pinning).
- OpenVINO Model Server components:
  - “auto shape” and “auto batch size” (reshaping a model in runtime) will be removed in the future. OpenVINO’s dynamic shape models are recommended instead.
- A number of notebooks have been deprecated. For an up-to-date listing of available notebooks, refer to the OpenVINO™ Notebook index (openvinotoolkit.github.io).

You can find OpenVINO™ toolkit 2024.3 release here:

Download archives* with OpenVINO™
Install it via Conda: conda install -c conda-forge openvino=2024.3.0
OpenVINO™ for Python: pip install openvino==2024.3.0

Acknowledgements

Thanks for contributions from the OpenVINO developer community:
@rghvsh
@PRATHAM-SPS
@duydl
@awayzjj
@jvr0123
@inbasperu
@DannyVlasenko
@amkarn258
@kcin96
@Vladislav-Denisov

Release documentation is available here: https://docs.openvino.ai/2024
Release Notes are available here: https://docs.openvino.ai/2024/about-openvino/release-notes-openvino.html

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

2024.3.0

Summary of major features and improvements

More Gen AI coverage and framework integrations to minimize code changes

Broader Large Language Model (LLM) support and more model compression techniques.

More portability and performance to run AI at the edge, in the cloud, or locally.

Support Change and Deprecation Notices

Contributors