Summary of major features and improvements
- More integrations, minimizing code changes
- Now you can load TensorFlow and TensorFlow Lite models directly in OpenVINO Runtime and OpenVINO Model Server; models are converted automatically. For maximum performance, it is still recommended to convert to the OpenVINO Intermediate Representation (IR) format before loading the model. Additionally, we’ve introduced similar functionality for PyTorch models as a preview feature, so you can convert PyTorch models directly without first exporting to ONNX (see the sketches after this list).
- Support for Python 3.11
- NEW: C++ developers can now install OpenVINO Runtime from Conda Forge
- NEW: ARM processors are now supported in the CPU plugin, including dynamic shapes, full processor performance, and broad sample code/notebook coverage. Officially validated for Raspberry Pi 4 and Apple® Mac M1/M2
- Preview: a new Python API has been introduced that lets developers convert and optimize models directly from Python scripts (see the convert_model sketch after this list)
- Broader model support and optimizations
- Expanded model support for generative AI: CLIP, BLIP, Stable Diffusion 2.0, text processing models, and transformer models (e.g. S-BERT, GPT-J). Other notable additions include Detectron2, PaddleSlim, RNN-T, Segment Anything Model (SAM), Whisper, and YOLOv8.
- Initial support for dynamic shapes on GPU - you no longer need to switch to static shapes when leveraging the GPU, which is especially important for NLP models (see the dynamic-shapes sketch after this list).
- Neural Network Compression Framework (NNCF) is now the main quantization solution. You can use it for both post-training optimization and quantization-aware training (see the quantization sketch after this list). Try it out:
pip install nncf
- Portability and performance
- The CPU plugin now offers thread scheduling on 12th Gen Intel® Core™ and newer processors. You can choose to run inference on E-cores, P-cores, or both, depending on your application’s requirements, making it possible to optimize for performance or for power savings as needed (see the scheduling sketch after this list).
- NEW: Default Inference Precision - no matter which device you use, OpenVINO defaults to the format that gives that device its optimal performance, for example FP16 on GPU or BF16 on 4th Generation Intel® Xeon®. You no longer need to convert the model to a specific IR precision beforehand, and you still have the option of running in accuracy mode if needed (see the precision sketch after this list).
- Model caching on GPU is now improved, with more efficient model loading and compiling (see the caching sketch after this list).
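A minimal sketch of direct TensorFlow loading, assuming a placeholder model path; the exact set of accepted formats (SavedModel directory, frozen .pb, .tflite) is best checked in the documentation:

```python
import openvino.runtime as ov

core = ov.Core()

# Read a TensorFlow or TensorFlow Lite model directly; OpenVINO converts it
# on the fly, so no separate offline conversion step is needed.
model = core.read_model("saved_model_dir")  # placeholder path to a TF model

# Compile for the target device and run inference as usual.
compiled_model = core.compile_model(model, "CPU")
```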
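A sketch of the preview Python conversion API for PyTorch, assuming convert_model from openvino.tools.mo (shipped with openvino-dev) accepts a torch.nn.Module plus an example input; the torchvision model is only an illustration:

```python
import torch
import torchvision
from openvino.tools.mo import convert_model   # preview conversion API (openvino-dev)
from openvino.runtime import Core, serialize

# Convert a PyTorch module directly, without exporting to ONNX first.
pt_model = torchvision.models.resnet18(weights=None)
ov_model = convert_model(pt_model, example_input=torch.rand(1, 3, 224, 224))

# Optionally save the converted model as IR for faster loading later.
serialize(ov_model, "resnet18.xml")

compiled_model = Core().compile_model(ov_model, "CPU")
```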
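A sketch of GPU inference with dynamic shapes, assuming a single-input, NLP-style model; the path and dimensions are placeholders:

```python
import openvino.runtime as ov

core = ov.Core()
model = core.read_model("text_model.xml")  # placeholder single-input IR

# Mark batch size and sequence length as dynamic (-1) instead of reshaping
# the model to a fixed static shape for every input size.
model.reshape(ov.PartialShape([-1, -1]))

# The GPU plugin can now compile and run models with dynamic shapes directly.
compiled_model = core.compile_model(model, "GPU")
```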
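A sketch of NNCF post-training quantization; the calibration data below is synthetic purely to show the API shape, and the model path and input shape are placeholders:

```python
import numpy as np
import nncf
import openvino.runtime as ov

core = ov.Core()
model = core.read_model("model.xml")  # placeholder FP32 IR

# Synthetic calibration samples for illustration only; in practice, pass a few
# hundred representative samples and a transform function mapping them to model inputs.
calibration_data = [np.random.rand(1, 3, 224, 224).astype(np.float32) for _ in range(10)]
calibration_dataset = nncf.Dataset(calibration_data, lambda item: item)

# Default post-training 8-bit quantization.
quantized_model = nncf.quantize(model, calibration_dataset)
```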
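A sketch of choosing the core type for inference on hybrid CPUs; the SCHEDULING_CORE_TYPE property name and its PCORE_ONLY / ECORE_ONLY / ANY_CORE values are assumptions here, so verify them against the CPU plugin documentation:

```python
import openvino.runtime as ov

core = ov.Core()
model = core.read_model("model.xml")  # placeholder

# Assumed property name/values: pin inference to P-cores for performance;
# ECORE_ONLY would favor power savings, ANY_CORE leaves the choice to the runtime.
compiled_model = core.compile_model(model, "CPU", {"SCHEDULING_CORE_TYPE": "PCORE_ONLY"})
```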
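A sketch of overriding the default inference precision to run in accuracy mode, assuming the INFERENCE_PRECISION_HINT property accepts an explicit f32 value:

```python
import openvino.runtime as ov

core = ov.Core()
model = core.read_model("model.xml")  # placeholder

# By default the device picks its fastest precision (e.g. FP16 on GPU, BF16 on
# 4th Gen Xeon); forcing f32 keeps full accuracy at some performance cost.
compiled_model = core.compile_model(model, "GPU", {"INFERENCE_PRECISION_HINT": "f32"})
```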
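A sketch of enabling model caching through the CACHE_DIR property so compiled blobs are reused across runs; the directory path is a placeholder:

```python
import openvino.runtime as ov

core = ov.Core()
# Compiled models are stored in, and reloaded from, this directory on later runs,
# which cuts model load/compile time, especially on GPU.
core.set_property({"CACHE_DIR": "./model_cache"})

model = core.read_model("model.xml")  # placeholder
compiled_model = core.compile_model(model, "GPU")
```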
You can find the OpenVINO™ toolkit 2023.0 release here:
- Download archives* with OpenVINO™ Runtime for C/C++
- OpenVINO™ Runtime for Python:
pip install openvino==2023.0.0
- OpenVINO™ Development tools:
pip install openvino-dev==2023.0.0
Release Notes are available here: https://www.intel.com/content/www/us/en/developer/articles/release-notes/openvino-relnotes.html