New Features:
- Bfloat16 is now supported on CPUs with the AVX512_BF16 extension. Users can expect up to a 30% performance improvement for sparse FP32 networks and up to a 75% performance improvement for dense FP32 networks. This feature is opt-in and is specified with the `default_precision` parameter in the configuration file; a sketch of such a file follows this list.
- Several options can now be specified using a configuration file.
- Max and Min operators are now supported for performance (run with DeepSparse's optimized kernels).
- SQuAD 2.0 support provided.
- Multi-label and evaluation support added for NLP pipelines.
- A fraction-of-supported-operations property has been added to the engine class; see the sketch after this list.
- New ML Ops logging capabilities implemented, including metrics logging, custom functions, and Prometheus support.
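
The BF16 opt-in above goes through the new configuration file. The snippet below is only a minimal sketch of what such a file might contain: these notes name only the `default_precision` parameter, so the file layout and the `bfloat16` value shown are assumptions.

```yaml
# Hypothetical configuration file; only the `default_precision` key is named
# in these notes, so the layout and the value shown are assumptions.
default_precision: bfloat16
```

The new fraction-of-supported-operations value can be inspected on a compiled model. The sketch below assumes the property is exposed as `fraction_of_supported_ops` on the engine object (the exact attribute name may differ) and uses a placeholder model path and input shape.

```python
# Minimal sketch: ./model.onnx and the [1, 3, 224, 224] FP32 input shape are
# placeholders; `fraction_of_supported_ops` is an assumed property name.
import numpy as np
from deepsparse import compile_model

engine = compile_model("./model.onnx", batch_size=1)

# Fraction of the model's operations that DeepSparse runs with its optimized kernels.
print(engine.fraction_of_supported_ops)

# Run one inference to confirm the compiled engine works end to end.
outputs = engine.run([np.random.rand(1, 3, 224, 224).astype(np.float32)])
print(outputs[0].shape)
```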
Changes:
- Minimum Python version set to 3.7.
- The default logging level has been changed to `warn`.
- Timing functions and a default no-op deallocator have been added to improve usability of the C++ API.
- DeepSparse now allows the `axes` parameter to be specified either as an input or as an attribute in several ONNX operators; see the sketch after this list.
- Model compilation times have been improved on machines with many cores.
- YOLOv5 pipelines upgraded to the latest state from Ultralytics.
- Transformers pipelines upgraded to the latest state from Hugging Face.
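
To illustrate the `axes` change above, the sketch below builds the same reduction both ways with the standard `onnx` helper API. ReduceSum is used only as an example operator and the shapes are arbitrary; the notes do not list which operators are affected.

```python
# Minimal sketch using the onnx helper API; ReduceSum and the shapes here are
# illustrative assumptions, not a list of the operators DeepSparse changed.
import onnx
from onnx import TensorProto, helper

# Older style: `axes` given as a node attribute (e.g. ReduceSum in opset 11).
node_axes_as_attribute = helper.make_node(
    "ReduceSum", inputs=["x"], outputs=["y"], axes=[1], keepdims=1
)

# Newer style: `axes` given as a second input tensor (e.g. ReduceSum in opset 13).
axes_initializer = helper.make_tensor("axes", TensorProto.INT64, dims=[1], vals=[1])
node_axes_as_input = helper.make_node(
    "ReduceSum", inputs=["x", "axes"], outputs=["y"], keepdims=1
)

graph = helper.make_graph(
    [node_axes_as_input],
    "reduce_sum_axes_as_input",
    inputs=[helper.make_tensor_value_info("x", TensorProto.FLOAT, [2, 3])],
    outputs=[helper.make_tensor_value_info("y", TensorProto.FLOAT, [2, 1])],
    initializer=[axes_initializer],
)
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 13)])
onnx.checker.check_model(model)
```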
Resolved Issues:
- DeepSparse no longer crashes with an assertion failure for softmax operators on dimensions with a single element.
- DeepSparse no longer crashes with an assertion failure on some unstructured sparse quantized BERT models.
- The image classification evaluation script no longer crashes for larger batch sizes.
Known Issues:
- None