New Features:
- Multi-stream scheduler added as a configurable option to the engine.
Changes:
- Errors related to setting the NUMA memory policy are now issued as warnings.
- Improved compilation times for sparse networks.
- Improved performance for networks with large outputs and multi-socket machines, and for quantized ResNet-50 v1 and kernel-sparsity GEMMs.
- Optimized copy operations and the placement of quantization operations within the network.
- The version is now loaded from the version.py file, and the default build on branches is now nightly.
- The cpu.py file and related APIs are now part of the DeepSparse repo instead of being copied over from the backend.
- Installing on a non-Linux system now raises an unsupported-system error for end users.
- Quantized YOLOv3 at batch size 64 now has a 16% speedup in the DeepSparse Engine.
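
As a rough illustration of two of the packaging changes above (loading the version from a version.py file and failing early on non-Linux systems), here is a minimal sketch. It is not the actual DeepSparse setup code: the function names `load_version` and `check_supported_system`, the variable name `version`, and the use of `OSError` are all assumptions made for the example.

```python
import platform
import tempfile
from pathlib import Path


def load_version(version_file):
    """Read a `version` variable from a version.py-style file.

    Hypothetical helper: executes the file in an isolated namespace so the
    package itself does not need to be importable at build time.
    """
    namespace = {}
    exec(Path(version_file).read_text(), namespace)
    return namespace["version"]


def check_supported_system(system=None):
    """Raise an error on anything other than Linux (hypothetical check)."""
    system = system or platform.system()
    if system != "Linux":
        raise OSError(
            f"Unsupported system {system!r}: this sketch only accepts Linux."
        )
    return system


# Demonstrate load_version with a throwaway version file.
with tempfile.TemporaryDirectory() as tmp:
    version_path = Path(tmp) / "version.py"
    version_path.write_text('version = "0.0.0"\n')
    loaded = load_version(version_path)

print(loaded)
```

In a setup script, a check like `check_supported_system()` would typically run before any build step, so that users on unsupported platforms see a clear message rather than a later, less obvious failure.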
Resolved Issues:
- An assertion is no longer triggered when more sockets or threads than available are requested.
- An assertion is no longer triggered when performing Concat operations on constant buffers.
- The engine no longer crashes when the output of a QLinearMatMul operation has a dimension not divisible by 4.
- The engine now starts without crashing on Windows Subsystem for Linux, Docker for Windows, or Docker for Mac.
Known Issues:
- None