New Features:
- Tensor columns have been optimized, improving the performance of some networks.
  - Affected networks include, but are not limited to, pruned and quantized YOLOv5s and BERT.
  - The optimization applies to networks with subgraphs composed of low-compute operations.
  - Batch size must be a multiple of 16.
- Reduce operators have been further optimized in the Engine.
- C++ API support is available for the DeepSparse Engine.
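
The batch-size requirement for the tensor column optimization can be met by padding a batch up to the next multiple of 16. The following is a minimal sketch in plain Python, assuming the padding entries are discarded after inference. `round_up_to_multiple` and `pad_batch` are hypothetical helpers written for illustration; they are not part of the DeepSparse API.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def round_up_to_multiple(n: int, multiple: int = 16) -> int:
    """Return the smallest multiple of `multiple` that is >= n."""
    if n <= 0 or multiple <= 0:
        raise ValueError("n and multiple must be positive")
    return ((n + multiple - 1) // multiple) * multiple


def pad_batch(samples: Sequence[T], filler: T, multiple: int = 16) -> List[T]:
    """Append `filler` entries until the batch length is a multiple of `multiple`.

    The original samples keep their order, so the caller can slice the
    first len(samples) results back out after inference.
    """
    target = round_up_to_multiple(len(samples), multiple)
    return list(samples) + [filler] * (target - len(samples))
```

With this sketch, a batch of 20 samples would be padded to 32 entries, and only the first 20 outputs would be kept.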
Changes:
- Performance has been improved for low-precision (8-bit and 16-bit) datatypes on AVX2.
Resolved Issues:
- In rare cases, assertion errors occurred when several data arrangement operators (e.g., Reshape, Transpose, or Slice) appeared in a row.
- Assertion errors occurred when Pad operators were not followed by convolution or pooling.
- CPU threads migrated between cores when running benchmarks.
Known Issues:
- None