github neuralmagic/deepsparse v0.7.0
DeepSparse v0.7.0

New Features:

  • Operators optimized for Engine support:
    • Where*
    • Cast*
    • IntegerMatMul*
    • QLinearMatMul*
    • Gather (for scalar indices)
    *optimized only on machines with AVX-512 support
  • Flag created to disable any batch size overrides. Enable it by setting the environment variable NM_DISABLE_BATCH_OVERRIDE=1.
  • Warnings display when emulating quantized operations on machines without VNNI instructions.
  • Support added for Python 3.9.
  • Support added for ONNX versions 1.8 - 1.10.
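
As an illustration of the NM_DISABLE_BATCH_OVERRIDE flag listed above, here is a minimal Python sketch. Only the variable name and the value 1 come from these notes. Setting it before the engine is imported is an assumption about when the flag is read, and batch_override_disabled is a hypothetical helper, not part of DeepSparse.

```python
import os

# Assumption: the flag has to be in the environment before the engine is
# loaded, so set it ahead of any deepsparse import in the same process.
os.environ["NM_DISABLE_BATCH_OVERRIDE"] = "1"


def batch_override_disabled() -> bool:
    """Hypothetical helper: True when the flag from the notes is set to 1."""
    return os.environ.get("NM_DISABLE_BATCH_OVERRIDE") == "1"


if __name__ == "__main__":
    print(batch_override_disabled())
```

Exporting the variable in the shell that launches the process should have the same effect and avoids depending on import order.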

Changes:

  • Performance improvements made for sparse quantized transformer models.
  • Documentation updates made for examples/ultralytics-yolo to include YOLOv5.

Resolved Issues:

  • A crash could result from an uninitialized memory read. A check is now in place before the memory is accessed.
  • Engine output_shape functions corrected on multi-socket systems when the output dimensions are not statically known.

Known Issues:

  • BERT models with quantized embeddings currently segfault on AVX2 machines. The workaround is to run on a VNNI-compatible machine.
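
To tell whether a machine is VNNI-compatible or AVX2-only, one option on x86 Linux is to inspect the CPU flags in /proc/cpuinfo. The sketch below is not part of DeepSparse and all function names in it are hypothetical. It assumes the Linux flag names avx512_vnni, avx_vnni, avx512f and avx2, and it reports nothing useful on other operating systems or architectures.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set:
    """Collect the feature flags from /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def has_vnni(flags: set) -> bool:
    # Linux lists AVX-512 VNNI as "avx512_vnni" and AVX VNNI as "avx_vnni".
    return bool({"avx512_vnni", "avx_vnni"} & flags)


def has_avx512(flags: set) -> bool:
    # "avx512f" is the AVX-512 foundation flag.
    return "avx512f" in flags


def read_flags(path: str = "/proc/cpuinfo") -> set:
    """Read flags from the given file; empty set if it does not exist."""
    p = Path(path)
    return cpu_flags(p.read_text()) if p.exists() else set()


if __name__ == "__main__":
    flags = read_flags()
    print("AVX2:", "avx2" in flags)
    print("AVX-512:", has_avx512(flags))
    print("VNNI:", has_vnni(flags))
```

A machine that reports AVX2 without VNNI matches the configuration described in the known issue above.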
