github oneapi-src/oneDNN v1.7

3 years ago

Performance optimizations

  • Intel Processor Graphics and Xe architecture-based Graphics:
    • Improved performance of convolution and matmul primitives.
    • Improved performance of int8 convolutions for NHWC activations format.
  • Intel Architecture processors:
    • Improved performance of primitives for NHWC activations format.
    • Improved fp32 GEMM performance for small N.
    • Improved performance of int8 primitives for processors with Intel SSE4.1 instruction set support.
  • AArch64-based processors:
    • Added support for Arm Performance Library (ArmPL), which provides an optimized GEMM implementation for AArch64.
    • Added support for Arm Compute Library (ArmCL), which provides an optimized convolution implementation for AArch64.
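
Several items above refer to the NHWC activations format. As a rough illustration of what that layout means for memory access, the sketch below computes the flat offset of one element in a dense 4D tensor under NCHW and NHWC ordering. The function names are hypothetical and are not part of the oneDNN API; this is plain Python written only to show the index arithmetic.

```python
def offset_nchw(n, c, h, w, C, H, W):
    """Flat offset of element (n, c, h, w) in a dense NCHW tensor."""
    return ((n * C + c) * H + h) * W + w


def offset_nhwc(n, c, h, w, C, H, W):
    """Flat offset of the same logical element in a dense NHWC tensor."""
    return ((n * H + h) * W + w) * C + c


# Example dimensions (arbitrary, chosen only for illustration).
C, H, W = 3, 2, 2

# Stepping one channel at a fixed pixel:
# in NHWC the neighbouring channel is adjacent in memory,
# in NCHW it is a whole H*W plane away.
nhwc_channel_stride = offset_nhwc(0, 1, 0, 0, C, H, W) - offset_nhwc(0, 0, 0, 0, C, H, W)
nchw_channel_stride = offset_nchw(0, 1, 0, 0, C, H, W) - offset_nchw(0, 0, 0, 0, C, H, W)
```

In NHWC all channels of a pixel are contiguous, which is why kernels tuned for one layout do not automatically perform well on the other.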

New Functionality

  • Added support for IBM Z (s390x) and IBM POWER (powerpc64) architectures.
  • Introduced RNN GRU for GPU.
  • Introduced int8 RNN GRU for CPU.
  • Introduced asymmetric quantization support for convolutions and matmul.
  • Introduced dilated pooling support.
  • Extended matmul primitive to support multiple dimensions in batch and broadcast on CPU.
  • (preview) Introduced binary post-op for (de)-convolution, pooling, eltwise, binary, inner product, and matmul.
  • (preview) Extended the number of supported post-ops for primitives to 20.
  • (preview) Introduced reduction primitive for CPU. Together with post-ops, this functionality makes it possible to implement normalization.
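
The last item says that a reduction combined with post-ops is enough to build normalization. A minimal sketch of that decomposition follows, in plain Python rather than the oneDNN API: `reduce_mean`, `binary`, and `normalize` are hypothetical helper names, and the `eps` value is an assumption. Each row is reduced to a mean, the mean is subtracted with a broadcast binary operation, and the result is divided by the standard deviation obtained from a second reduction.

```python
import math


def reduce_mean(rows):
    """Reduction over the last axis: one mean per row."""
    return [sum(r) / len(r) for r in rows]


def binary(op, rows, per_row):
    """Apply a binary op between each element and its row's value (broadcast)."""
    return [[op(v, s) for v in r] for r, s in zip(rows, per_row)]


def normalize(rows, eps=1e-5):
    """Row-wise normalization built only from reductions and binary ops."""
    mean = reduce_mean(rows)                               # reduction 1
    centered = binary(lambda a, b: a - b, rows, mean)      # binary sub
    var = reduce_mean([[v * v for v in r] for r in centered])  # reduction 2
    std = [math.sqrt(v + eps) for v in var]
    return binary(lambda a, b: a / b, centered, std)       # binary div


out = normalize([[1.0, 2.0, 3.0], [10.0, 10.0, 10.0]])
```

Each output row has mean close to zero and variance close to one; a constant row maps to all zeros because `eps` keeps the divisor non-zero.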

Thanks to the contributors

This release contains contributions from the project core team as well as Ben Fitch, Brian Shi, David Edelsohn @edelsohn, Diana Bite @diaena, Moaz Reyad @moazreyad, Nathan John Sircombe @nSircombe, Niels Dekker @N-Dekker, Peter Caday @petercad, Pinzhen Xu @pinzhenx, pkubaj @pkubaj, Tsao Zhong @CaoZhongZ. We would also like to thank everyone who asked questions and reported issues.
