oneapi-src/oneDNN v1.6

Performance optimizations

Intel Architecture processors

  • Introduced initial int8 optimizations for future Intel Xeon Scalable processors (code name Sapphire Rapids). The functionality is disabled by default and should be enabled via the CPU dispatcher control (see the sketch after this list).
  • Improved matmul and inner product performance with bfloat16 data type.
  • Improved performance of tanh algorithm for eltwise primitive and LSTM cells.
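
A minimal sketch of the CPU dispatcher control mentioned above, assuming the C++ service API; the broadest setting is shown (cpu_isa::all, equivalent to DNNL_MAX_CPU_ISA=ALL in the environment), and the call has to happen before the first primitive is created:

```cpp
#include <dnnl.hpp>

int main() {
    // Lift the ISA cap to everything the library knows about. This is what
    // lets the dispatcher pick code paths that are not enabled by default,
    // such as the initial Sapphire Rapids int8 optimizations in this release.
    // Setting DNNL_MAX_CPU_ISA=ALL in the environment has the same effect
    // without recompiling.
    dnnl::set_max_cpu_isa(dnnl::cpu_isa::all);

    // The dispatcher still selects the best implementation that the machine
    // actually supports at run time.
    dnnl::engine eng(dnnl::engine::kind::cpu, 0);
    // ... create and execute int8 primitives as usual.
    return 0;
}
```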

Intel Processor Graphics and Xe architecture-based Graphics

  • Improved performance of Convolution, RNN, Inner Product and Matmul functionality for all supported GPUs.
  • Improved performance of int8 convolutions with activations in NHWC format for Xe architecture-based Graphics (code named DG1 and Tiger Lake); a setup sketch follows this list.
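
As a sketch of the int8 NHWC path referenced above, the snippet below describes an int8 convolution with NHWC activations on a GPU engine; the shapes are illustrative, the weight layout is left to the library via format_tag::any, and a real int8 deployment would also attach quantization scales through primitive attributes.

```cpp
#include <dnnl.hpp>
using namespace dnnl;

int main() {
    // Requires a supported GPU and driver; engine creation throws otherwise.
    engine eng(engine::kind::gpu, 0);
    stream s(eng);

    // Plain NHWC int8 activations; the weights layout is left to the library.
    memory::desc src_md({1, 64, 56, 56}, memory::data_type::s8,
            memory::format_tag::nhwc);
    memory::desc wei_md({64, 64, 3, 3}, memory::data_type::s8,
            memory::format_tag::any);
    memory::desc dst_md({1, 64, 56, 56}, memory::data_type::s8,
            memory::format_tag::nhwc);

    // 3x3 convolution, stride 1, padding 1 on each side.
    convolution_forward::desc cd(prop_kind::forward_inference,
            algorithm::convolution_direct, src_md, wei_md, dst_md,
            {1, 1}, {1, 1}, {1, 1});
    convolution_forward::primitive_desc cpd(cd, eng);
    auto conv = convolution_forward(cpd);

    // ... allocate memory objects from cpd.src_desc(), cpd.weights_desc(),
    // and cpd.dst_desc(), then execute conv on the stream.
    return 0;
}
```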

AArch64-based processors

  • Added support for the Arm Performance Libraries (ArmPL) to improve performance of functionality relying on GEMM (matmul, inner product, convolutions).

New Functionality

  • Introduced support for processors based on IBM POWER architecture.
  • Introduced Linear-Before-Reset GRU for GPU.
  • Extended the eltwise primitive with support for the round operation (see the sketch after this list).
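
A minimal sketch of the new round operation on the eltwise primitive, shown on a CPU engine for brevity; the tensor shape is illustrative and the alpha/beta parameters are unused by this algorithm.

```cpp
#include <dnnl.hpp>
using namespace dnnl;

int main() {
    engine eng(engine::kind::cpu, 0);
    stream s(eng);

    // A small 2D f32 tensor; round is applied element-wise.
    memory::desc md({8, 16}, memory::data_type::f32, memory::format_tag::ab);
    memory src_mem(md, eng), dst_mem(md, eng);

    // algorithm::eltwise_round rounds each element to the nearest integer;
    // alpha and beta are ignored and passed as 0.
    eltwise_forward::desc ed(prop_kind::forward_inference,
            algorithm::eltwise_round, md, 0.f, 0.f);
    eltwise_forward::primitive_desc epd(ed, eng);

    eltwise_forward(epd).execute(s,
            {{DNNL_ARG_SRC, src_mem}, {DNNL_ARG_DST, dst_mem}});
    s.wait();
    return 0;
}
```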

Usability

  • Reduced primitive creation time by enabling the OpenCL pre-compiled headers feature available in recent versions of the OpenCL driver.
  • Reduced the entitlement required on macOS with the hardened runtime to allow-jit.
  • Extended documentation on runtime and build-time controls for JIT profiler support, the primitive cache, CPU dispatcher controls, and verbose mode (see the sketch after this list).
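
As a small illustration of the runtime controls mentioned above, the sketch below turns on verbose mode programmatically; the same behavior is available without code changes through the DNNL_VERBOSE environment variable.

```cpp
#include <dnnl.hpp>

int main() {
    // Equivalent to setting DNNL_VERBOSE=1 in the environment: each primitive
    // execution is logged to stdout with the implementation name, shapes, and
    // timing, which is usually the first step in a performance investigation.
    dnnl::set_verbose(1);

    // CPU dispatcher control (DNNL_MAX_CPU_ISA) and JIT profiler integration
    // are controlled in a similar way; see the documentation pages referenced
    // in the note above.
    dnnl::engine eng(dnnl::engine::kind::cpu, 0);
    // ... create and execute primitives; verbose lines appear on stdout.
    return 0;
}
```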

Validation

  • Introduced a validation mode for out-of-memory situations.

Thanks to the contributors

This release contains contributions from the project core team as well as Alberto Gonzalez Palomo @AlbertoGP, Arthur Mitrano @aaraujom, Ilia Taraban @itaraban, Nathan John Sircombe @nSircombe, Peter Caday @petercad, Tsao Zhong @CaoZhongZ. We would also like to thank everyone who asked questions and reported issues.
