github libxsmm/libxsmm 1.12
Version 1.12

5 years ago

This release aims to improve usability and resolves several non-critical bugs. In addition, an implementation of the BLAS-like batched GEMM (?GEMM_BATCH) has been added. The interface is currently declared for C/C++ only. However, it can be called implicitly (in Fortran 77 style) or used by intercepting existing calls (static and dynamic linkage).

LIBXSMM has provided an interface for batched GEMMs for several versions. That interface supports arrays of pointers as well as arrays of indexes plus strides given in bytes, which allows operands to be extracted from arrays of structures (AoS). The new BLAS interface only supports plain arrays of pointers to the operand matrices, but it allows multiple groups of homogeneous batches. All batch interfaces are implemented in sequential (ST) and multi-threaded (MT) form, with synchronization in the MT case.
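To make the notion of "multiple groups of homogeneous batches" concrete, the following is a minimal reference sketch in plain C. It is not the LIBXSMM API: the names ref_dgemm_batch and demo_grouped_batch, the parameter order, and the restriction to non-transposed, column-major, double-precision operands are all assumptions made for illustration. The sketch assumes the usual BLAS-like convention that every matrix within a group shares one shape, one set of leading dimensions, and one alpha/beta pair, while the operand pointers are stored in flat arrays spanning all groups.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine (NOT the LIBXSMM API): computes
 * C := alpha * A * B + beta * C (column-major, no transposes) for every
 * matrix of every group. Per-group arguments are indexed by group;
 * the pointer arrays a, b, c are indexed by a running matrix index. */
static void ref_dgemm_batch(
    const int m[], const int n[], const int k[],
    const double alpha[], const double *const a[], const int lda[],
    const double *const b[], const int ldb[],
    const double beta[], double *const c[], const int ldc[],
    int group_count, const int group_size[])
{
  size_t idx = 0; /* running index into the arrays of pointers */
  int g, s, i, j, p;
  for (g = 0; g < group_count; ++g) {
    for (s = 0; s < group_size[g]; ++s, ++idx) {
      const double *const ai = a[idx];
      const double *const bi = b[idx];
      double *const ci = c[idx];
      for (j = 0; j < n[g]; ++j) {
        for (i = 0; i < m[g]; ++i) {
          double acc = 0.0;
          for (p = 0; p < k[g]; ++p) {
            acc += ai[i + (size_t)p * lda[g]] * bi[p + (size_t)j * ldb[g]];
          }
          ci[i + (size_t)j * ldc[g]] =
              alpha[g] * acc + beta[g] * ci[i + (size_t)j * ldc[g]];
        }
      }
    }
  }
}

/* Hypothetical demo: group 0 holds two 2x2 products, group 1 holds one
 * (1x3)*(3x1) product. Returns the single element of the last result. */
static double demo_grouped_batch(void)
{
  double a0[] = {1, 2, 3, 4}, b0[] = {1, 0, 0, 1}, c0[4] = {0};
  double a1[] = {1, 0, 0, 1}, b1[] = {5, 6, 7, 8}, c1[4] = {0};
  double a2[] = {1, 2, 3}, b2[] = {4, 5, 6}, c2[1] = {10};
  const double *a[] = {a0, a1, a2};
  const double *b[] = {b0, b1, b2};
  double *c[] = {c0, c1, c2};
  const int m[] = {2, 1}, n[] = {2, 1}, k[] = {2, 3};
  const int lda[] = {2, 1}, ldb[] = {2, 3}, ldc[] = {2, 1};
  const double alpha[] = {1.0, 2.0}, beta[] = {0.0, 1.0};
  const int group_size[] = {2, 1};

  ref_dgemm_batch(m, n, k, alpha, a, lda, b, ldb, beta, c, ldc,
                  2, group_size);

  assert(c0[3] == 4.0); /* A0 times identity leaves A0 unchanged */
  assert(c1[0] == 5.0); /* identity times B1 leaves B1 unchanged */
  return c2[0];         /* 2 * (1*4 + 2*5 + 3*6) + 1 * 10 */
}
```

The flat pointer arrays are what distinguish this form from the stride- and index-based batch interfaces mentioned above: no layout is implied between consecutive matrices, so each operand may live anywhere in memory. A multi-threaded variant would additionally have to synchronize whenever two entries of the C pointer array refer to the same matrix; this sequential sketch leaves that out.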

INTRODUCED

  • Interface and implementation of batched GEMMs (GEMM_BATCH).
  • TensorFlow wrapper code for LSTM operation.
  • Interceptor for GEMM_BATCH and GEMV.

IMPROVEMENTS / CHANGES

  • LSTM: enabled additional tensor formats for Bfloat16.
  • Validated with GNU GCC 9.1 release.

FIXES
