github libxsmm/libxsmm 1.4.2
Version 1.4.2

7 years ago

This release implements a number of features that are not critical for the core functionality, but either serve a request (API to get/set the target architecture, and MATMUL wrapper) or greatly improve usability for developers (JIT-Profiling, and Verbose Mode).

CHANGES

  • MATMUL-style routines (58fcb41 and e055d65) as a thin wrapper around GEMM (FORTRAN only)
  • Issue #75 (frontend function to bypass cpu-id and to set arch_id): API to get/set target architecture
  • Issue #76 (Support JIT-Profiling API): show JIT-kernel insights within Intel VTune Amplifier
  • Issue #78 (Introduce verbose mode): extended termination message (kernel statistics)

Besides the new features, there are two non-critical fixes. Issue #77 in fact led to a non-working call-wrapper mechanism for statically linked GEMM routines when using Intel MKL. The resolution not only fixes the problem, but also unifies the static call interception for all BLAS libraries (the documentation is updated accordingly). The other issue was that statically generated kernels were not registered on systems that do not support JIT code generation (pre-AVX era); the resolution includes fixes as well as an enhancement.

FIXES

  • Issue #77 (Statically wrapping GEMM calls does work as expected/documented)
  • Fixed registering statically generated code (de0af05, b05a02c, and f093543)

There is also an enhancement which became possible in version 1.4.1 (2230568); however, the size of GEMM descriptor entries was not reduced because the SIMD-padding was not updated (this applies to the code registry and the thread-local cache). Another enhancement (3610639), addressed along with Issue #75, extends the available code paths: AVX-512 is now handled in two flavors (MIC and CORE). This information is currently not used to generate different code (everything is AVX-512F, i.e., foundational instructions), but to eventually load different platform defaults.
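
The split between "same generated code" and "different platform defaults" can be pictured with a minimal sketch. Everything named here (`sketch_target`, `sketch_defaults`, `prefetch_distance`, and the numeric values) is hypothetical and only illustrates the idea; the release notes do not say which defaults differ, and this is not libxsmm's implementation.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical target ids; the names are illustrative, not libxsmm's constants. */
typedef enum sketch_target {
  SKETCH_AVX2,
  SKETCH_AVX512_MIC,
  SKETCH_AVX512_CORE
} sketch_target;

/* Hypothetical tuning knob; the release notes do not say which defaults differ. */
typedef struct sketch_defaults {
  int prefetch_distance; /* placeholder value, chosen only to differ per flavor */
} sketch_defaults;

/* Instruction set handed to the code generator: both AVX-512 flavors map to
 * the same foundational subset, so the generated code is identical. */
const char *sketch_codegen_isa(sketch_target target) {
  switch (target) {
    case SKETCH_AVX512_MIC:
    case SKETCH_AVX512_CORE: return "avx512f";
    case SKETCH_AVX2: return "avx2";
  }
  assert(!"unknown target");
  return "";
}

/* Platform defaults are looked up separately, so the two AVX-512 flavors
 * can diverge here without touching code generation. */
sketch_defaults sketch_platform_defaults(sketch_target target) {
  sketch_defaults d = { 0 };
  switch (target) {
    case SKETCH_AVX512_MIC: d.prefetch_distance = 2; break;
    case SKETCH_AVX512_CORE: d.prefetch_distance = 1; break;
    case SKETCH_AVX2: d.prefetch_distance = 0; break;
  }
  return d;
}
```

Keeping the two lookups separate means a later release could start generating flavor-specific code by changing only `sketch_codegen_isa`, which mirrors the "eventually" in the paragraph above.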

Note: the new API for getting/setting the target architecture was partly present in previous releases (getter). However, this release not only adds the setter functionality but also slightly changes (in an incompatible fashion) the previously implemented wrapper. The renamed getter function also comes with a renamed environment variable (c7ea23c: LIBXSMM_JIT has been renamed to LIBXSMM_TARGET).
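
As a rough illustration of how a getter/setter pair and an environment variable such as LIBXSMM_TARGET could interact, here is a self-contained sketch. All identifiers (`demo_arch`, `demo_set_target_archid`, `demo_init_target`, and so on) and the accepted name strings are assumptions made for this example; they are not the library's actual functions, constants, or accepted values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical target ids (not libxsmm's actual constants). */
typedef enum demo_arch {
  DEMO_ARCH_GENERIC = 0, /* pre-AVX: no JIT, statically generated kernels only */
  DEMO_ARCH_AVX,
  DEMO_ARCH_AVX2,
  DEMO_ARCH_AVX512_MIC,
  DEMO_ARCH_AVX512_CORE,
  DEMO_ARCH_COUNT
} demo_arch;

/* Hypothetical names; the index matches the enum above. */
static const char *const demo_arch_names[DEMO_ARCH_COUNT] = {
  "generic", "avx", "avx2", "avx512_mic", "avx512_core"
};

static demo_arch demo_target = DEMO_ARCH_GENERIC;

/* Setter: bypasses CPUID-based detection by storing the requested id. */
void demo_set_target_archid(demo_arch id) {
  assert((int)id >= 0 && (int)id < DEMO_ARCH_COUNT);
  demo_target = id;
}

/* Getters: numeric id and human-readable name of the current target. */
demo_arch demo_get_target_archid(void) { return demo_target; }
const char *demo_get_target_arch(void) { return demo_arch_names[demo_target]; }

/* Returns 1 and stores the id if the name is known, 0 otherwise. */
int demo_arch_from_name(const char *name, demo_arch *id) {
  int i;
  if (NULL == name) return 0;
  for (i = 0; i < DEMO_ARCH_COUNT; ++i) {
    if (0 == strcmp(name, demo_arch_names[i])) {
      *id = (demo_arch)i;
      return 1;
    }
  }
  return 0;
}

/* env_value stands for the value of the environment variable (may be NULL);
 * detected stands for what a CPUID probe would report. A recognized name
 * overrides detection; anything else falls back to the detected target. */
demo_arch demo_init_target(const char *env_value, demo_arch detected) {
  demo_arch requested;
  demo_set_target_archid(
    demo_arch_from_name(env_value, &requested) ? requested : detected);
  return demo_get_target_archid();
}
```

In a real program, `env_value` would typically come from `getenv("LIBXSMM_TARGET")`; passing it as a parameter keeps the sketch free of process state. A later call to the setter still wins over the environment, which is one plausible reading of "bypass cpu-id" in Issue #75.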
