github libxsmm/libxsmm 1.10
Version 1.10

5 years ago

Development accumulated many changes since the last release (v1.9), as this version (v1.10) kept slipping: validation was not able to keep up and had to start over several times. On the positive side, this allows calling it the "Supercomputing 2018 Edition", which is complemented by an updated list of references including the SC'18 paper "Anatomy of High-Performance Deep Learning Convolutions on SIMD Architectures". Among several external articles, the Parallel Universe Magazine published "LIBXSMM: An Open Source-Based Inspiration for Hardware and Software Development at Intel".

The intense development of LIBXSMM brought many improvements and detailed features across domains as well as end-to-end support for Bfloat16 in LIBXSMM's Deep Learning domain (DL). The latter can already be exercised with the GxM framework, which was added to the collection of sample codes. Testing and validation were updated for the latest compilers and upcoming Linux distributions. FreeBSD is now formally supported (previously it was only tested occasionally). RPM, Debian, and FreeBSD package updates will benefit from the smoothed default build targets and compiler flags.
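For readers unfamiliar with Bfloat16, the format keeps the sign bit, the 8 exponent bits, and the top 7 mantissa bits of an IEEE-754 single-precision value. The following is a minimal sketch of that conversion in plain C. It is not LIBXSMM's implementation; the names `bf16_t`, `f32_to_bf16`, and `bf16_to_f32` are hypothetical, and round-to-nearest-even is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type name: a Bfloat16 value stored as raw 16 bits. */
typedef uint16_t bf16_t;

/* Convert float32 to Bfloat16 (assumed rounding: round-to-nearest-even). */
static bf16_t f32_to_bf16(float value) {
  uint32_t bits;
  assert(sizeof(bits) == sizeof(value)); /* requires a 32-bit float */
  memcpy(&bits, &value, sizeof(bits));   /* type-pun without aliasing issues */
  if ((bits & 0x7FFFFFFFu) > 0x7F800000u) {
    /* NaN: truncate and force a mantissa bit so it stays a NaN */
    return (bf16_t)((bits >> 16) | 0x0040u);
  }
  /* Add half of the discarded range; ties go to the even result. */
  bits += 0x7FFFu + ((bits >> 16) & 1u);
  return (bf16_t)(bits >> 16);
}

/* Convert Bfloat16 back to float32 (exact: the low 16 bits become zero). */
static float bf16_to_f32(bf16_t value) {
  const uint32_t bits = (uint32_t)value << 16;
  float result;
  memcpy(&result, &bits, sizeof(result));
  return result;
}
```

Because Bfloat16 shares the exponent width of float32, the conversion back is lossless and the dynamic range is preserved; only mantissa precision is reduced.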

LIBXSMM supports "one build for all" while exploiting the available instruction set extensions (CPUID-based code dispatch). Developers may enjoy support for pkg-config (.pc files in the lib folder) for easier linkage when using the Classic ABI (e.g., PKG_CONFIG_PATH=/path/to/libxsmm/lib pkg-config libxsmm --libs).
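The idea behind CPUID-based code dispatch can be sketched as follows: a single binary carries several variants of a routine and picks one at runtime based on what the CPU reports. This is only an illustration and not how LIBXSMM implements it. The names `dot_generic`, `dot_unrolled`, `cpu_has_avx2`, and `select_dot` are hypothetical, the "specialized" variant is plain C standing in for a vectorized kernel, and the feature query assumes the GCC/Clang builtin `__builtin_cpu_supports` on x86 (other platforms fall back to the generic path).

```c
#include <stddef.h>

typedef float (*dot_fn)(const float*, const float*, size_t);

/* Baseline variant that runs on any CPU. */
static float dot_generic(const float* a, const float* b, size_t n) {
  float sum = 0.0f;
  size_t i;
  for (i = 0; i < n; ++i) sum += a[i] * b[i];
  return sum;
}

/* Stand-in for a kernel specialized for a wider instruction set. */
static float dot_unrolled(const float* a, const float* b, size_t n) {
  float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
  size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    s0 += a[i + 0] * b[i + 0];
    s1 += a[i + 1] * b[i + 1];
    s2 += a[i + 2] * b[i + 2];
    s3 += a[i + 3] * b[i + 3];
  }
  for (; i < n; ++i) s0 += a[i] * b[i]; /* remainder */
  return (s0 + s1) + (s2 + s3);
}

/* Query the CPU once; returns 0 where the builtin is unavailable. */
static int cpu_has_avx2(void) {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init();
  return __builtin_cpu_supports("avx2") ? 1 : 0;
#else
  return 0;
#endif
}

/* Select the variant on first use and cache the choice. */
static dot_fn select_dot(void) {
  static dot_fn selected = NULL;
  if (NULL == selected) {
    selected = cpu_has_avx2() ? dot_unrolled : dot_generic;
  }
  return selected;
}
```

A caller would write `float r = select_dot()(a, b, n);` and get the same result regardless of which variant was chosen, which is what makes a single build usable across CPU generations.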

THANK YOU FOR YOUR CONTRIBUTION - we had several direct (and indirect) contributions, reports, and involvement from people who came across the project. We would like to thank you all for the effort and time you spent working on Open Source!

INTRODUCED

  • Removed need to build LIBXSMM's static library in a special way for GEMM call-interception.
  • Moved some previously internal but generally useful code to the public interface (math etc.).
  • Initial support for handle-based "big" GEMM (revamped libxsmm_?gemm_omp).
  • Support transposed cases in libxsmm_?gemm_omp; not perf.-competitive yet.
  • Code samples accompanying article in the Parallel Universe magazine.
  • Fortran interface for some previously only C-exposed functions.
  • Support Intel C/C++ Compiler together with GNU Fortran.
  • Packed/SOA domain: expanded functionality (EDGE solver).
  • Deep Learning framework GxM (added as code sample).
  • RNNs, and LSTM/GRU-cell (driver code experimental).
  • End-to-end support for Bfloat16 (DL domain).
  • Fused batch-norm, and fully-connected layer.
  • Compact/packed TRSM kernels and interface.
  • Experimental TRMM code (no interface yet).
  • Support for pkg-config.

IMPROVEMENTS / CHANGES

  • Zero-mask unused register parts to avoid false positives with enabled FPEs (MM kernels).
  • Added libxsmm_ptrx helper to Fortran interface (works around C_LOC portability issue).
  • Mapped TF low-precision to appropriate types, mapped unknowns to DATATYPE_UNSUPPORTED.
  • Build banner with platform name, info about Intel VTune (available but JIT-profiling disabled).
  • Smoothed code base for most recent compilers (incl. improved target attribution).
  • Official packages for Debian, and FreeBSD (incl. OpenMP in libxsmm/ext for BSD).
  • LIBXSMM_DUMP environment var. writes MHD-files if libxsmm_matdiff is called.
  • Warn when libxsmm_release_kernel is called for a registered kernel.
  • Consolidated Deep Learning sample codes into one folder.
  • Revised default for AVX=3 (MIC=0 is now implicitly set).
  • LIBXSMM_TARGET: more keys count for AVX512/Core.
  • Updated TF integration/documentation.
  • Included workarounds for flang (LLVM).
  • Attempt to enable OpenMP with Clang.
  • Install header-only form (make install).
  • SpMDM code dispatch for AVX2.
  • Improved CI/test infrastructure.
  • Show hint if compilation fails.

FIXES

  • Properly dispatch CRC32 instruction (support older CPUs).
  • Fixed fallback of statically generated MM kernels (rare).
  • Remove temporary files that were previously dangling.
  • Fixed termination message/statistic (code registry).
  • Fixed finalizing the library (corner case).
  • Fixed code portability of DNN domain.
