github libxsmm/libxsmm 1.16
Version 1.16

3 years ago

This maintenance release captures the project's continuous development in a stable, validated release. It lets our users benefit from several improvements and fixes (see below), especially in light of upcoming new features.

THANK YOU FOR YOUR CONTRIBUTION - your contribution matters! This project received several contributions, whether as a pull request, issue report, feature suggestion, or informal inquiry. We would like to thank you for the effort and time you spent on Open Source software!

INTRODUCED

  • Zero-config for all platforms with absolutely no configuration needed for header-only. Simplifies using Visual Studio as no up-front configuration or in-build custom steps are needed. Simplifies 3rd-party build systems incorporating LIBXSMM for both header-only and classic ABI.
  • Updated Hello LIBXSMM and added code examples for C/C++ and Fortran; included minimal "support" for Bazel (request). The latter is not meant to change our Makefile-based build setup, but can help people who prefer Bazel to get started.
  • Fortran interface for user-data dispatch and a Fortran code sample using this interface to dispatch multiple kernels at once. The C interface was introduced earlier (v1.15).
  • Experimental: element-wise kernels with matrix elements (meltw), e.g., to scale, reduce, type-convert, etc.
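
To illustrate what the element-wise operations named above (scale, reduce, type-convert) compute, here is a minimal plain-C reference sketch. It does not use the LIBXSMM API: the function names (`ref_scale`, `ref_reduce_columns`, `ref_convert_f64_f32`), the column-major layout, and the leading-dimension parameters are assumptions made for illustration only. The experimental meltw kernels may expose a different interface and semantics.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference helpers (NOT part of the LIBXSMM API).
 * Assumption: matrices are column-major with leading dimension ld >= m. */

/* Scale: multiply every element of an m-by-n matrix by a factor (in place). */
void ref_scale(float* a, size_t m, size_t n, size_t ld, float factor)
{
  size_t i, j;
  assert(ld >= m);
  for (j = 0; j < n; ++j) {
    for (i = 0; i < m; ++i) a[j * ld + i] *= factor;
  }
}

/* Reduce: sum the elements of each column, i.e., out[j] = sum over i of a(i,j). */
void ref_reduce_columns(const float* a, size_t m, size_t n, size_t ld, float* out)
{
  size_t i, j;
  assert(ld >= m);
  for (j = 0; j < n; ++j) {
    float sum = 0.0f;
    for (i = 0; i < m; ++i) sum += a[j * ld + i];
    out[j] = sum;
  }
}

/* Type-convert: double precision to single precision, element by element. */
void ref_convert_f64_f32(const double* in, float* out,
                         size_t m, size_t n, size_t ldi, size_t ldo)
{
  size_t i, j;
  assert(ldi >= m && ldo >= m);
  for (j = 0; j < n; ++j) {
    for (i = 0; i < m; ++i) out[j * ldo + i] = (float)in[j * ldi + i];
  }
}
```

Such scalar loops can serve as a correctness baseline when comparing against a generated kernel; they are not meant to be fast.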

IMPROVEMENTS / CHANGES

  • Extended [list of applications](https://libxsmm.readthedocs.io/#applications and projects) using LIBXSMM. Our documentation also lists applications among popular categories (at the bottom of the left-hand side menu).
  • Fixed performance bug in matcopy routine; added microbenchmarks.
  • Improved verbose output (watermarks, additional warnings).
  • Disabled memory wrapper at compile-time (opt-in only).
  • Fully moved to Python3 shebang (fallback to Python2).
  • Improved Fortran interface (overloads, etc.).
  • Further improved support for GNU GCC 10.
  • Extended sparse functionality.
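
For readers unfamiliar with the matcopy item above, the following plain-C sketch shows what a matrix copy with separate input and output leading dimensions computes. It is a hypothetical reference (`ref_matcopy` is not the LIBXSMM routine), and it assumes a column-major layout with leading dimensions given in elements. A loop like this can act as a baseline when checking or timing an optimized copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reference (NOT the LIBXSMM implementation): copy an m-by-n
 * matrix whose elements are typesize bytes each.
 * Assumption: column-major storage; ldi/ldo are the leading dimensions
 * (in elements) of the input and output, each at least m. */
void ref_matcopy(void* out, const void* in, size_t typesize,
                 size_t m, size_t n, size_t ldi, size_t ldo)
{
  const unsigned char* src = (const unsigned char*)in;
  unsigned char* dst = (unsigned char*)out;
  size_t j;
  assert(ldi >= m && ldo >= m);
  for (j = 0; j < n; ++j) {
    /* Each column is contiguous, so it can be copied with a single memcpy. */
    memcpy(dst + j * ldo * typesize, src + j * ldi * typesize, m * typesize);
  }
}
```

Padding elements beyond the first m rows of each input column are skipped, which is why the input and output leading dimensions may differ.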

FIXES

  • Avoid manipulating GNU's feature flags (improves header-only library).
  • Fixed detecting Intel VTune 2020 (SYM=1 with source'd profiler).
  • Consistently emit unaligned LD/ST (intrinsics based code).
