github libxsmm/libxsmm 1.0
Version 1.0

8 years ago

This release completes a major refactoring of our library backend and introduces additional capabilities in the frontend (interface). The major update is Just-In-Time (JIT) code generation, i.e., the ability to “compile” matrix-multiplication kernels at application run-time. This is achieved with our reworked code generator, which emits machine code directly into an executable buffer. Although any missing kernel can now be generated automatically, there is almost no additional overhead: the routines in our "CP2K collection" of 386 kernels show a slowdown of only ~3% on average. At the same time, LIBXSMM outperforms the Intel MKL counterparts by ~2x (MKL_DIRECT_CALL) and the inlinable code generated by the Intel Compiler (ICC) by ~1.5x (both averaged over the same 386 kernels). Please consult the README for further details on how to use JIT compilation.
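To make "emitting machine code into an executable buffer" concrete, here is a minimal sketch of the general technique. It is not LIBXSMM's code generator: the names `jit_fn`, `jit_emit_constant`, and `jit_release` are hypothetical, the sketch assumes a POSIX system and an x86 target, and the generated "kernel" merely returns a constant instead of multiplying matrices.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc in strict modes */
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical signature of the generated function. */
typedef int (*jit_fn)(void);

enum { JIT_CODE_SIZE = 6 };

/* Emits "mov eax, value; ret" into a fresh buffer and makes it executable.
 * Returns NULL on failure or on targets other than x86 (assumption). */
jit_fn jit_emit_constant(int32_t value) {
#if defined(__x86_64__) || defined(__i386__)
  unsigned char code[JIT_CODE_SIZE] = { 0xB8, 0, 0, 0, 0, 0xC3 };
  void* buffer;
  /* The 32-bit immediate follows the opcode in little-endian order. */
  memcpy(code + 1, &value, sizeof(value));
  /* Step 1: obtain writable (not yet executable) memory. */
  buffer = mmap(NULL, JIT_CODE_SIZE, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (buffer == MAP_FAILED) return NULL;
  /* Step 2: copy the encoded instructions into the buffer. */
  memcpy(buffer, code, JIT_CODE_SIZE);
  /* Step 3: switch the page from writable to executable. */
  if (mprotect(buffer, JIT_CODE_SIZE, PROT_READ | PROT_EXEC) != 0) {
    munmap(buffer, JIT_CODE_SIZE);
    return NULL;
  }
  return (jit_fn)buffer;
#else
  (void)value;
  return NULL;
#endif
}

/* Releases the buffer that backs a generated function. */
void jit_release(jit_fn function) {
  if (function != NULL) munmap((void*)function, JIT_CODE_SIZE);
}
```

A caller would invoke `jit_emit_constant(42)`, check the result for NULL, call the returned function pointer like any other function, and finally pass it to `jit_release`. Mapping the buffer writable first and executable afterwards avoids pages that are writable and executable at the same time, which some systems reject.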

In addition, we have reimplemented our code dispatch mechanism in order to prepare LIBXSMM for a full xGEMM interface: the assembly kernel is selected via a hash table keyed by a CRC32 checksum over an argument structure that already covers all xGEMM arguments. On processors with Intel SSE 4.2, the checksum calculation is accelerated using CRC32 instructions (which are available on KNL as well). Over the course of the next minor releases, we will bring JIT compilation out of its experimental state (adjusting code cache eviction, resource cleanup, and portability).
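The dispatch scheme described above can be sketched as follows. This is an illustration under assumptions, not LIBXSMM's implementation: the descriptor layout `gemm_desc`, the table size, the linear probing, and all function names are hypothetical. The checksum is a portable bitwise CRC-32C (the polynomial that the SSE 4.2 CRC32 instruction implements), standing in for the hardware-accelerated path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical argument descriptor. It uses only 32-bit fields, so it has
 * no padding bytes and can be hashed and compared byte-wise. */
typedef struct {
  int32_t flags;          /* e.g., transpose and alpha/beta variants */
  int32_t m, n, k;        /* matrix shape */
  int32_t lda, ldb, ldc;  /* leading dimensions */
  int32_t prefetch;       /* reserved for illustration */
} gemm_desc;

/* Hypothetical kernel signature. */
typedef void (*gemm_kernel)(const double* a, const double* b, double* c);

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
uint32_t crc32c(const void* data, size_t size) {
  const unsigned char* bytes = (const unsigned char*)data;
  uint32_t crc = 0xFFFFFFFFu;
  while (size--) {
    int bit;
    crc ^= *bytes++;
    for (bit = 0; bit < 8; ++bit) {
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
  }
  return ~crc;
}

enum { TABLE_SIZE = 1024 }; /* power of two, so masking replaces modulo */

typedef struct { gemm_desc key; gemm_kernel kernel; } table_entry;
static table_entry table[TABLE_SIZE]; /* zero-initialized: all slots empty */

/* Returns the kernel registered for desc, or NULL if there is none. */
gemm_kernel dispatch_find(const gemm_desc* desc) {
  uint32_t i = crc32c(desc, sizeof(*desc)) & (TABLE_SIZE - 1);
  int probe;
  for (probe = 0; probe < TABLE_SIZE; ++probe) {
    if (table[i].kernel == NULL) return NULL; /* empty slot ends the chain */
    if (memcmp(&table[i].key, desc, sizeof(*desc)) == 0) {
      return table[i].kernel;
    }
    i = (i + 1) & (TABLE_SIZE - 1); /* linear probing on collision */
  }
  return NULL;
}

/* Stores kernel under desc; returns 1 on success, 0 if the table is full. */
int dispatch_register(const gemm_desc* desc, gemm_kernel kernel) {
  uint32_t i = crc32c(desc, sizeof(*desc)) & (TABLE_SIZE - 1);
  int probe;
  assert(kernel != NULL);
  for (probe = 0; probe < TABLE_SIZE; ++probe) {
    if (table[i].kernel == NULL ||
        memcmp(&table[i].key, desc, sizeof(*desc)) == 0) {
      table[i].key = *desc;
      table[i].kernel = kernel;
      return 1;
    }
    i = (i + 1) & (TABLE_SIZE - 1);
  }
  return 0;
}

/* Example kernel: C += A * B for column-major 2x2 matrices. */
void kernel_2x2x2(const double* a, const double* b, double* c) {
  int i, j, p;
  for (j = 0; j < 2; ++j)
    for (i = 0; i < 2; ++i)
      for (p = 0; p < 2; ++p)
        c[j * 2 + i] += a[p * 2 + i] * b[j * 2 + p];
}
```

In a JIT setting, a NULL result from `dispatch_find` would be the point at which a missing kernel is generated and then stored with `dispatch_register`. Comparing the full descriptor after the hash lookup guards against two different argument sets sharing a checksum.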
