github libxsmm/libxsmm 0.9.0
Version 0.9

8 years ago

This release settles on the assembly code generator as the default code generation mechanism. The library targets Intel SSE3, AVX, AVX2, IMCI/KNCni, and Intel AVX-512 (foundational) instructions using optimized assembly code. Restrictions on the shape of the generated kernels are relaxed or removed entirely, and the documentation is updated accordingly. The build system now handles an empty code specialization request by generating only an inlinable code path and the BLAS fallback code. The build system also respects the problem size threshold when generating code according to the requested specialization. The former milestone item to report some performance results is addressed in the published documentation. Moreover, additional code samples have been collected, allowing an easier start than the more complex CP2K proxy sample code. The documentation now starts with a Q&A section that answers how to quickly check whether LIBXSMM is beneficial for an application. In short, this release attempts to deliver a stable and complete library according to the former specification, and prepares for upcoming roadmap items such as a full xGEMM interface and other features.
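The relationship between the inlinable code path, the BLAS fallback, and the problem size threshold can be pictured with a small C sketch. It is only an illustration of threshold-based dispatch under stated assumptions, not LIBXSMM's implementation or API: every name (`sketch_gemm`, `sketch_choose_path`, `SKETCH_THRESHOLD`) and the threshold value are hypothetical, and the fallback is a plain loop nest standing in for a real BLAS call. Matrices are assumed to be column-major, as in BLAS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical threshold on m*n*k; the value is chosen for illustration only. */
#define SKETCH_THRESHOLD ((size_t)80 * 80 * 80)

typedef enum { SKETCH_PATH_INLINE, SKETCH_PATH_FALLBACK } sketch_path;

/* Decide which code path a problem of size m x n x k would take. */
static sketch_path sketch_choose_path(size_t m, size_t n, size_t k) {
  return (m * n * k <= SKETCH_THRESHOLD) ? SKETCH_PATH_INLINE
                                         : SKETCH_PATH_FALLBACK;
}

/* Stand-in for the BLAS fallback: column-major C += A * B.
 * A real build would call an optimized xGEMM routine here instead. */
static void sketch_fallback_gemm(size_t m, size_t n, size_t k,
                                 const double *a, const double *b, double *c) {
  for (size_t j = 0; j < n; ++j)
    for (size_t l = 0; l < k; ++l)
      for (size_t i = 0; i < m; ++i)
        c[j * m + i] += a[l * m + i] * b[j * k + l];
}

/* Simple loop nest that a compiler can inline for small problems. */
static inline void sketch_inline_gemm(size_t m, size_t n, size_t k,
                                      const double *a, const double *b,
                                      double *c) {
  for (size_t j = 0; j < n; ++j) {
    for (size_t i = 0; i < m; ++i) {
      double acc = c[j * m + i];
      for (size_t l = 0; l < k; ++l)
        acc += a[l * m + i] * b[j * k + l];
      c[j * m + i] = acc;
    }
  }
}

/* Dispatch on problem size and report which path was taken. */
static sketch_path sketch_gemm(size_t m, size_t n, size_t k,
                               const double *a, const double *b, double *c) {
  assert(a != NULL && b != NULL && c != NULL);
  const sketch_path path = sketch_choose_path(m, n, k);
  if (path == SKETCH_PATH_INLINE)
    sketch_inline_gemm(m, n, k, a, b, c);
  else
    sketch_fallback_gemm(m, n, k, a, b, c);
  return path;
}
```

In this sketch a 2x2x2 multiplication stays on the inlinable path, while a 100x100x100 problem exceeds the assumed threshold and would be routed to the fallback.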
