github libxsmm/libxsmm 1.6.2
Version 1.6.2

7 years ago

This is a maintenance release that focuses (again) on the DNN API. It also includes bug fixes for a number of severe issues found in various domains (SMM, DNN, SPMDM, and the library in general).

INTRODUCED

  • Documented header-only implementation of LIBXSMM
  • DNN: introduced routine to check code generation (libxsmm_dnn_get_codegen_success)
  • DNN: introduced routine for explicit transpose (libxsmm_dnn_transpose_filter)
  • DNN: introduced routine to query the number of tasks (libxsmm_dnn_get_parallel_tasks)
  • DNN: support external filter reduction in case of parallelization over the minibatch
  • MEM: exposed routine to query size of buffer allocated by libxsmm_[aligned_]malloc
  • SPMDM: introduced support for beta, code optimizations
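
As a reminder of what "beta" usually means in a matrix-multiply interface, the sketch below shows the BLAS-style update C = alpha * A * B + beta * C on small dense, row-major matrices. It assumes that the SPMDM entry above refers to this convention; gemm_ref is a hypothetical helper written for illustration and is not a LIBXSMM routine.

```c
#include <stddef.h>

/* Reference update C = alpha * A * B + beta * C (row-major storage).
 * A is m x k, B is k x n, C is m x n.
 * Hypothetical helper for illustration only, not part of LIBXSMM. */
void gemm_ref(size_t m, size_t n, size_t k, float alpha,
              const float* a, const float* b, float beta, float* c)
{
  size_t i, j, p;
  for (i = 0; i < m; ++i) {
    for (j = 0; j < n; ++j) {
      float acc = 0.0f;
      for (p = 0; p < k; ++p) {
        acc += a[i * k + p] * b[p * n + j];
      }
      /* beta = 0 overwrites C (old values are not read),
       * beta = 1 accumulates into the existing C */
      c[i * n + j] = alpha * acc
                   + (0.0f != beta ? beta * c[i * n + j] : 0.0f);
    }
  }
}
```

With beta = 1 the product is added to whatever C already holds; with beta = 0 the previous content of C is ignored, so C may be uninitialized on entry.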

CHANGES

  • SPMDM: improved static code path selection (no CPUID dispatch)
  • SMM: raised THRESHOLD up to which JIT code is automatically generated
  • Raised baseline code path to SSE4.2 to avoid CPUID-dispatched CRC32;
    fixed (again) controlling the static code path according to documentation
  • Adjusted separation between gen-library and main library
  • MEM/debug: checksum for internal bookkeeping structure
  • MEM: streamlined internal bookkeeping structures
  • Improved reliability of library initialization
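
The MEM entries (size query, bookkeeping structure, checksum) can be pictured with a generic aligned allocator that keeps a small record in front of each buffer. This is a minimal sketch under stated assumptions: every name (demo_info, demo_aligned_malloc, demo_get_size, demo_free) is hypothetical, and the layout and checksum are chosen for illustration rather than taken from LIBXSMM.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bookkeeping record stored directly in front of the buffer. */
typedef struct demo_info {
  void* base;      /* pointer returned by malloc (needed for free) */
  size_t size;     /* size requested by the caller */
  uintptr_t check; /* checksum over the two fields above */
} demo_info;

static uintptr_t demo_checksum(const void* base, size_t size) {
  return (uintptr_t)base ^ (uintptr_t)size ^ (uintptr_t)0x5A5A5A5Au;
}

/* Allocates size bytes aligned to alignment (must be a power of two). */
void* demo_aligned_malloc(size_t size, size_t alignment) {
  demo_info info;
  uintptr_t user;
  char* base;
  if (0 == alignment || 0 != (alignment & (alignment - 1))) return NULL;
  base = (char*)malloc(size + sizeof(info) + alignment - 1);
  if (NULL == base) return NULL;
  /* leave room for the record, then round up to the alignment */
  user = ((uintptr_t)base + sizeof(info) + alignment - 1)
       & ~(uintptr_t)(alignment - 1);
  info.base = base;
  info.size = size;
  info.check = demo_checksum(base, size);
  /* memcpy avoids assumptions about the alignment of the record */
  memcpy((char*)user - sizeof(info), &info, sizeof(info));
  return (void*)user;
}

/* Stores the buffer size and returns 0 if the record is intact, else -1. */
int demo_get_size(const void* ptr, size_t* size) {
  demo_info info;
  if (NULL == ptr || NULL == size) return -1;
  memcpy(&info, (const char*)ptr - sizeof(info), sizeof(info));
  if (info.check != demo_checksum(info.base, info.size)) return -1;
  *size = info.size;
  return 0;
}

/* Releases the buffer; a corrupted record is left alone rather than freed. */
void demo_free(void* ptr) {
  demo_info info;
  if (NULL == ptr) return;
  memcpy(&info, (char*)ptr - sizeof(info), sizeof(info));
  if (info.check == demo_checksum(info.base, info.size)) free(info.base);
}
```

The checksum lets a debug build notice when a buffer underrun has overwritten the record before the stored base pointer is trusted.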

FIXES

  • SMM: possibly wrong code version under concurrent dispatch in case of a hash key collision
  • DNN: raised/fixed weight update performance to the expected level (AVX-512)
  • DNN: fixed a bug which was introduced by code refactoring (fwd. convolution)
  • DNN: fixed bug in Deepbench and refactored backward convolution code
  • DNN: corrected setting up the handle for the weight update convolution
  • MEM: fixed kernel-dump related console output (print correct address)
  • Avoid certain (pseudo-)AVX-512 intrinsics, which might not be present (GCC)
  • Avoid AVX-512/Core intrinsics prior to Clang 3.8 (3.9 brings them in)
  • Avoid applying AVX-512/Core flags with earlier versions of Clang (IDEs)
  • Updated C++ entry points for code dispatch (remainder of issue #105);
    this change fixed a performance issue with the CP2K/intel branch
  • SPMDM: fixed issue when N is not a multiple of 16
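
The last entry concerns sizes that do not divide evenly by the block width. A 512-bit vector holds 16 single-precision values, which is presumably where the 16 comes from. The release note does not say what went wrong, so the sketch below only shows the general pattern of a blocked loop with a scalar tail; axpy_blocked is a hypothetical example and not SPMDM code.

```c
#include <stddef.h>

#define DEMO_BLOCK 16 /* e.g., single-precision lanes of a 512-bit vector */

/* y[i] += alpha * x[i] for all i in [0, n).
 * Full blocks of 16 are handled by the main loop (a vectorization
 * candidate); the remaining n % 16 elements by a scalar tail loop.
 * Hypothetical example, not taken from LIBXSMM. */
void axpy_blocked(size_t n, float alpha, const float* x, float* y)
{
  const size_t nblocked = n - (n % DEMO_BLOCK);
  size_t i, j;
  for (i = 0; i < nblocked; i += DEMO_BLOCK) {
    for (j = 0; j < DEMO_BLOCK; ++j) {
      y[i + j] += alpha * x[i + j];
    }
  }
  /* remainder: without this loop the last n % 16 elements stay untouched */
  for (i = nblocked; i < n; ++i) {
    y[i] += alpha * x[i];
  }
}
```

Testing with a size such as 19 exercises both the blocked part and the tail, which a multiple-of-16 test size would not.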
