github amd/blis 2.2
AMD Optimized BLIS Version 2.2

3 years ago

Highlights of improvements on AMD EPYC™ processor family CPUs

  • Improved performance for Level-1 BLAS routines for single and double precision.
  • Improved performance of SGEMV and DGEMV for large sizes.
  • Enabled small unpacked (SUP) GEMM kernels for single-precision and double-precision complex (C, Z) GEMM.
  • Enabled multi-threaded small unpacked (SUP) GEMM kernels for (S, D, C, Z) GEMM, improving performance for small and skinny matrices.
  • The GEMM selective packing feature is now multithread-enabled. Selective packing packs matrix A, matrix B, or both, and is enabled by setting an environment variable. Refer to the AOCL User Guide at https://developer.amd.com/amd-aocl/ for details.
  • Improved single-thread and multi-thread TRSM performance for large and skinny matrices.
  • Enabled debug trace and logging features.
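
To make the "packed" versus "unpacked" distinction in the GEMM items above concrete, here is a minimal sketch in plain C. It is not BLIS code and does not use the BLIS API: the function names (gemm_unpacked, pack_b, gemm_packed_b), the row-major layout, and the caller-supplied workspace are all assumptions made for illustration. Packing here simply means copying a strided block of B into a contiguous buffer before multiplying; an unpacked path reads B in place and skips that copy, which tends to pay off when the matrices are small or skinny.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical reference GEMM on row-major data:
 * C (m x n) += A (m x k) * B (k x n).
 * lda, ldb, ldc are row strides. B is read directly from its strided
 * storage, i.e. the "unpacked" case. */
static void gemm_unpacked(size_t m, size_t n, size_t k,
                          const double *a, size_t lda,
                          const double *b, size_t ldb,
                          double *c, size_t ldc)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t p = 0; p < k; ++p) {
            const double aip = a[i * lda + p];
            for (size_t j = 0; j < n; ++j) {
                c[i * ldc + j] += aip * b[p * ldb + j];
            }
        }
    }
}

/* Hypothetical packing step: copy the k x n block of B into a
 * contiguous buffer whose row stride is exactly n. */
static void pack_b(size_t k, size_t n,
                   const double *b, size_t ldb, double *packed)
{
    for (size_t p = 0; p < k; ++p) {
        memcpy(packed + p * n, b + p * ldb, n * sizeof *packed);
    }
}

/* Same product as gemm_unpacked, but B is packed first.
 * work must hold at least k * n doubles (caller-supplied). */
static void gemm_packed_b(size_t m, size_t n, size_t k,
                          const double *a, size_t lda,
                          const double *b, size_t ldb,
                          double *c, size_t ldc,
                          double *work)
{
    pack_b(k, n, b, ldb, work);
    gemm_unpacked(m, n, k, a, lda, work, n, c, ldc);
}
```

Both paths compute the same result; the only difference is whether the extra copy is made. A tuned library packs into a layout matched to its kernels, which this sketch does not attempt to reproduce.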
