github amd/blis 4.0
AOCL-BLIS 4.0

19 months ago

Highlights of AOCL-BLIS 4.0

  • The following LPGEMM (Low Precision GEMM) variants are added, each with post-ops support:
    • aocl_gemm_u8s8s32os32 and aocl_gemm_u8s8s32os8, optimized using AVX-512-VNNI
    • aocl_gemm_u8s8s16os16 and aocl_gemm_u8s8s16os8, optimized using AVX2
    • aocl_gemm_bf16bf16f32of32 and aocl_gemm_bf16bf16f32obf16, optimized using AVX-512
  • SGEMM with packed/reorder buffer support (aocl_gemm_f32f32f32of32)
  • AMD “Zen4” support for BLIS
  • Dynamic dispatch supports AMD “Zen4” configuration
  • Optimizations and performance improvements for DGEMM, SGEMM, ZGEMM, DGEMMT, and DTRSM
  • Framework design changes
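
The LPGEMM routine names above appear to encode their data types: in aocl_gemm_u8s8s32os32, for example, the inputs are read as unsigned 8-bit (A) and signed 8-bit (B), accumulation as signed 32-bit, and the output ("o") as signed 32-bit. The sketch below is a plain C reference for that arithmetic only. It does not call AOCL-BLIS, the ref_* names are hypothetical, and the row-major layout, the absence of alpha/beta, and saturation-only narrowing for the os8 case (no scaling post-op) are simplifying assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference kernel, not the AOCL-BLIS API.
 * Computes C = A * B with A (m x k, uint8), B (k x n, int8),
 * int32 accumulation and int32 output. All matrices are assumed
 * to be dense and row-major. */
static void ref_gemm_u8s8s32os32(size_t m, size_t n, size_t k,
                                 const uint8_t *a, const int8_t *b,
                                 int32_t *c)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            int32_t acc = 0;
            for (size_t p = 0; p < k; ++p) {
                /* Widen both operands before multiplying. */
                acc += (int32_t)a[i * k + p] * (int32_t)b[p * n + j];
            }
            c[i * n + j] = acc;
        }
    }
}

/* Clamp a 32-bit accumulator into the int8 range. */
static int8_t ref_saturate_s8(int32_t v)
{
    if (v > INT8_MAX) return INT8_MAX;
    if (v < INT8_MIN) return INT8_MIN;
    return (int8_t)v;
}

/* Hypothetical "os8" counterpart: same int32 accumulation, then the
 * result is narrowed to int8 by saturation. Any scaling that a real
 * post-op might apply before narrowing is omitted here (assumption). */
static void ref_gemm_u8s8s32os8(size_t m, size_t n, size_t k,
                                const uint8_t *a, const int8_t *b,
                                int8_t *c)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            int32_t acc = 0;
            for (size_t p = 0; p < k; ++p) {
                acc += (int32_t)a[i * k + p] * (int32_t)b[p * n + j];
            }
            c[i * n + j] = ref_saturate_s8(acc);
        }
    }
}
```

A reference like this is mainly useful for checking the results of an optimized low-precision kernel on small inputs. The s16 variants would follow the same pattern with a 16-bit accumulator.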
