github amd/blis 5.2
AOCL 5.2 GA Release

AOCL-BLAS 5.2 Release Notes

Overview

This release includes significant performance improvements, new features, and critical bug fixes for the AOCL-BLAS linear algebra library, with optimizations targeting the AMD Zen4 and Zen5 architectures.


Performance Improvements

GEMM Improvements

  • Tuned ZGEMM thresholds for Zen4 and Zen5 architectures
  • Optimized AVX512 ZGEMM kernel and edge-case handling
  • Improved ZGEMM packing kernel for M-dimension edge cases
  • Added optimal thread-selection logic for ZGEMM on Zen5

GEMV Enhancements

  • Added DGEMV no-transpose multithreaded implementations
  • Exported AVX512 DGEMV kernels
  • Fixed DGEMV bugs and cleaned up the code
  • Added support for non-unit incx in the GEMV transpose kernel
  • Improved numerical precision in the ZGEMV API
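
To make the non-unit incx item concrete, here is a minimal reference sketch of the transpose GEMV operation, y := alpha * A^T * x + beta * y, with strided vectors. It is plain Python written for illustration, not the AOCL-BLAS kernel; the function name gemv_t and its argument order are assumptions loosely modeled on the BLAS convention, and only positive strides are handled.

```python
def gemv_t(m, n, alpha, a, lda, x, incx, beta, y, incy):
    """Reference y := alpha * A^T * x + beta * y for a column-major m-by-n A.

    x holds m logical elements spaced incx apart; y holds n logical
    elements spaced incy apart. Only positive strides are handled here.
    """
    for j in range(n):
        acc = 0.0
        for i in range(m):
            # A(i, j) lives at a[i + j*lda]; x(i) lives at x[i*incx].
            acc += a[i + j * lda] * x[i * incx]
        y[j * incy] = alpha * acc + beta * y[j * incy]
    return y


# 3-by-2 matrix [[1, 2], [3, 4], [5, 6]] stored column by column.
a = [1.0, 3.0, 5.0, 2.0, 4.0, 6.0]
# Logical x = [1, 1, 1] stored with incx = 2; the 99.0 entries are skipped.
x = [1.0, 99.0, 1.0, 99.0, 1.0]
result = gemv_t(3, 2, 1.0, a, 3, x, 2, 0.0, [0.0, 0.0], 1)
print(result)  # [9.0, 12.0]
```

With incx = 2 the kernel reads every other element of x, which is the access pattern the transpose kernel now supports.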

DCOPY Optimization

  • Tuned DCOPY aocl_dynamic logic for Zen4/Zen5 architectures

New Features

  • New build options to disable the optimized code paths for small matrices in GEMM and TRSM
    • Useful for testing and benchmarking
    • Reduces numerical rounding differences when the same calculation is repeated with different core counts
  • Implemented the complete set of GEMMTR APIs
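
For readers unfamiliar with GEMMTR: in the reference BLAS it is a GEMM variant that computes C := alpha*op(A)*op(B) + beta*C but updates only the upper or lower triangle of the square matrix C. The sketch below is a plain Python reference for the no-transpose case only. It is not the AOCL-BLAS interface; the name gemmtr_ref and its argument order are illustrative assumptions.

```python
def gemmtr_ref(uplo, n, k, alpha, a, lda, b, ldb, beta, c, ldc):
    """C := alpha*A*B + beta*C on one triangle of the n-by-n matrix C.

    A is n-by-k and B is k-by-n, both column-major and not transposed.
    uplo is 'L' (lower) or 'U' (upper); the other triangle is left untouched.
    """
    for j in range(n):
        # Lower triangle: rows j..n-1 of column j. Upper: rows 0..j.
        rows = range(j, n) if uplo == "L" else range(0, j + 1)
        for i in rows:
            acc = 0.0
            for p in range(k):
                acc += a[i + p * lda] * b[p + j * ldb]
            c[i + j * ldc] = alpha * acc + beta * c[i + j * ldc]
    return c


# A is 2-by-1, B is 1-by-2, so A*B = [[3, 4], [6, 8]].
c = gemmtr_ref("L", 2, 1, 1.0, [1.0, 2.0], 2, [3.0, 4.0], 1, 0.0,
               [-1.0, -1.0, -1.0, -1.0], 2)
print(c)  # [3.0, 6.0, -1.0, 8.0]; the strictly upper entry keeps its -1.0
```

Skipping one triangle roughly halves the work when the product is known to be symmetric or only one triangle is needed downstream.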

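The rounding differences mentioned for the GEMM and TRSM build options above come from a general property of floating-point arithmetic: addition is not associative, so splitting the same reduction across a different number of workers can change the result. The sketch below illustrates that effect with a chunked sum in plain Python. It is a toy model, not how AOCL-BLAS partitions work; chunked_sum and the input values are made up for the demonstration.

```python
def chunked_sum(values, num_chunks):
    """Sum values by splitting them into contiguous chunks, one per worker,
    then adding the per-chunk partial sums in order."""
    n = len(values)
    partials = []
    for t in range(num_chunks):
        lo = t * n // num_chunks
        hi = (t + 1) * n // num_chunks
        s = 0.0
        for v in values[lo:hi]:
            s += v
        partials.append(s)
    total = 0.0
    for p in partials:
        total += p
    return total


# The exact sum is 2.0, but 1.0 is lost whenever it is added to +/-1e16.
values = [1e16, 1.0, -1e16, 1.0]
print(chunked_sum(values, 1))  # 1.0
print(chunked_sum(values, 2))  # 0.0
```

Both answers are valid roundings of the same mathematical sum; they differ only because the order of additions changed with the number of chunks.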

Bug Fixes

Critical Fixes

  • Fixed a possible integer overflow in TPSV
  • Fixed ZTRSM accuracy for the conjugate-transpose case
  • Fixed the DTRSM small-matrix threshold for extremely skinny matrices on Zen5
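
The notes do not say where the TPSV overflow occurred. As general background only, TPSV operates on a packed triangular matrix holding n*(n+1)/2 elements, and index arithmetic of that size exceeds a signed 32-bit integer once n reaches 65536. The sketch below shows the arithmetic in plain Python; wrap_int32 and packed_length are hypothetical helpers, and the wrap-around is an illustration (signed overflow in C is undefined behavior, not guaranteed wrapping).

```python
INT32_MAX = 2**31 - 1


def wrap_int32(v):
    """Reduce v to the value a 32-bit two's-complement integer would hold."""
    v &= 0xFFFFFFFF
    return v - 0x100000000 if v >= 0x80000000 else v


def packed_length(n):
    """Number of stored elements in a packed n-by-n triangular matrix."""
    return n * (n + 1) // 2


for n in (65535, 65536):
    exact = packed_length(n)
    print(n, exact, exact <= INT32_MAX, wrap_int32(exact))
# 65535 2147450880 True 2147450880
# 65536 2147516416 False -2147450880
```

Using a 64-bit type for such offsets avoids the problem for any matrix that fits in memory.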

Acknowledgments

This release is the result of contributions from the AOCL team at AMD and the broader BLIS community.


Release Date: January 2026
Version: 5.2 GA
