rocBLAS 2.46.0 for ROCm 5.4.0

23 months ago

Added

  • Added a client smoke test dataset for quick validation. Run it with the command rocblas-test --yaml rocblas_smoke.yaml
  • Added stream-ordered device memory allocation as a non-default beta option.

Optimized

  • Improved trsm performance for small sizes by using a substitution method
  • Improved syr2k and her2k performance significantly by using a block-recursive algorithm
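
As a rough illustration of what a substitution method means for a triangular solve, the sketch below solves L X = alpha B by forward substitution for a lower triangular, non-unit-diagonal L applied from the left. It is plain Python for exposition only and is not the rocBLAS kernel; the function name trsm_lower_left is hypothetical.

```python
def trsm_lower_left(alpha, L, B):
    """Solve L @ X = alpha * B for X by forward substitution.

    L is an m x m lower triangular matrix with a non-zero diagonal,
    B is an m x n right-hand side; both are lists of rows.
    Illustrative sketch only, not the rocBLAS implementation.
    """
    m = len(L)
    n = len(B[0])
    # Start from the scaled right-hand side.
    X = [[alpha * B[i][j] for j in range(n)] for i in range(m)]
    for i in range(m):
        for j in range(n):
            s = X[i][j]
            # Subtract contributions of the rows that are already solved.
            for k in range(i):
                s -= L[i][k] * X[k][j]
            X[i][j] = s / L[i][i]
    return X


L = [[2, 0],
     [1, 4]]
B = [[2, 4],
     [9, 6]]
X = trsm_lower_left(1, L, B)
```

Each row of X depends only on the rows above it, which is why the method suits small problems where the serial dependency chain is short.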

Changed

  • Level 2, Level 1, and Extension functions: when the handle is set to rocblas_pointer_mode_host, argument checking now returns rocblas_status_invalid_pointer only for pointers that must be dereferenced given the alpha and beta argument values. With rocblas_pointer_mode_device, only pointers that are always dereferenced regardless of the alpha and beta values are checked, so only those can produce a return status of rocblas_status_invalid_pointer. This improves consistency with legacy BLAS behaviour.
  • Added a variable to turn ieee16/ieee32 tests on or off for mixed-precision gemm
  • Allowed hipBLAS to select the int8 datatype
  • Disallowed B == C && ldb != ldc in rocblas_xtrmm_outofplace
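
The pointer-checking rule in the first item above can be modelled roughly as follows. This is a sketch under stated assumptions, not rocBLAS source: PointerArg, check_pointers, and the classification of each argument as "always" or "conditionally" dereferenced are hypothetical. Only the two status names come from the release note. None stands in for a null pointer.

```python
from dataclasses import dataclass
from typing import Callable, Optional

INVALID_POINTER = "rocblas_status_invalid_pointer"
SUCCESS = "rocblas_status_success"


@dataclass
class PointerArg:
    value: Optional[object]  # None models a null pointer
    # Predicate on (alpha, beta) telling whether the pointer is dereferenced.
    # None means the pointer is always dereferenced.
    needed_if: Optional[Callable[[float, float], bool]] = None


def check_pointers(host_mode, alpha, beta, args):
    """Return a status string following the rule described above."""
    for arg in args:
        if arg.needed_if is None:
            must_check = True  # always dereferenced: checked in both modes
        elif host_mode:
            # Host pointer mode: alpha and beta are readable on the host,
            # so check only what these values actually require.
            must_check = arg.needed_if(alpha, beta)
        else:
            # Device pointer mode: scalar values are not known on the host,
            # so conditionally dereferenced pointers are not checked.
            must_check = False
        if must_check and arg.value is None:
            return INVALID_POINTER
    return SUCCESS


# Hypothetical operation: x is read only when alpha != 0, y is always used.
x_null = PointerArg(None, needed_if=lambda alpha, beta: alpha != 0)
y_ok = PointerArg([0.0])

host_alpha_zero = check_pointers(True, 0.0, 1.0, [x_null, y_ok])
host_alpha_two = check_pointers(True, 2.0, 1.0, [x_null, y_ok])
device_mode = check_pointers(False, None, None, [x_null, y_ok])
device_null_y = check_pointers(False, None, None, [x_null, PointerArg(None)])
```

In this model a null x is accepted in host mode when alpha is zero, rejected when alpha is non-zero, and never inspected in device mode, while a null pointer that is always dereferenced is rejected in both modes.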

Fixed

  • Generalized the FORTRAN interfaces for FORTRAN compilers other than gfortran
  • Fixed trsm_strided_batched performance gathering in rocblas-bench
  • Fixed the rocm-smi path in the commandrunner.py script to match ROCm 5.2 and above
