github OpenMathLib/OpenBLAS v0.3.33
OpenBLAS 0.3.33 version

12 hours ago

general:

  • fixed an incorrect cast in the SBGEMM test case that could lead to spurious test failures
  • fixed an invalid memory access in the converted C version of the CBLAS tests
  • made the BIGNUMA setting automatic when the number of cores exceeds 256
  • imported recent updates from Reference-LAPACK to realign with its upcoming 3.13.0 release:
    • Implement ?LARF1F and ?ORM2R (Reference-LAPACK PRs 1019,1020,1196,1257)
    • Change loop order in ?GETC2 to improve performance (Reference-LAPACK PR 1023)
    • Change WORK array dimension in ?GELQS/?GEQRS (Reference-LAPACK PR 1094)
    • Add NaN checks for input matrix A in ?GEEV (Reference-LAPACK PR 1136)
    • Fix support for jobu/v in LAPACKE_?GESVDQ_WORK (Reference-LAPACK PRs 1146,1221)
    • Fix display of version number in LAPACK testsuite (Reference-LAPACK PR 1149)
    • Fix DGGES test seed to avoid bad matrix cases (Reference-LAPACK PR 1187)
    • Fix truncation of large WORK array sizes in ZHE (Reference-LAPACK PR 1195)
    • Fix overwriting of LDSWORK parameter in ?TRSYL3 (Reference-LAPACK PR 1206)
    • Fix overwriting of error states in some EIG tests (Reference-LAPACK PR 1207)
    • Remove unused parameter in DORBDB3/ZUNBDB3 (Reference-LAPACK PR 1209)
    • Re-enable testing of ?BB and ?GG driver functions (Reference-LAPACK PR 1211)
    • Fix workspace size calculation in ?TGSEN (Reference-LAPACK PR 774)
    • Fix typos in the EIG DMD tests and initialize the cutoff variable (Reference-LAPACK PRs 1212,1228)
    • Optimize looping in ?LACPY/?LASCL/?LANTR with fat matrix and UPLO=L (Reference-LAPACK PR 1251)

arm64:

  • worked around a serious miscompilation of the DDOT kernel by GCC 15 (affecting
    most non-SVE targets, and SVE targets in the case of non-unit array stride)
  • fixed an accuracy issue in the GEMV kernel for Neoverse V1 and other SVE targets
  • fixed broken STRMM and SSYMM in DYNAMIC_ARCH builds when running on non-SME hardware
  • added an optimized SHGEMM kernel for Neoverse N2
  • fixed DYNAMIC_ARCH builds under Windows on Arm
  • added autodetection of Cortex A75/A76 in DYNAMIC_ARCH builds
  • added autodetection of Neoverse V3, currently supported through V2 kernels
  • re-added support for the "VORTEX" target in DYNAMIC_ARCH builds with DYNAMIC_LIST
  • fixed CMake-based builds that use the "Ninja" generator
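
The DDOT item above mentions non-unit array strides. As a rough illustration of what that means, the sketch below computes a dot product with BLAS-style increments in plain Python. It is not the OpenBLAS kernel; the name `ref_ddot` is hypothetical, and the code only follows the usual reference-BLAS convention that a negative increment starts traversal from the far end of the array. A result from a library build could be compared against it as a sanity check.

```python
def ref_ddot(n, x, incx, y, incy):
    """Reference dot product with BLAS-style increments (0-based indexing).

    Hypothetical helper for cross-checking; not part of OpenBLAS.
    """
    if n <= 0:
        return 0.0
    # Reference-BLAS convention: with a negative increment, start at the
    # far end so that the same n elements are visited in reverse order.
    ix = 0 if incx >= 0 else (1 - n) * incx
    iy = 0 if incy >= 0 else (1 - n) * incy
    total = 0.0
    for _ in range(n):
        total += x[ix] * y[iy]
        ix += incx
        iy += incy
    return total


# Every second element of x (stride 2) against contiguous y (stride 1).
x = [1.0, 99.0, 2.0, 99.0, 3.0]
y = [4.0, 5.0, 6.0]
strided = ref_ddot(3, x, 2, y, 1)    # 1*4 + 2*5 + 3*6
reversed_y = ref_ddot(3, x, 2, y, -1)  # 1*6 + 2*5 + 3*4
```

The padding values (99.0) in `x` are skipped by the stride, which is the access pattern the item refers to for SVE targets.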

loongarch64:

  • fixed a build failure due to missing support for the new half-precision float type
  • fixed a long-standing bug in asserting 64-bit capability in the c_check helper script

x86_64:

  • added a workaround for miscompilation of the AVX512 GEMM kernels by LLVM on Windows
  • fixed a build failure in the LAED3 code when compiling with MinGW on Windows
  • fixed CMake-based compilation with the NVIDIA HPC compiler
  • fixed CMake-based builds that use the "Ninja" generator

wasm:

  • added optimized kernels for STRSM and DTRSM

md5sums:
96c5cd9013013faefc294bc57830c77d OpenBLAS-0.3.33.tar.gz
81637d0ac00b6dab6f88988cc35645af OpenBLAS-0.3.33.zip
153b444945694e1b773d2c5e5d2a31b0 OpenBLAS-0.3.33-x86.zip
93022c391fce5298d0576bd25655774b OpenBLAS-0.3.33-x64.zip
e30aab9cfab15a5e0ed4858399ad885a OpenBLAS-0.3.33-x64-64.zip
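
A downloaded archive can be checked against the sums above. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `matches_release` are hypothetical, and the local file path is whatever the archive was saved as.

```python
import hashlib

# Digests copied from the md5sums list above.
EXPECTED_MD5 = {
    "OpenBLAS-0.3.33.tar.gz": "96c5cd9013013faefc294bc57830c77d",
    "OpenBLAS-0.3.33.zip": "81637d0ac00b6dab6f88988cc35645af",
    "OpenBLAS-0.3.33-x86.zip": "153b444945694e1b773d2c5e5d2a31b0",
    "OpenBLAS-0.3.33-x64.zip": "93022c391fce5298d0576bd25655774b",
    "OpenBLAS-0.3.33-x64-64.zip": "e30aab9cfab15a5e0ed4858399ad885a",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_release(path, name):
    """True if the file at `path` has the digest listed for `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

For example, `matches_release("OpenBLAS-0.3.33.tar.gz", "OpenBLAS-0.3.33.tar.gz")` would be expected to return `True` for an intact download saved under that name in the current directory.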
