github xianyi/OpenBLAS v0.3.14
OpenBLAS version 0.3.14

NOTE: This release has a known regression in AVX512 SGEMM.

common:

  • Fixed a race condition on thread shutdown in non-OpenMP builds
  • Fixed custom BUFFERSIZE option getting ignored in gmake builds
  • Fixed CMake compilation of the TRMM kernels for GENERIC platforms
  • Added CBLAS interfaces for CROTG, ZROTG, CSROT and ZDROT
  • Improved performance of OMATCOPY_RT across all platforms
  • Changed perl scripts to use env instead of a hardcoded /usr/bin/perl
  • Fixed potential misreading of the GCC compiler version in the build scripts
  • Fixed convergence problems in LAPACK complex GGEV/GGES (Reference-LAPACK #477)
  • Reduced the stacksize requirements for running the LAPACK testsuite (Reference-LAPACK #335)
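
For readers unfamiliar with the ?ROTG family: CROTG and ZROTG construct a complex Givens rotation, and CSROT and ZDROT apply a rotation with real c and s to a pair of complex vectors. The following is a minimal pure-Python sketch of the construction as defined by the reference BLAS, not OpenBLAS code. The function names complex_givens and apply_rotation are hypothetical and chosen only for illustration; they are not part of any CBLAS interface.

```python
import math


def complex_givens(a: complex, b: complex):
    """Return (c, s, r) with c real and s complex such that
    [ c        s ] [a]   [r]
    [-conj(s)  c ] [b] = [0]
    following the reference BLAS definition of CROTG/ZROTG.
    Illustrative only: no protection against overflow or underflow.
    """
    abs_a = abs(a)
    if abs_a == 0.0:
        # Degenerate case: swap the roles of a and b.
        return 0.0, 1 + 0j, b
    norm = math.hypot(abs_a, abs(b))
    alpha = a / abs_a              # unit-modulus phase of a
    c = abs_a / norm
    s = alpha * b.conjugate() / norm
    r = alpha * norm
    return c, s, r


def apply_rotation(c: float, s: complex, x: complex, y: complex):
    """Apply the rotation built above to the pair (x, y)."""
    return c * x + s * y, -s.conjugate() * x + c * y


if __name__ == "__main__":
    a, b = 3 + 4j, 1 - 2j
    c, s, r = complex_givens(a, b)
    print(apply_rotation(c, s, a, b))  # second component is numerically zero
```

Applying the rotation to the pair (a, b) it was built from yields (r, 0) up to rounding, and c*c + |s|^2 equals 1.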

RISC-V:

  • Fixed compilation on RISC-V (missing entry in getarch)

POWER:

  • Fixed compilation for DYNAMIC_ARCH with clang and with older gcc versions
  • Added support for compilation on FreeBSD/ppc64le
  • Added optimized POWER10 kernels for SSCAL, DSCAL, CSCAL, ZSCAL
  • Added optimized POWER10 kernels for SROT, DROT, CDOT, SASUM, DASUM
  • Improved SSWAP, DSWAP, CSWAP, ZSWAP performance on POWER10
  • Improved SCOPY and CCOPY performance on POWER10
  • Improved SGEMM and DGEMM performance on POWER10
  • Added support for compilation with the NVIDIA HPC compiler

x86_64:

  • Added an optimized bfloat16 GEMM kernel for Cooperlake
  • Added CPUID autodetection for Intel Rocket Lake and Tiger Lake CPUs
  • Improved the performance of SASUM, DASUM, SROT, DROT on AMD Ryzen CPUs
  • Added support for compilation with the NAG Fortran compiler
  • Fixed recognition of the AMD AOCC compiler
  • Fixed compilation for DYNAMIC_ARCH with clang on Windows
  • Added support for running the BLAS/CBLAS tests on Windows
  • Fixed signatures of the tls callback functions for Windows x64
  • Fixed various issues in the handling of FMA intrinsics support

ARM:

  • Added support for compilation for embedded Cortex-M4 targets via a new option EMBEDDED

ARM64:

  • Fixed the THUNDERX2T99 and NEOVERSEN1 DNRM2/ZNRM2 kernels for inputs with Inf
  • Added support for the DYNAMIC_LIST option
  • Added support for compilation with the NVIDIA HPC compiler
  • Added support for compilation with the NAG Fortran compiler
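
The DNRM2/ZNRM2 entry above concerns inputs that contain Inf. The notes do not say how the affected kernels misbehaved, so the sketch below only illustrates the expected result: the 2-norm of a vector containing Inf (and no NaN) is Inf. It is a pure-Python scaled sum-of-squares loop in the style of the reference BLAS, not the OpenBLAS kernel, and the function name nrm2 and its explicit Inf/NaN guards are assumptions made for illustration.

```python
import math


def nrm2(values):
    """Euclidean norm of real or complex values using a scaled sum of squares.

    Without the explicit guards below, an Inf entry would lead to
    inf / inf = NaN inside the scaling step.
    """
    scale = 0.0
    ssq = 1.0
    saw_inf = False
    for v in values:
        a = abs(v)                 # modulus, so complex inputs work too
        if math.isnan(a):
            return math.nan        # NaN propagates
        if math.isinf(a):
            saw_inf = True         # remember Inf, keep scanning for NaN
            continue
        if a == 0.0:
            continue
        if a > scale:
            # Rescale the running sum to the new largest magnitude.
            ssq = 1.0 + ssq * (scale / a) ** 2
            scale = a
        else:
            ssq += (a / scale) ** 2
    if saw_inf:
        return math.inf
    return scale * math.sqrt(ssq)


if __name__ == "__main__":
    print(nrm2([3.0, 4.0]))
    print(nrm2([1.0, math.inf, 2.0]))
```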

md5sum:
a5aa1d61d4b27f471dc60c40c11e61fe OpenBLAS-0.3.14.tar.gz
f8fe13f5ebf9c4c487784f4e6a7b1a56 OpenBLAS-0.3.14.zip
