This release includes several notable updates:
- Added GitLab runners to run our testing code on the GPUs at the University of Oregon.
- Improved triangular solve (complex) performance for special process grids; fixed a bug that caused wrong GPU triangular solve results when nrhs > 1.
- Added a Python interface for 3D factorization and solve.
- Updated the NVSHMEM build script for Perlmutter.
What's Changed
- install .mod files directly to include directory by @minrk in #172
- handle BUILD_STATIC_LIBS in FORTRAN by @minrk in #173
- Add missing declarations in Python bridge header by @nmnobre in #176
- Improvements to the readme file by @nmnobre in #179
- Remove non-prototype declarations of fopen by @nmnobre in #185
- Build system fixes to make HIP dependency work when linking an app against SuperLU_Dist by @japlews in #187
- cmake: let user over-ride CMAKE_CXX_STANDARD option by @balay in #191
- delete CMakeLists.txt.orig by @balay in #192
- Minor updates for CUDA 13 compatibility by @nmnobre in #193
- fix get_acc_offload and get_acc_solve by @tukss in #196
New Contributors
- @nmnobre made their first contribution in #176
- @japlews made their first contribution in #187
- @tukss made their first contribution in #196
Full Changelog: v9.1.0...v9.2.0