Additions
- Level 1 and Level 1 Extension functions have an additional ILP64 API for both C and FORTRAN (_64 name suffix) with int64_t function arguments.
- Cache flush timing for gemm_ex.
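
To illustrate the _64 naming convention from the first item above, here is a minimal sketch in plain host-side C. The names example_sscal and example_sscal_64 are hypothetical and are not this library's functions or signatures. They only show how an ILP64 variant might mirror an existing routine, with int64_t in place of int for the size and stride arguments.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical routine with 32-bit size and stride arguments.
 * Scales n elements of x, spaced incx apart, by alpha. */
static void example_sscal(int n, float alpha, float *x, int incx)
{
    if (n <= 0 || incx <= 0)
        return; /* nothing to do for empty or non-positive strides */
    for (int i = 0; i < n; ++i)
        x[(size_t)i * (size_t)incx] *= alpha;
}

/* Hypothetical ILP64 counterpart: same operation, same name plus a _64
 * suffix, with int64_t for the size and stride arguments so that
 * n and incx are not limited to the 32-bit int range. */
static void example_sscal_64(int64_t n, float alpha, float *x, int64_t incx)
{
    if (n <= 0 || incx <= 0)
        return;
    for (int64_t i = 0; i < n; ++i)
        x[i * incx] *= alpha;
}
```

Under this assumed convention a caller would switch variants by appending the suffix and widening the integer arguments. The scalar and pointer arguments stay the same.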
Changes
- Some Level 2 function argument names have changed from 'm' to 'n' to match legacy BLAS; there was no change in implementation.
- Standardized the use of non-blocking streams for copying results from device to host.
Fixes
- Fixed host-pointer mode reductions for non-blocking streams.