Details
HIP: add mmf for CDNA (#18896)
- refactor mmf rows_per_block
- speed up compile
- pass cdna compile
- fix cuda error
- clean up mmf
- f32 mmf
- clean float mma
- fix mmf error
- faster mmf
- extend tile k
- fix compile error
- Revert "extend tile k"
  This reverts commit 4d2ef3d.
- fix smem overflow
- speed up compiling mmf
- speed up compile for hip
- 512 block for cdna
- config pad size
- fix as comment
- update select logic
- move some code to cuh
- fix as comment
- correct cdna3 config
Co-authored-by: zhang hui you@example.com
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler: