hexagon: further optimization and tuning of matmul and dot kernels (#19407)
- ggml-hexagon: implement 2x2 matmul kernel
- hexmm: implement vec_dot_rx2x2 for Q8_0 and MXFP4
- hexagon: fix editor config failures
- hexagon: refactor matmul ops to use context struct and remove wrappers
  Also implement vec_dot_f16 2x2
- hexagon: refactor dyn quantizers to use mmctx
- hexagon: remove mm fastdiv from op_ctx
- hexagon: refactor matmul entry point to reduce code duplication
Co-authored-by: Trivikram Reddy <tamarnat@qti.qualcomm.com>
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler: