Details
opencl: add q6_K gemm and gemv kernels for Adreno (#20089)
- opencl: add q6_K noshuffle kernels, initial q6_K gemv, some host code
- opencl: add q6_K transpose
- opencl: fix cvt kernel name
- opencl: add call to q6_K gemv
- opencl: fix q6_K scale transpose
- opencl: fix loading for gemv q6_K, refactor
- opencl: fix transpose_8_buf kernel assignment, refactor
- opencl: refactor q6_K transpose
- opencl: add gemm_noshuffle_q6_k_f32
- opencl: fix qh loading
- opencl: refactor q6_K gemv host side, release bufs and imgs
- opencl: refactor
- opencl: fix q6_K dequant and scale selection
- opencl: workaround compiler bug, fix dump_tensor
- opencl: refactor q6_K convert kernels
- opencl: unpack transformed q6_K in get_tensor
- opencl: refactor, handle non-uniform workgroups
- opencl: support non-vector subgroup bcast
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler: