github ggml-org/llama.cpp b8191

2 hours ago

opencl: add optimized q4_1 mm kernel for adreno (#19840)

  • Add Q4_1 OpenCL Kernels

  • opencl: refactor transpose

  • opencl: format

  • opencl: refactor q4_1 unpack

  • opencl: move ggml_cl_mul_mat_q4_1_f32_adreno

  • opencl: refactor ggml_cl_mul_mat_q4_1_f32_adreno and kernels

  • opencl: rename kernel files and kernels

  • opencl: fix build for non adreno

  • opencl: move code around and format


Co-authored-by: Li He <lih@qti.qualcomm.com>
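
For readers unfamiliar with the format named in this change, the following is a minimal sketch of how one Q4_1 block is unpacked, assuming the standard ggml layout (an fp16 scale, an fp16 minimum, then 16 bytes holding 32 four-bit quants). It is not the Adreno OpenCL kernel from this release, which may repack or transpose the data differently; the function names `pack_q4_1_block` and `dequantize_q4_1_block` are hypothetical and exist only for illustration.

```python
import struct

QK4_1 = 32  # number of values per Q4_1 block (assumed from the ggml layout)


def pack_q4_1_block(d: float, m: float, quants: list[int]) -> bytes:
    """Hypothetical helper: build one 20-byte Q4_1 block from 32 quants in 0..15."""
    if len(quants) != QK4_1 or any(not 0 <= q <= 15 for q in quants):
        raise ValueError("expected 32 quants in the range 0..15")
    half = QK4_1 // 2
    # Byte j stores element j in its low nibble and element j + 16 in its high nibble.
    qs = bytes(quants[j] | (quants[j + half] << 4) for j in range(half))
    return struct.pack("<2e", d, m) + qs


def dequantize_q4_1_block(block: bytes) -> list[float]:
    """Unpack one Q4_1 block into 32 floats using value = quant * d + m."""
    half = QK4_1 // 2
    if len(block) != 4 + half:
        raise ValueError("expected a 20-byte Q4_1 block")
    d, m = struct.unpack_from("<2e", block, 0)  # fp16 scale, fp16 minimum
    qs = block[4:]
    out = [0.0] * QK4_1
    for j, byte in enumerate(qs):
        out[j] = (byte & 0x0F) * d + m       # low nibble -> element j
        out[j + half] = (byte >> 4) * d + m  # high nibble -> element j + 16
    return out


if __name__ == "__main__":
    block = pack_q4_1_block(0.5, -1.0, [i % 16 for i in range(QK4_1)])
    print(dequantize_q4_1_block(block)[:4])
```

A matrix-multiplication kernel for this format performs the same per-block arithmetic while accumulating dot products, so the unpack step is usually the part worth optimizing per device.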

macOS/iOS:

Linux:

Windows:

openEuler:
