ggml-org/llama.cpp — release b8493

Latest release: b8495 (3 hours ago)

opencl: add q6_K gemm and gemv kernels for Adreno (#20089)

  • opencl: add q6_K noshuffle kernels, initial q6_K gemv, some host code

  • opencl: add q6_K transpose

  • opencl: fix cvt kernel name

  • opencl: add call to q6_K gemv

  • opencl: fix q6_K scale transpose

  • opencl: fix loading for gemv q6_K, refactor

  • opencl: fix transpose_8_buf kernel assignment, refactor

  • opencl: refactor q6_K transpose

  • opencl: add gemm_noshuffle_q6_k_f32

  • opencl: fix qh loading

  • opencl: refactor q6_K gemv host side, release bufs and imgs

  • opencl: refactor

  • opencl: fix q6_K dequant and scale selection

  • opencl: workaround compiler bug, fix dump_tensor

  • opencl: refactor q6_K convert kernels

  • opencl: unpack transformed q6_K in get_tensor

  • opencl: refactor, handle non-uniform workgroups

  • opencl: support non-vector subgroup bcast
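The commits above build OpenCL GEMM/GEMV support for the q6_K quantization format, whose super-block layout (256 weights split into a low-4-bit array, a high-2-bit array, sixteen int8 sub-block scales, and an fp16 super-scale) the new convert and dequant kernels have to unpack. As a point of reference, a minimal CPU-side sketch of q6_K dequantization, assuming ggml's `block_q6_K` layout but with the fp16 super-scale simplified to a plain `float`, might look like:

```c
#include <assert.h>
#include <string.h>

#define QK_K 256  /* weights per q6_K super-block */

/* Sketch of ggml's block_q6_K; in ggml proper, d is fp16. */
typedef struct {
    unsigned char ql[QK_K/2];      /* lower 4 bits of each 6-bit quant */
    unsigned char qh[QK_K/4];      /* upper 2 bits of each 6-bit quant */
    signed char   scales[QK_K/16]; /* per-16-element sub-block scales  */
    float         d;               /* super-block scale (simplified)   */
} block_q6_K;

/* Dequantize one 256-element super-block into y. */
static void dequantize_q6_K(const block_q6_K *x, float *y) {
    const float d = x->d;
    const unsigned char *ql = x->ql;
    const unsigned char *qh = x->qh;
    const signed char   *sc = x->scales;
    for (int n = 0; n < QK_K; n += 128) {
        for (int l = 0; l < 32; ++l) {
            const int is = l / 16;
            /* reassemble each 6-bit value from its low-4 and high-2
               parts, then recentre from [0, 63] to [-32, 31] */
            const int q1 = (int)((ql[l +  0] & 0xF) | (((qh[l] >> 0) & 3) << 4)) - 32;
            const int q2 = (int)((ql[l + 32] & 0xF) | (((qh[l] >> 2) & 3) << 4)) - 32;
            const int q3 = (int)((ql[l +  0] >>  4) | (((qh[l] >> 4) & 3) << 4)) - 32;
            const int q4 = (int)((ql[l + 32] >>  4) | (((qh[l] >> 6) & 3) << 4)) - 32;
            y[l +  0] = d * sc[is + 0] * q1;
            y[l + 32] = d * sc[is + 2] * q2;
            y[l + 64] = d * sc[is + 4] * q3;
            y[l + 96] = d * sc[is + 6] * q4;
        }
        y  += 128;
        ql += 64;
        qh += 32;
        sc += 8;
    }
}
```

The "noshuffle" and transpose commits repack this layout on the host so the Adreno kernels can load it with coalesced accesses; the sketch shows only the logical unpacking, not that repacked order.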

Release binaries: macOS/iOS, Linux, Windows, openEuler.
