ggml-org/llama.cpp release b7841


opencl: add flattened q6_K mv (#19054)

This release adds a flattened q6_K matrix-vector multiplication kernel (kernel_mul_mv_q6_K_f32_flat) to the OpenCL backend. Commits in this PR:

  • opencl: flatten q6_K and add kernel_mul_mv_q6_K_f32_flat

  • opencl: clean up

  • opencl: refactor q6_K mv - put loop body in block_q_6_K_dot_y_flat

  • opencl: tweak the workgroup size a bit

  • opencl: output 4 values per subgroup for kernel_mul_mv_q6_K_f32_flat

  • opencl: proper alignment for q6_K

  • opencl: boundary handling for flattened q6_K mv

  • opencl: rename q6_K mv kernel file

  • opencl: put flattened q6_K mv in its own file

  • opencl: use lower k in file name

  • opencl: use K in variable names

Prebuilt binaries are provided for macOS/iOS, Linux, Windows, and openEuler.
