github ggml-org/llama.cpp b8008

latest release: b8011
4 hours ago

hexagon: further optimization and tuning of matmul and dot kernels (#19407)

  • ggml-hexagon: implement 2x2 matmul kernel

  • hexmm: implement vec_dot_rx2x2 for Q8_0 and MXFP4

  • hexagon: fix editor config failures

  • hexagon: refactor matmul ops to use context struct and remove wrappers

    Also implement vec_dot_f16 2x2

  • hexagon: refactor dyn quantizers to use mmctx

  • hexagon: remove mm fastdiv from op_ctx

  • hexagon: refactor matmul entry point to reduce code duplication


Co-authored-by: Trivikram Reddy <tamarnat@qti.qualcomm.com>
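
The notes above name the 2x2 kernels but do not describe their internals. As a rough illustration only, a "2x2" dot kernel generally computes the four dot products between two rows of one operand and two rows of the other in a single pass, so each loaded element is reused twice. The sketch below is scalar C on plain floats; the function name `vec_dot_2x2_f32` and its signature are hypothetical and are not taken from llama.cpp, whose kernels work on Q8_0, MXFP4 and F16 data.

```c
#include <stddef.h>

/* Hypothetical helper (not from llama.cpp): computes the four dot products
 * between rows a0, a1 and rows b0, b1, each of length n.
 * Output layout: out[0]=a0.b0, out[1]=a0.b1, out[2]=a1.b0, out[3]=a1.b1. */
static void vec_dot_2x2_f32(size_t n,
                            const float *a0, const float *a1,
                            const float *b0, const float *b1,
                            float out[4]) {
    float s00 = 0.0f, s01 = 0.0f, s10 = 0.0f, s11 = 0.0f;

    for (size_t i = 0; i < n; ++i) {
        /* Load each element once, then use it in two accumulators. */
        const float x0 = a0[i], x1 = a1[i];
        const float y0 = b0[i], y1 = b1[i];

        s00 += x0 * y0;
        s01 += x0 * y1;
        s10 += x1 * y0;
        s11 += x1 * y1;
    }

    out[0] = s00;
    out[1] = s01;
    out[2] = s10;
    out[3] = s11;
}
```

Compared with four separate dot-product calls, this form halves the number of loads per multiply-accumulate, which is the usual motivation for register tiling; whether that is the dominant gain in the Hexagon kernels is not stated in the notes.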

