ggml-org/llama.cpp b8008
on GitHub

4 months ago

hexagon: further optimization and tuning of matmul and dot kernels (#19407)

ggml-hexagon: implement 2x2 matmul kernel
hexmm: implement vec_dot_rx2x2 for Q8_0 and MXFP4
hexagon: fix editor config failures
hexagon: refactor matmul ops to use context struct and remove wrappers

Also implement vec_dot_f16 2x2

hexagon: refactor dyn quantizers to use mmctx
hexagon: remove mm fastdiv from op_ctx
hexagon: refactor matmul entry point to reduce code duplication

Co-authored-by: Trivikram Reddy <tamarnat@qti.qualcomm.com>
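The notes above list the new 2x2 kernels without showing code. As a rough illustration only, here is a minimal scalar sketch in C of what a 2x2 dot kernel over Q8_0-style blocks might look like: two rows are multiplied against two columns in one pass, so each loaded block feeds two of the four accumulators. The names `block_q8_sketch` and `vec_dot_q8_2x2_sketch` are hypothetical, the scale is held as a `float` rather than ggml's fp16 to keep the sketch portable, and the actual Hexagon kernels are vectorized code that is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QK 32 /* quantized values per block, matching ggml's Q8_0 block size */

/* Simplified Q8_0-style block: one scale plus QK signed 8-bit values.
 * Assumption: a float scale stands in for the fp16 scale used by ggml. */
typedef struct {
    float  d;
    int8_t qs[QK];
} block_q8_sketch;

/* Integer dot product of the quantized values of two blocks. */
static int32_t block_idot(const block_q8_sketch *a, const block_q8_sketch *b) {
    int32_t sum = 0;
    for (int i = 0; i < QK; ++i) {
        sum += (int32_t) a->qs[i] * (int32_t) b->qs[i];
    }
    return sum;
}

/* Hypothetical 2x2 tile: rows x0, x1 against columns y0, y1, each nb blocks long.
 * Result layout: out[0] = x0.y0, out[1] = x0.y1, out[2] = x1.y0, out[3] = x1.y1 */
static void vec_dot_q8_2x2_sketch(size_t nb,
                                  const block_q8_sketch *x0,
                                  const block_q8_sketch *x1,
                                  const block_q8_sketch *y0,
                                  const block_q8_sketch *y1,
                                  float out[4]) {
    assert(out != NULL);

    float s00 = 0.0f, s01 = 0.0f, s10 = 0.0f, s11 = 0.0f;

    for (size_t b = 0; b < nb; ++b) {
        /* Take each of the four blocks once and reuse it for two products. */
        const block_q8_sketch *a0 = &x0[b];
        const block_q8_sketch *a1 = &x1[b];
        const block_q8_sketch *c0 = &y0[b];
        const block_q8_sketch *c1 = &y1[b];

        /* Scale each integer block product by the product of the two block scales. */
        s00 += a0->d * c0->d * (float) block_idot(a0, c0);
        s01 += a0->d * c1->d * (float) block_idot(a0, c1);
        s10 += a1->d * c0->d * (float) block_idot(a1, c0);
        s11 += a1->d * c1->d * (float) block_idot(a1, c1);
    }

    out[0] = s00;
    out[1] = s01;
    out[2] = s10;
    out[3] = s11;
}
```

Compared with four independent 1x1 dot products, this layout touches each input block once per tile instead of twice, which is the usual motivation for register tiling in matmul kernels.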
