ggml-org/llama.cpp release b8814


ggml-cpu: add 128-bit RVV implementation for Quantization Vector Dot (#20633)

  • ggml-cpu: add 128-bit impls for i-quants, ternary quants

  • ggml-cpu: add 128-bit impls for iq2_xs, iq3_s, iq3_xxs, tq2_0

Co-authored-by: Rehan Qasim rehan.qasim@10xengineers.ai

  • ggml-cpu: refactor; add rvv checks

Co-authored-by: taimur-10x taimur.ahmad@10xengineers.ai
Co-authored-by: Rehan Qasim rehan.qasim@10xengineers.ai

Prebuilt binaries are available for macOS/iOS, Linux, Windows, and openEuler.
