ggml-org/llama.cpp release b9128


hexagon: eliminate scalar VTCM loads via HVX splat helpers (#22993)

  • hexagon: add hvx_vec_repl helpers and use them for the splat-from-vtcm use case

  • hmx-mm: optimize per-group scale handling

  • hmx-fa: optimize slope load from vtcm

  • hmx-fa: use aligned access where possible in hmx-utils

  • hexagon: add hvx_vec_repl_2x_f16 helper and consolidate repl helpers


Co-authored-by: Max Krasnyansky <maxk@qti.qualcomm.com>

Prebuilt release binaries: macOS/iOS, Linux, Android, Windows, openEuler.
