github ggml-org/llama.cpp b8857

2 hours ago

ggml-webgpu: updated matrix-vector multiplication (#21738)

  • merged properly, but slow q3_k and q5_k with u32 indexing

  • Start on new mat-vec

  • New format float paths working

  • Working q4_0

  • Work on remaining legacy q-types

  • port k-quants to new matvec

  • remove old shader

  • Remove old constants, format

  • remove accidental file


Co-authored-by: Neha Abbas <nehaabbas@ReeseLevines-MacBook-Pro.local>
Co-authored-by: Reese Levine <reeselevine1@gmail.com>
