ggml-org/llama.cpp b8857
on GitHub

3 months ago

ggml-webgpu: updated matrix-vector multiplication (#21738)

merged properly, but slow q3_k and q5_k with u32 indexing
Start on new mat-vec
New format float paths working
Working q4_0
Work on remaining legacy q-types
port k-quants to new matvec
remove old shader
Remove old constants, format
remove accidental file

Co-authored-by: Neha Abbas <nehaabbas@ReeseLevines-MacBook-Pro.local>
Co-authored-by: Reese Levine <reeselevine1@gmail.com>

macOS/iOS:

Linux:

Android:

Android arm64 (CPU)

Windows:

openEuler:
