ggml-org/llama.cpp b8811
on GitHub


one month ago


ggml-webgpu: compute pass batching and removing profiling overhead (#21873)

Update register tiling matmul to use f32 accumulation
fix profiling code
Fix register tiling matmul for Chrome, I'm blaming Dawn
Update batch tuning value for iOS
compile fix
Fix use of new load function
Move to a single query set for GPU profiling
Move to batching compute passes when not profiling
Refactor build_multi
remove iOS throttling now that we're batching compute passes
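
The release notes do not show what "batching compute passes" looks like in the backend, so the following is only a generic WebGPU illustration, not the ggml-webgpu code. It contrasts opening one compute pass per dispatch with recording every dispatch in a single pass. The `Dispatch` type, the `*Like` interfaces, and both function names are hypothetical; the interfaces cover only the standard WebGPU methods the sketch calls, so that it type-checks without a browser.

```typescript
// Minimal structural types for the WebGPU calls used below (assumption:
// a stand-in for the real GPUDevice / GPUCommandEncoder types).
interface PassLike {
  setPipeline(pipeline: unknown): void;
  setBindGroup(index: number, group: unknown): void;
  dispatchWorkgroups(x: number, y?: number, z?: number): void;
  end(): void;
}
interface EncoderLike {
  beginComputePass(): PassLike;
  finish(): unknown;
}
interface DeviceLike {
  createCommandEncoder(): EncoderLike;
  queue: { submit(buffers: unknown[]): void };
}

// Hypothetical description of one kernel launch.
interface Dispatch {
  pipeline: unknown;
  bindGroup: unknown;
  workgroups: number;
}

// Records a single dispatch into an already open compute pass.
function record(pass: PassLike, d: Dispatch): void {
  pass.setPipeline(d.pipeline);
  pass.setBindGroup(0, d.bindGroup);
  pass.dispatchWorkgroups(d.workgroups);
}

// Unbatched: every dispatch gets its own compute pass.
function encodeUnbatched(device: DeviceLike, work: Dispatch[]): void {
  const encoder = device.createCommandEncoder();
  for (const d of work) {
    const pass = encoder.beginComputePass();
    record(pass, d);
    pass.end();
  }
  device.queue.submit([encoder.finish()]);
}

// Batched: all dispatches share one compute pass.
function encodeBatched(device: DeviceLike, work: Dispatch[]): void {
  const encoder = device.createCommandEncoder();
  const pass = encoder.beginComputePass();
  for (const d of work) {
    record(pass, d);
  }
  pass.end();
  device.queue.submit([encoder.finish()]);
}
```

In WebGPU, timestamp queries are attached to a pass when it begins, so timing each operation individually needs one pass per operation. That would be consistent with batching only "when not profiling", though the release notes do not state the reason.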

