ggml-org/llama.cpp b7549
on GitHub

2 months ago

Details

vulkan: preprocess mul_mat_id experts and discard workgroups more quickly (#18352)

Run a preprocessing pass to count how many times each expert is used, and use
these counts to quickly discard workgroups that have no work to do.
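
The idea above can be illustrated with a small CPU-side simulation. This is only a sketch under stated assumptions, not the Vulkan shader from the PR: the function names, the `(expert, tile)` grid layout, and the `tile_rows` parameter are all hypothetical, and it assumes each token selects distinct experts, so one expert receives at most one row per token.

```python
def count_expert_usage(ids, n_experts):
    """Preprocess pass: count how many rows are routed to each expert.

    ids[t] lists the experts selected for token t (hypothetical layout).
    """
    counts = [0] * n_experts
    for token_experts in ids:
        for expert in token_experts:
            counts[expert] += 1
    return counts


def active_workgroups(counts, max_rows, tile_rows):
    """Simulate a grid sized for the worst case (max_rows rows per expert).

    Each workgroup handles one tile of rows for one expert. A workgroup whose
    tile starts at or past the expert's real row count has nothing to do and
    returns immediately instead of scanning the routing table.
    """
    tiles_per_expert = -(-max_rows // tile_rows)  # ceiling division
    kept, discarded = [], 0
    for expert, count in enumerate(counts):
        for tile in range(tiles_per_expert):
            if tile * tile_rows >= count:
                discarded += 1  # early exit: no rows in this tile
                continue
            kept.append((expert, tile))
    return kept, discarded


# Four tokens, each routed to two of four experts (made-up routing data).
ids = [[0, 2], [2, 3], [2, 0], [2, 3]]
counts = count_expert_usage(ids, n_experts=4)
kept, discarded = active_workgroups(counts, max_rows=len(ids), tile_rows=2)
```

In this toy routing, expert 1 is never selected, so both of its workgroups are discarded, and experts 0 and 3 each need only one of their two tiles. Without the precomputed counts, every workgroup would have to inspect the routing data itself before discovering it has nothing to compute.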
