ggml-org/llama.cpp b7649
on GitHub

one month ago

Details

ggml : optimize cuda ssm_scan using warp-level reduction (#18505)

ggml : optimize cuda ssm_scan using warp-level reduction
ggml : apply code review suggestions (style, const, constexpr)
ggml : add TODO regarding stride consistency
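
For readers unfamiliar with the term, a warp-level reduction combines the values held by the 32 threads of a CUDA warp through register shuffles instead of a round trip through shared memory. The sketch below is not the kernel from #18505, whose code is not shown in these notes. It is a CPU emulation of the generic shuffle-down pattern, and the names `kWarpSize`, `shfl_down`, and `warp_reduce_sum` are hypothetical. On a device, the shuffle step would typically be `__shfl_down_sync(0xffffffff, val, offset)`.

```cpp
#include <array>
#include <cstddef>

// Assumption: a warp has 32 lanes, as on current NVIDIA GPUs.
constexpr std::size_t kWarpSize = 32;

// CPU stand-in for a full-mask shuffle-down: lane i reads the value held by
// lane i + offset. Lanes whose source would fall outside the warp keep their
// own value, which mirrors the device intrinsic's behaviour.
std::array<float, kWarpSize> shfl_down(const std::array<float, kWarpSize>& v,
                                       std::size_t offset) {
    std::array<float, kWarpSize> out = v;
    for (std::size_t lane = 0; lane + offset < kWarpSize; ++lane) {
        out[lane] = v[lane + offset];
    }
    return out;
}

// Tree reduction over one warp: halve the offset each step (16, 8, 4, 2, 1).
// After the last step, lane 0 holds the sum of all 32 inputs; the other lanes
// hold partial results that are not meaningful.
float warp_reduce_sum(std::array<float, kWarpSize> v) {
    for (std::size_t offset = kWarpSize / 2; offset > 0; offset /= 2) {
        const std::array<float, kWarpSize> shifted = shfl_down(v, offset);
        for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
            v[lane] += shifted[lane];
        }
    }
    return v[0];
}
```

Filling the lanes with 1 through 32 and calling `warp_reduce_sum` returns 528, the sum of those values, after five shuffle steps rather than 31 sequential additions.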

macOS/iOS:

Linux:

Windows:

openEuler:
