ggml-org/llama.cpp b9260 on GitHub

one month ago

opencl: refactor backend initialization (#23318)

opencl: refactor initialization
opencl: refactor GPU identification
opencl: rename for consistency
opencl: cache global mem size in dev_ctx
opencl: adjust log level
opencl: load argsort and flash_attn kernels in supports_op
    The argsort kernel must be built in supports_op so that its maximum workgroup size can be queried.
    The flash_attn kernel has many variants, so they are loaded only when needed.
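
The on-demand kernel loading described in the last item can be sketched roughly as follows. This is a minimal illustration, not the backend's actual code: `Kernel`, `LazyKernelCache`, `supports_argsort`, the kernel name strings, and the "row length must fit in one workgroup" rule are all hypothetical, and the builder callback stands in for real OpenCL program compilation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a compiled kernel; a real backend would hold a cl_kernel.
struct Kernel {
    std::string name;
    size_t      max_workgroup_size;  // value a real backend would query from the driver
};

// Hypothetical cache that builds each kernel the first time it is requested.
class LazyKernelCache {
public:
    using Builder = std::function<Kernel(const std::string &)>;

    explicit LazyKernelCache(Builder builder) : builder_(std::move(builder)) {
        assert(builder_ && "a kernel builder callback is required");
    }

    // Returns the cached kernel, building it on first use.
    const Kernel & get(const std::string & name) {
        auto it = kernels_.find(name);
        if (it == kernels_.end()) {
            it = kernels_.emplace(name, builder_(name)).first;
            ++build_count_;
        }
        return it->second;
    }

    bool   is_loaded(const std::string & name) const { return kernels_.count(name) != 0; }
    size_t build_count() const { return build_count_; }

private:
    Builder                       builder_;
    std::map<std::string, Kernel> kernels_;
    size_t                        build_count_ = 0;
};

// Hypothetical supports_op-style check. The argsort kernel has to exist before
// its workgroup limit can be read, so the check triggers the build itself.
// Assumption for illustration: a row is supported if it fits in one workgroup.
inline bool supports_argsort(LazyKernelCache & cache, size_t row_length) {
    const Kernel & k = cache.get("argsort");
    return row_length <= k.max_workgroup_size;
}
```

With this shape, a kernel family with many variants costs nothing at initialization: a variant is compiled only when a caller first asks for it by name, and repeated support checks reuse the cached kernel.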

Android:

Android arm64 (CPU)
