github ggml-org/llama.cpp b9260

one hour ago

opencl: refactor backend initialization (#23318)

  • opencl: refactor initialization

  • opencl: refactor GPU identification

  • opencl: rename for consistency

  • opencl: cache global mem size in dev_ctx

  • opencl: adjust log level

  • opencl: load argsort and flash_attn kernels in supports_op

  • the argsort kernel must be built in supports_op so that its maximum
    workgroup size can be queried

  • the flash_attn kernel has many variants, so they are loaded only when
    needed
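
The load-on-demand idea in the last two items can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: Kernel, LazyKernelCache, argsort_supported, and the variant names are all hypothetical, and a plain struct stands in for a real OpenCL kernel handle.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a compiled kernel; a real backend would hold a
// cl_kernel and query its workgroup limit from the OpenCL runtime.
struct Kernel {
    std::string name;
    std::size_t max_workgroup_size;
};

// Hypothetical cache that builds a kernel variant the first time it is
// requested and reuses it afterwards.
class LazyKernelCache {
public:
    using Builder = std::function<Kernel(const std::string &)>;

    explicit LazyKernelCache(Builder builder) : builder_(std::move(builder)) {}

    const Kernel & get(const std::string & variant) {
        auto it = kernels_.find(variant);
        if (it == kernels_.end()) {
            // First request for this variant: build it now and remember it.
            it = kernels_.emplace(variant, builder_(variant)).first;
            ++builds_;
        }
        return it->second;
    }

    // Number of variants actually built so far.
    int builds() const { return builds_; }

private:
    Builder builder_;
    std::map<std::string, Kernel> kernels_;
    int builds_ = 0;
};

// Hypothetical supports_op-style check (an assumption about the shape of the
// test): the kernel has to exist before its workgroup limit can be compared
// against the problem size, so it is built on demand here.
bool argsort_supported(LazyKernelCache & cache, std::size_t n_elements) {
    return n_elements <= cache.get("argsort").max_workgroup_size;
}
```

With this shape, a caller that never asks for a flash_attn variant never pays for building it, while a call to argsort_supported builds the argsort kernel once and reuses it on later queries.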

