github ggml-org/llama.cpp b7809


opencl: enable the general fp mm for non-cont input and as a fallback for specialized kqv kernel for adreno (#18970)

  • opencl: add copy_to_contiguous and utilize mm kernels

  • opencl: only copy to cont for f32 and f16 tensors

  • opencl: use cont mm for fallback when dst is large

  • opencl: use nb local to copy-to-cont

  • opencl: use local offset as well
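The bullets above describe gathering a non-contiguous tensor view into a packed buffer (using the per-dimension byte strides `nb` plus a base offset) so that a contiguous-only matmul kernel can run on it. The change itself lives in the OpenCL backend; as a host-side illustration of the idea, here is a minimal sketch. The helper name `copy_to_contiguous_f32` and its signature are hypothetical, but the `ne[4]` element-count / `nb[4]` byte-stride convention mirrors how ggml describes tensor layouts:

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical host-side sketch: gather a strided (non-contiguous) f32
 * view into a tightly packed destination buffer.
 *   ne[4] - number of elements along each dimension
 *   nb[4] - byte stride for each dimension (nb[0] may exceed
 *           sizeof(float) for padded or transposed views)
 * The real kernel does this on-device; the indexing logic is the same. */
static void copy_to_contiguous_f32(const char  *src,
                                   float       *dst,
                                   const int64_t ne[4],
                                   const size_t  nb[4]) {
    int64_t idx = 0;
    for (int64_t i3 = 0; i3 < ne[3]; ++i3)
    for (int64_t i2 = 0; i2 < ne[2]; ++i2)
    for (int64_t i1 = 0; i1 < ne[1]; ++i1)
    for (int64_t i0 = 0; i0 < ne[0]; ++i0) {
        /* byte offset of element (i0, i1, i2, i3) in the strided view */
        const char *p = src + i3*nb[3] + i2*nb[2] + i1*nb[1] + i0*nb[0];
        memcpy(&dst[idx++], p, sizeof(float));
    }
}
```

After this copy, `dst` is contiguous and the general fp matmul kernel can be applied unchanged, which is also what makes it usable as a fallback path when the specialized KQV kernel does not apply.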

Prebuilt binaries are published for macOS/iOS, Linux, Windows, and openEuler.
