github ggml-org/llama.cpp b9006

2 hours ago

opencl: Adreno optimization for MoE - MxFP4 (#22301)

  • Add MoE MxFP4 CLC kernel; perform router reorder on GPU

  • Pass test-backend-ops for MoE MxFP4 Adreno CLC

  • remove putenv in llama-model.cpp

  • fix indent style and whitespace

  • opencl: remove unnecessary headers

  • opencl: do not save cl_program objects

  • opencl: remove unnecessary assert

  • fix precision issue


Co-authored-by: Li He <lih@qti.qualcomm.com>

