github ggml-org/llama.cpp b8858


ggml-cpu: Optimized x86 and generic cpu q1_0 dot (follow up) (#21636)

  • Implemented optimized q1_0 dot for x86 and generic

  • Removed redundant helper definition

  • Removed two redundant instructions from AVX q1_0 dot

  • Fixed inconsistency with fp16 conversion for generic q1_0 dot and deduplicated generic fallback

  • Style cleanup around AVX q1_0 dot

  • Replaced explicitly unrolled blocks with an inner for loop for q1_0

  • Replaced the scalar ARM q1_0 implementation with the new generic one

Builds: macOS/iOS, Linux, Android, Windows, openEuler
