github ggml-org/llama.cpp b8315


vulkan: fix SSM_CONV PP scaling with large ubatch sizes (#20379)

  • vulkan: optimize SSM_CONV workgroup dispatch for large ubatch

Tile tokens into 2D workgroups (32x16) to reduce workgroup launch
overhead at large ubatch sizes. Add a vec4 fast path for nc=4 (the
common d_conv size). Fixes prompt processing (PP) performance
degradation with ubatch > 512.

Ref: #18725

Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com

  • vulkan: remove unused shared memory declaration in SSM_CONV

Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com


Co-authored-by: Progeny Alpha ProgenyAlpha@users.noreply.github.com
Co-authored-by: Claude Opus 4.6 noreply@anthropic.com
