github ggml-org/llama.cpp b7972


CUDA: Fix non-contig rope (#19338)

  • Rename variables + fix rope_neox

The memory layout appears to be shared with Vulkan, so the fix from
#19299 can be ported


  • Fix rope_multi

  • Fix rope_vision

  • Fix rope_norm

  • Rename ne* to ne0* for consistent variable naming

  • cont : consistent stride names


Co-authored-by: Georgi Gerganov &lt;ggerganov@gmail.com&gt;

