ggml-org/llama.cpp release b7990


models : support qwen3.5 series (#19468)

  • support Qwen3.5 series

  • remove deepstack for now, plus some code cleanup

  • code cleanup

  • add FULL_ATTENTION_INTERVAL metadata

  • code cleanup

  • reorder V heads for linear attention to avoid an expensive interleaved repeat
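The FULL_ATTENTION_INTERVAL metadata suggests a hybrid-attention layout in which most layers use linear attention and full attention recurs every N layers. A minimal sketch of that idea, assuming a hypothetical helper and a 1-based "every Nth layer" convention (the actual llama.cpp logic and key semantics may differ):

```python
def is_full_attention_layer(layer_idx: int, interval: int) -> bool:
    # Hypothetical convention: every `interval`-th layer (counting from 1)
    # uses full attention; all other layers use linear attention.
    return (layer_idx + 1) % interval == 0

# With interval=4 in a 12-layer model, 0-based layers 3, 7, 11 would be full attention.
full_layers = [i for i in range(12) if is_full_attention_layer(i, 4)]
```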
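The V-head reordering bullet refers to a common optimization when a small number of KV heads is shared across many query heads: an interleaved repeat ([v0, v0, v0, v1, v1, v1]) scatters each source head into strided destinations, while a block repeat ([v0, v1, v0, v1, v0, v1]) is one contiguous copy per repetition. A NumPy sketch of the general idea, not the actual llama.cpp kernel:

```python
import numpy as np

n_kv_heads, n_rep, head_dim = 2, 3, 4
v = np.arange(n_kv_heads * head_dim, dtype=np.float32).reshape(n_kv_heads, head_dim)

# Interleaved repeat: [v0, v0, v0, v1, v1, v1] -- each source head is
# scattered into n_rep strided destination rows (expensive to materialize).
interleaved = np.repeat(v, n_rep, axis=0)

# Block repeat: [v0, v1, v0, v1, v0, v1] -- one contiguous copy of the
# whole tensor per repetition (cheap).
blocked = np.tile(v, (n_rep, 1))

# A one-time permutation of head indices maps the block layout onto the
# interleaved one, so the consuming heads can be reordered once instead of
# paying for an interleaved repeat on every forward pass.
perm = np.arange(n_kv_heads * n_rep).reshape(n_rep, n_kv_heads).T.ravel()
assert np.array_equal(blocked[perm], interleaved)
```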

Prebuilt binaries are attached to the release for macOS/iOS, Linux, Windows, and openEuler.
