github ggml-org/llama.cpp b7973

latest release: b7974
6 hours ago

[Model] Qwen3.5 dense and MoE support (no vision) (#19435)

  • Unified delta net handling

  • Remove old methods.

  • Refactor and optimize

  • Adapt autoregressive version from @ymcki

  • Change to decay mask approach

  • Fix bad permute

  • Qwen 3.5 support

  • Apply suggestions from code review

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • Further fixes

  • Use inheritance, remove unneeded conts

  • Not like this!

  • Remove ggml.h explicit import

  • Remove transformers, fix the views

  • ACTUALLY fix views, make super calls explicit in conversion.

  • Fix conversion again

  • Remove extra ggml.h imports


Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

