github ggml-org/llama.cpp b9055

4 hours ago

model: Add Mimo v2.5 model support (#22493)

  • add mimo-v2.5 support

  • mimo-v2.5: fix modify_tensors row split

  • mimo-v2.5: forgot add_attn_value_scale plumbing

  • mimo-v2.5: fix TP dequant to detect TP rows

  • mimo-v2.5: fix TP iteration to be descending

  • mimo-v2.5: fix comment

  • mimo-v2.5: retain fused qkv

  • mimo-v2.5: missed the attn_value scale during merge

  • mimo-v2.5: fused QKV needs contiguous for scaling attention value

  • mimo-v2.5: move speech_embeddings. to TextModel filter_tensors

  • Update src/llama-hparams.h

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • Update src/models/mimo2.cpp

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • Update src/models/mimo2.cpp

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • Update src/models/mimo2.cpp

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • mimo-v2.5: include MTP weights in gguf

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
