github ggml-org/llama.cpp b9258

one hour ago

mtmd : DeepSeek-OCR image processing fixes, img_tool::resize padding refactor (#23345)

  • mtmd : deepseek-ocr fixes, improvements and refactoring
  • image processing changes to achieve full parity with Pillow (reference impl)
  • SAM mask casting only when flash-attn is on
  • SAM refactor (build_sam() extracted so deepseek-ocr-2 can reuse it)
  • llama-chat changes to fix server/WebUI issue (new media_markers_first())
  • adapted test-chat-template and added test cases for deepseek-ocr
  • changed regression test for deepseek-ocr to use CER+chrF scores for ground-truth comparison; removed embedding-model
  • ty.toml ignore unresolved-import for tools/mtmd/tests/**
  • image-text reordering fix removed
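
For readers unfamiliar with the CER half of the CER+chrF comparison mentioned in the list above: character error rate is usually defined as the edit distance between the model output and the ground truth, divided by the ground-truth length. The following is a minimal sketch of that definition, not the code used by the regression test. The function names `edit_distance` and `cer` are hypothetical, and the comparison is byte-wise for brevity (a real OCR test would more likely compare Unicode code points).

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Levenshtein distance between two byte strings, using two rolling rows.
// Hypothetical helper for illustration; not taken from llama.cpp.
static size_t edit_distance(const std::string & a, const std::string & b) {
    std::vector<size_t> prev(b.size() + 1);
    std::vector<size_t> cur(b.size() + 1);
    for (size_t j = 0; j <= b.size(); ++j) {
        prev[j] = j; // turning an empty prefix into b[0..j) costs j insertions
    }
    for (size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i; // turning a[0..i) into an empty prefix costs i deletions
        for (size_t j = 1; j <= b.size(); ++j) {
            const size_t substitute = prev[j - 1] + (a[i - 1] != b[j - 1] ? 1 : 0);
            const size_t remove     = prev[j] + 1;
            const size_t insert     = cur[j - 1] + 1;
            cur[j] = std::min({substitute, remove, insert});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// Character error rate: edit distance normalized by the reference length.
// An empty reference is treated as 0.0 if the hypothesis is also empty, else 1.0
// (an assumption made here to avoid dividing by zero).
static double cer(const std::string & reference, const std::string & hypothesis) {
    if (reference.empty()) {
        return hypothesis.empty() ? 0.0 : 1.0;
    }
    return (double) edit_distance(reference, hypothesis) / (double) reference.size();
}
```

A score of 0.0 means the output matches the ground truth exactly; larger values mean more character-level edits are needed. CER can exceed 1.0 when the output is much longer than the reference.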

  • refactor bool add_padding + pad_rounding enum into a single pad_style enum
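
The general pattern behind this kind of refactor can be sketched as follows. This is an illustration only: the enumerator names, the `sketch` namespace, and the helpers `to_pad_style` and `pad_offset` are assumptions made for the example and are not taken from the llama.cpp sources. The idea is that a boolean plus an enum that only matters when the boolean is true can be collapsed into one enum that names each valid combination.

```cpp
namespace sketch {

// Hypothetical "before" shape: the rounding mode is only meaningful
// when a separate add_padding flag is true.
enum class pad_rounding { floor, round };

// Hypothetical "after" shape: one enum covers every valid combination.
enum class pad_style { none, pad_floor, pad_round };

// Map the old two-parameter form onto the single enum.
inline pad_style to_pad_style(bool add_padding, pad_rounding rounding) {
    if (!add_padding) {
        return pad_style::none; // rounding is irrelevant without padding
    }
    return rounding == pad_rounding::floor ? pad_style::pad_floor
                                           : pad_style::pad_round;
}

// Left/top offset when centering `content` pixels inside `target` pixels.
// Assumes target >= content; the two padded styles differ only when the
// total padding is odd.
inline int pad_offset(int target, int content, pad_style style) {
    const int total = target - content;
    switch (style) {
        case pad_style::none:      return 0;
        case pad_style::pad_floor: return total / 2;
        case pad_style::pad_round: return (total + 1) / 2;
    }
    return 0; // unreachable for valid enum values
}

} // namespace sketch
```

With a single enum, a caller cannot pass a rounding mode alongside "no padding", and a `switch` over the enum makes every case explicit.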

