github Blaizzy/mlx-vlm v0.3.0

4 months ago

What's Changed

  • [gemma3n] Correctly scale text embeddings for quantized gemma3n conversions by @neilmehta24 in #397
  • smolvlm_video example: fix typo in system prompt by @pcuenca in #389
  • Fix gemma3n pixel casting by @Blaizzy in #398
  • Fix audio model check and prompt utils by @Blaizzy in #395
  • Add KV Quantization by @Blaizzy in #401
  • Fix Gemma3n multi-task merging and update LM by @Blaizzy in #405
  • [gemma3n] Fix vision encoder implementation of EdgeResidual and UniversalInvertedResidual by @neilmehta24 in #410
  • fix: Remove unnecessary unicode_escape decoding for Chinese text input by @nicekate in #403
  • Add support for Mixed Quant by @Blaizzy in #413
  • Fix gemma3n Vision OCR + LM only responses by @Blaizzy in #414
  • Fix generate signature by @Blaizzy in #416
  • Add support for audio modality in server by @Blaizzy in #417
  • Update server, readme and misc by @Blaizzy in #418

Full Changelog: v0.2.0...v0.3.0
