github Blaizzy/mlx-vlm v0.4.3

9 hours ago

What's Changed

  • Add SAM 3.1 with Object Multiplex and optimized realtime pipeline by @Blaizzy in #880
  • Add Falcon-OCR model support by @Griffintaur in #879
  • Add RF-DETR detection and segmentation model by @Blaizzy in #884
  • Add Granite Vision 3.2 and 4.0 by @Blaizzy in #885
  • Add wired_limit to vision model inference pipelines for CUDA support by @Blaizzy in #887
  • Add Falcon Perception by @Blaizzy in #888
  • fix(server): disable uvicorn reload by default to prevent memory leaks by @futurepitcher in #883
  • Add Gemma 4 model support (vision, audio, MoE) by @Blaizzy in #890
  • Fix Gemma 4 embedding scaling by @Blaizzy in #893
  • Add TurboQuant by @Blaizzy in #858
  • Remove TurboQuant benchmark artifacts and add README docs by @Blaizzy in #894

New Contributors

Full Changelog: v0.4.2...v0.4.3
