github jundot/omlx v0.2.14

one month ago

Download the DMG that matches your macOS version (Sequoia or Tahoe).
If you're on an M5 Mac, you must use the macos26-tahoe DMG for M5 Neural Accelerator support.
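
The selection rule above can be sketched as a small helper. This is only an illustration, not part of omlx: the function name `dmg_flavor` and the `chip` argument are hypothetical, and the sketch assumes macOS 15 is Sequoia and macOS 26 is Tahoe. It returns a flavor label rather than a file name, because the notes only spell out the macos26-tahoe DMG.

```python
import platform


def dmg_flavor(macos_version: str, chip: str = "") -> str:
    """Return "sequoia" or "tahoe" for a macOS version string such as "15.6".

    `chip` is an optional description of the processor (hypothetical input,
    e.g. "Apple M5"); the release notes require the Tahoe DMG on M5 Macs.
    """
    major = int(macos_version.split(".")[0])
    if "M5" in chip:
        return "tahoe"  # M5 Macs must use the macos26-tahoe DMG
    if major >= 26:
        return "tahoe"  # assumption: macOS 26 and later map to Tahoe
    if major == 15:
        return "sequoia"  # assumption: macOS 15 is Sequoia
    raise ValueError(f"no DMG listed for macOS {macos_version}")


if __name__ == "__main__":
    version = platform.mac_ver()[0]  # empty string when not running on macOS
    if version:
        print(dmg_flavor(version))
```

Passing the version in as a string keeps the helper testable on any machine; only the guarded block at the bottom touches the local system.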

New Models

  • Qwen3-VL embedding and reranking support (via mlx-embeddings 6e2ef52)
  • Moondream3 vision-language model support (via mlx-vlm b7f853a)

Bug Fixes

  • fix per-model max_tokens settings not being respected (#258)
  • fix download stall timeout being too short (120s → 300s) (#254)
  • fix Qwen3.5 batch dimension mismatches under continuous batching (upstream mlx-vlm db3d558)
  • fix Qwen3-VL attention mask slicing with mx.array kv_seq_len (upstream mlx-vlm)
  • fix Qwen3-Omni integration (upstream mlx-vlm b7f853a)

Dependency Updates

  • mlx >=0.29.2 → >=0.31.1
  • mlx-embeddings 88522e2 → 6e2ef52
  • mlx-vlm 348466f (0.3.13) → b7f853a (0.4.0)
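
For anyone running from source rather than the DMG, the new mlx floor can be checked with a short script. This is a rough sketch under stated assumptions: the helper names `as_tuple` and `meets_minimum` are hypothetical, and the comparison looks only at the first three numeric components, so pre-release and dev suffixes are ignored.

```python
import re
from importlib.metadata import PackageNotFoundError, version

MIN_MLX = "0.31.1"  # minimum mlx version listed in these release notes


def as_tuple(v: str) -> tuple:
    """Turn "0.31.1" into (0, 31, 1); simplified, ignores rc/dev suffixes."""
    numbers = [int(n) for n in re.findall(r"\d+", v.split("+")[0])[:3]]
    return tuple(numbers + [0] * (3 - len(numbers)))  # pad "0.31" to (0, 31, 0)


def meets_minimum(installed: str, minimum: str = MIN_MLX) -> bool:
    """True when the installed version is at least the minimum."""
    return as_tuple(installed) >= as_tuple(minimum)


if __name__ == "__main__":
    try:
        installed = version("mlx")
        print(installed, "ok" if meets_minimum(installed) else "too old")
    except PackageNotFoundError:
        print("mlx is not installed")
```

A full PEP 440 comparison would need a dedicated version parser; the tuple comparison here is enough for plain release versions.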

full changelog: v0.2.13...v0.2.14
