github jundot/omlx v0.4.0rc1

7 hours ago

0.4.0rc1 is the first release candidate for the native Swift macOS app. The old PyObjC menubar app has been retired, and the macOS bundle now ships as a Swift app with a redesigned onboarding flow, settings UI, status surfaces, model management, and a GitHub Releases-based updater.

This Swift transition was driven by excellent work from @popfido, with follow-up polish and release-path fixes folded in after the initial merge. Thank you for the huge amount of thoughtful work here. This is the biggest user-facing macOS change oMLX has shipped so far, and it substantially raises the quality of the desktop app.

[Screenshot: oMLX native Swift macOS app]

Highlights

  • Native Swift macOS app. The old PyObjC menubar app has been replaced by a native Swift/SwiftUI app, with new onboarding, settings, status, model management, downloads, integrations, and update flows. by @popfido
  • Improved menubar and app status. Live port/status updates, StatusKit fixes, version display, and cleaner running-state behavior. by @popfido
  • Browser chat UI received a major usability overhaul and follow-up message/action fixes. by @beamivalice
  • xgrammar is bundled into the venvstacks export with the no-torch stub path. by @cfbraun
  • Memory guard tuning relaxed throttle/eviction thresholds and improved Custom tier behavior.

Runtime, cache, and scheduler

  • Per-engine MLX threads eliminate cross-engine stream contamination. by @ivaniguarans
  • Store-cache and boundary snapshot paths now materialize lazy arrays on the owning thread before async byte extraction. by @aeyeopsdev
  • Boundary snapshot cleanup races and stale snapshot handling were fixed. by @cfbraun
  • Predictive prefill throttling and reclaim/requeue behavior reduce mid-stream OOM failures. by @sdiamanEXUS
  • Paged cache references are released correctly on preflight/prefill rejection paths. by @cfbraun
  • VLM, SpecPrefill, and draft-model lazy state is materialized on loader threads to avoid stream errors. by @cfbraun
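
The per-engine thread idea above can be illustrated with a minimal sketch. This is not oMLX's implementation and it does not call MLX at all; `EngineWorker` and its methods are hypothetical names. It only shows the general pattern of routing all work for one engine through a single dedicated thread, so that thread-bound state is never touched from another engine's thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class EngineWorker:
    """Hypothetical helper: owns exactly one worker thread per engine."""

    def __init__(self, name):
        # max_workers=1 means every submitted task runs on the same thread.
        self._executor = ThreadPoolExecutor(
            max_workers=1, thread_name_prefix=f"engine-{name}"
        )

    def run(self, fn, *args):
        # Submit work to the owning thread and block until it finishes.
        return self._executor.submit(fn, *args).result()

    def close(self):
        self._executor.shutdown(wait=True)


engine_a = EngineWorker("a")
engine_b = EngineWorker("b")

# Repeated calls on one engine land on the same thread;
# different engines use different threads.
a_first = engine_a.run(threading.get_ident)
a_second = engine_a.run(threading.get_ident)
b_first = engine_b.run(threading.get_ident)

engine_a.close()
engine_b.close()
```

In this pattern, any step that must happen on the owning thread (for example, forcing lazy values to be computed) would be submitted through `run`, and only the resulting plain data would be handed to other threads.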

MTP, oQ, TurboQuant, and model compatibility

  • Safe row-wise MTP decoding is enabled for aligned batches, with fallback for unsafe late-join batches.
  • Qwen3.6 MXFP4 mixed norm conventions and MTP preservation are handled more safely. by @scubamount
  • TurboQuant now supports batched KV-cache compression and fixes batch merge edge cases. by @popfido
  • DFlash/MTP transition restores Qwen GQA attention hooks.
  • LFM text MoE model discovery is classified correctly as LLM instead of mlx-audio STS. by @samfenwick

API and integrations

  • Guided grammar is now exposed as a model setting and maps into the existing structured-output grammar path. by @MrNiceRicee
  • Anthropic cache-control accounting and model context length reporting were fixed. by @richgoodson
  • Claude Code compatibility was updated for newer request behavior. by @lx1229
  • CLI shutdown handles KeyboardInterrupt cleanly. by @fry69
  • Integration launch context was unified across external tool integrations.
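
As a rough illustration of clean KeyboardInterrupt handling in a CLI, here is a minimal sketch. It is not the oMLX CLI code; `run_cli`, `serve`, and `shutdown` are hypothetical names, and the exit code 130 is the common shell convention for termination by Ctrl+C rather than something stated in these notes.

```python
import sys


def run_cli(serve, shutdown):
    """Run serve() until it returns or is interrupted, then clean up.

    Both arguments are caller-supplied callables (hypothetical).
    Returns a process exit code.
    """
    try:
        serve()
    except KeyboardInterrupt:
        # Ctrl+C: print a short message instead of a traceback.
        print("Interrupted, shutting down...", file=sys.stderr)
        return 130  # conventional 128 + SIGINT(2)
    finally:
        # Cleanup runs on both the normal and the interrupted path.
        shutdown()
    return 0
```

A real entry point would typically end with `sys.exit(run_cli(...))` so the shell sees the exit code.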

Admin UI and macOS UI

  • Downloads now include a model card sheet with metadata, files, and tags. by @popfido
  • Local Models sorting is now case-insensitive ascending. by @MwC-Trexx
  • Active Models layout works better on narrow screens. by @samfenwick
  • Model settings table headers are aligned. by @ilukashin
  • Server/app settings apply behavior and live port display were cleaned up. by @popfido
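
For readers unfamiliar with the term, case-insensitive ascending sorting can be sketched in a few lines of Python. This is only an illustration of the ordering, not the admin UI code; `sort_model_names` and the model names are made up for the example.

```python
def sort_model_names(names):
    """Return names in ascending order, ignoring letter case (hypothetical helper)."""
    # str.casefold normalizes case for comparison; the original strings are kept.
    return sorted(names, key=str.casefold)


models = ["qwen3-8b", "Llama-3.1-8B", "gemma-2-9b", "Mistral-7B"]
print(sort_model_names(models))
# ['gemma-2-9b', 'Llama-3.1-8B', 'Mistral-7B', 'qwen3-8b']
```

A plain `sorted(models)` would instead place every name starting with an uppercase letter before the lowercase ones.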

Packaging, CI, and tests

  • The venvstacks driver is pinned/detected more reproducibly. by @popfido
  • The mlx-framework venvstacks layer was renamed to mlx-base. by @popfido
  • CI workflow and broader unit-test coverage were added. by @Mearman, @cfbraun, @fry69
  • Python 3.14 was added to the CI matrix. by @fry69
  • paroquant dev dependency was bumped to 0.1.15.

New Contributors

Thank you to everyone making their first contribution in this release:

@cfbraun, @chenqianhe, @jcalvert, @MwC-Trexx, @azhangd, @scubamount, @sdiamanEXUS, @ilukashin, @tylerliu, @MrNiceRicee, @lx1229.
