github jundot/omlx v0.4.0rc2

5 hours ago

0.4.0rc2 is the second release candidate for the native Swift macOS app. The old PyObjC menubar app has been retired, and the macOS bundle now ships as a Swift app with a redesigned onboarding flow, settings UI, status surfaces, model management, and a GitHub Releases-based updater.

This Swift transition was driven by excellent work from @popfido, with follow-up polish and release-path fixes folded in after the initial merge. Thank you for the huge amount of thoughtful work here — this is the biggest user-facing macOS change oMLX has shipped so far, and it substantially raises the quality of the desktop app.

[Screenshots: oMLX 0.4.0rc2 in dark mode and in light mode]

Highlights

  • Native Swift macOS app. The old PyObjC menubar app has been replaced by a native Swift/SwiftUI app, with new onboarding, settings, status, model management, downloads, integrations, and update flows. by @popfido
  • Improved menubar and app status. Live port/status updates, StatusKit fixes, version display, and cleaner running-state behavior. by @popfido
  • Browser chat UI received a major usability overhaul and follow-up message/action fixes. by @beamivalice
  • xgrammar is bundled into the venvstacks export with the no-torch stub path. by @cfbraun
  • Memory guard tuning relaxed throttle/eviction thresholds and improved Custom tier behavior.

Changes since 0.4.0rc1

  • Model directory management in the macOS app. The Swift app now has model directory management so users can adjust storage paths directly from the app surface.
  • macOS update flow fixes. The update path was tightened after rc1, including preserving canonical host settings, improving the update flow, restoring the CLI launch shim, and signing the macOS CLI wrapper.
  • Light mode settings contrast restored. Settings screens are readable again in the light appearance.
  • Wildcard bind addresses are now normalized for client connections. 0.0.0.0-style bind addresses are normalized to a usable local client address. by @monroewilliams
  • Tool call function names are normalized without weakening validation. Function names are trimmed while preserving the expected type validation behavior. by @palvaleri
  • Top-level imports are lazy-loaded. Heavy top-level omlx imports are deferred to improve startup compatibility, including NumPy 2.x environments. by @fparrav
  • Engine stop yields back to the event loop. The server now yields after engine stop so shutdown/restart paths do not monopolize the event loop. by @fqx
  • Admin custom-tier memory text was synced with server behavior. The displayed reserve/comment now matches the actual Custom tier behavior.
  • Formula automation was corrected. The formula URL substitution workflow was fixed for release automation.
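
As an illustration of the wildcard bind normalization listed above, here is a minimal sketch, not the oMLX implementation. The function name `client_host_for_bind` and the choice of loopback targets (`127.0.0.1` for IPv4, `::1` for IPv6) are assumptions made for this example; the release notes only say that wildcard addresses are mapped to "a usable local client address".

```python
import ipaddress


def client_host_for_bind(bind_host: str) -> str:
    """Hypothetical helper: map a wildcard bind address to a dialable one.

    A server can listen on 0.0.0.0 or ::, but a client cannot reliably
    connect to those addresses, so they are swapped for loopback here.
    """
    host = bind_host.strip()
    # Accept the bracketed IPv6 form used in URLs, e.g. "[::]".
    if host.startswith("[") and host.endswith("]"):
        host = host[1:-1]
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal (e.g. "localhost"); leave it unchanged.
        return host
    if addr.is_unspecified:
        # Assumption: loopback is the right client target on this machine.
        return "127.0.0.1" if addr.version == 4 else "::1"
    return host


if __name__ == "__main__":
    for candidate in ("0.0.0.0", "::", "192.168.1.5", "localhost"):
        print(candidate, "->", client_host_for_bind(candidate))
```

Specific addresses and hostnames pass through untouched, so only the unspecified (wildcard) case changes behavior.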

Runtime, cache, and scheduler

  • Per-engine MLX threads eliminate cross-engine stream contamination. by @ivaniguarans
  • Store-cache and boundary snapshot paths now materialize lazy arrays on the owning thread before async byte extraction. by @aeyeopsdev
  • Boundary snapshot cleanup races and stale snapshot handling were fixed. by @cfbraun
  • Predictive prefill throttling and reclaim/requeue behavior reduce mid-stream OOM failures. by @sdiamanEXUS
  • Paged cache references are released correctly on preflight/prefill rejection paths. by @cfbraun
  • VLM, SpecPrefill, and draft-model lazy state is materialized on loader threads to avoid stream errors. by @cfbraun
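
The thread-ownership idea behind several items above (per-engine threads, and materializing data on the owning thread before async code touches it) can be sketched with the standard library alone. This is an assumption-laden illustration, not oMLX code: `EngineThread` and `snapshot_bytes` are hypothetical names, and plain `bytes` stands in for MLX arrays.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor


class EngineThread:
    """Hypothetical wrapper: one dedicated worker thread per engine."""

    def __init__(self, name: str) -> None:
        # max_workers=1 guarantees every call for this engine runs on the
        # same thread, so per-thread state is never shared across engines.
        self._executor = ThreadPoolExecutor(max_workers=1, thread_name_prefix=name)

    async def run(self, fn, *args):
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(self._executor, fn, *args)

    def close(self) -> None:
        self._executor.shutdown(wait=True)


def snapshot_bytes(values):
    # Stand-in for "materialize on the owning thread": the conversion to
    # concrete bytes happens here, before the result reaches async code.
    return threading.current_thread().name, bytes(values)


async def main():
    a, b = EngineThread("engine-a"), EngineThread("engine-b")
    try:
        return await asyncio.gather(
            a.run(snapshot_bytes, [1, 2, 3]),
            b.run(snapshot_bytes, [4, 5]),
        )
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Each result carries the name of the thread that produced it, which makes it easy to confirm that work for one engine never ran on another engine's thread.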

MTP, oQ, TurboQuant, and model compatibility

  • Safe row-wise MTP decoding is enabled for aligned batches, with fallback for unsafe late-join batches.
  • Qwen3.6 MXFP4 mixed norm conventions and MTP preservation are handled more safely. by @scubamount
  • TurboQuant now supports batched KV-cache compression and fixes batch merge edge cases. by @popfido
  • The DFlash/MTP transition now restores Qwen GQA attention hooks.
  • LFM text MoE model discovery is classified correctly as LLM instead of mlx-audio STS. by @samfenwick

API and integrations

  • Guided grammar is now exposed as a model setting and maps into the existing structured-output grammar path. by @MrNiceRicee
  • Anthropic cache-control accounting and model context length reporting were fixed. by @richgoodson
  • Claude Code compatibility was updated for newer request behavior. by @lx1229
  • CLI shutdown handles KeyboardInterrupt cleanly. by @fry69
  • Integration launch context was unified across external tool integrations.
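
For the CLI shutdown item above, one common pattern is to catch `KeyboardInterrupt` at the entry point and return an exit code instead of printing a traceback. The sketch below is generic Python, not the oMLX CLI: `run_cli` is a hypothetical name, and the exit code 130 (128 plus the SIGINT signal number) is a shell convention assumed here, not something the release notes specify.

```python
import sys


def run_cli(serve) -> int:
    """Hypothetical entry point: run `serve` and exit quietly on Ctrl-C."""
    try:
        serve()
    except KeyboardInterrupt:
        # Report the interruption on stderr without a traceback.
        print("Interrupted, shutting down.", file=sys.stderr)
        return 130  # assumed convention: 128 + SIGINT (2)
    return 0


if __name__ == "__main__":
    def fake_server():
        # Stand-in for a blocking server loop interrupted by the user.
        raise KeyboardInterrupt

    sys.exit(run_cli(fake_server))
```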

Admin UI and macOS UI

  • Downloads now include a model card sheet with metadata, files, and tags. by @popfido
  • Local Models sorting is now case-insensitive ascending. by @MwC-Trexx
  • Active Models layout works better on narrow screens. by @samfenwick
  • Model settings table headers are aligned. by @ilukashin
  • Server/app settings apply behavior and live port display were cleaned up. by @popfido
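
The case-insensitive ascending sort mentioned for Local Models can be illustrated in a few lines. This is a sketch of the general technique only; `sort_model_names` and the sample model names are made up for the example, and the actual UI code may sort differently.

```python
def sort_model_names(names):
    """Hypothetical helper: sort ascending, ignoring letter case."""
    # str.casefold is a more aggressive lower() intended for caseless matching.
    return sorted(names, key=str.casefold)


if __name__ == "__main__":
    names = ["qwen3", "Llama-3", "gemma", "Mistral"]
    # Default sort puts uppercase first: Llama-3, Mistral, gemma, qwen3
    print(sorted(names))
    # Case-insensitive sort: gemma, Llama-3, Mistral, qwen3
    print(sort_model_names(names))
```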

Packaging, CI, and tests

  • The venvstacks driver is pinned/detected more reproducibly. by @popfido
  • The mlx-framework venvstacks layer was renamed to mlx-base. by @popfido
  • A CI workflow and broader unit-test coverage were added. by @Mearman, @cfbraun, @fry69
  • Python 3.14 was added to the CI matrix. by @fry69
  • The paroquant dev dependency was bumped to 0.1.15.

New Contributors

Thank you to everyone making their first contribution in this release:

@cfbraun, @chenqianhe, @jcalvert, @MwC-Trexx, @azhangd, @scubamount, @sdiamanEXUS, @ilukashin, @tylerliu, @MrNiceRicee, @lx1229, @palvaleri, @monroewilliams.
