jundot/omlx v0.4.0 on GitHub

0.4.0 is the first official release of the native Swift macOS app. The old PyObjC menubar app has been retired, and the macOS bundle now ships as a Swift app with a redesigned onboarding flow, settings UI, status surfaces, model management, and a GitHub Releases-based updater.

This Swift transition was driven by excellent work from @popfido, with follow-up polish and release-path fixes folded in after the initial merge. Thank you for the huge amount of thoughtful work here — this is the biggest user-facing macOS change oMLX has shipped so far, and it substantially raises the quality of the desktop app.

Highlights

Native Swift macOS app. The old PyObjC menubar app has been replaced by a native Swift/SwiftUI app, with new onboarding, settings, status, model management, downloads, integrations, and update flows. by @popfido
Improved menubar and app status. Live port/status updates, StatusKit fixes, version display, supervised-server handling, and cleaner running-state behavior. by @popfido
Standard Hugging Face cache model directory support. oMLX can now discover models from the standard Hugging Face cache location, with controls for toggling HF cache discovery and managing local model directories.
Safer update flow. App updates now honor the selected update channel and require confirmation before download.
Browser chat UI received a major usability overhaul and follow-up message/action fixes. by @beamivalice
xgrammar is bundled into the venvstacks export with the no-torch stub path. by @cfbraun
Memory guard tuning. Throttle/eviction thresholds were relaxed, Custom tier behavior was improved, and CLI options for memory guard configuration were added.
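As a rough illustration of what discovery from the standard Hugging Face cache involves, the sketch below walks the hub cache layout (`models--<org>--<name>/snapshots/<revision>/`) using only the standard library. It is not oMLX's implementation: `default_hf_hub_cache` and `discover_cached_models` are hypothetical names, the `XDG_CACHE_HOME` fallback is ignored for brevity, and picking the most recently modified snapshot is an assumption.

```python
import os
from pathlib import Path


def default_hf_hub_cache() -> Path:
    """Resolve the hub cache directory from HF_HUB_CACHE or HF_HOME (hypothetical helper)."""
    if os.environ.get("HF_HUB_CACHE"):
        return Path(os.environ["HF_HUB_CACHE"]).expanduser()
    hf_home = os.environ.get("HF_HOME") or "~/.cache/huggingface"
    return Path(hf_home).expanduser() / "hub"


def discover_cached_models(cache_dir: Path) -> dict[str, Path]:
    """Map repo ids such as "org/name" to one snapshot directory each."""
    found: dict[str, Path] = {}
    if not cache_dir.is_dir():
        return found
    for repo_dir in sorted(cache_dir.glob("models--*")):
        snapshots = repo_dir / "snapshots"
        if not snapshots.is_dir():
            continue  # incomplete or partially deleted cache entry
        revisions = [p for p in snapshots.iterdir() if p.is_dir()]
        if not revisions:
            continue
        # Cache folder names join the repo type, org, and name with "--".
        repo_id = repo_dir.name[len("models--"):].replace("--", "/")
        # Assumption: prefer the most recently modified snapshot.
        found[repo_id] = max(revisions, key=lambda p: p.stat().st_mtime)
    return found
```

Calling `discover_cached_models(default_hf_hub_cache())` returns an empty dict when the cache directory does not exist, so a caller could treat HF cache discovery as optional.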

Runtime, cache, and scheduler

Per-engine MLX threads eliminate cross-engine stream contamination. by @ivaniguarans
Store-cache and boundary snapshot paths now materialize lazy arrays on the owning thread before async byte extraction. by @aeyeopsdev
Boundary snapshot cleanup races and stale snapshot handling were fixed. by @cfbraun
Predictive prefill throttling and reclaim/requeue behavior reduce mid-stream OOM failures. by @sdiamanEXUS
Paged cache references are released correctly on preflight/prefill rejection paths. by @cfbraun
Paged cache now disables itself cleanly when SSD initialization fails instead of breaking startup. by @lvsijian8
VLM, SpecPrefill, and draft-model lazy state is materialized on loader threads to avoid stream errors. by @cfbraun
Engine stop now yields back to the event loop so shutdown/restart paths do not monopolize the loop. by @fqx
Unreadable model directories are handled during startup instead of aborting discovery.
DMG builds now preserve engine commit metadata.
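Several of the fixes above share one idea: work that belongs to an engine should run on the thread that owns it. A minimal sketch of that pattern, assuming a single dedicated worker thread per engine, is shown below. It uses only the standard library and no MLX calls; `EngineWorker` and `current_thread_name` are hypothetical names, not oMLX classes.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class EngineWorker:
    """Hypothetical wrapper: runs every call for one engine on one dedicated thread."""

    def __init__(self, name: str) -> None:
        # max_workers=1 means all submitted work shares a single thread.
        self._executor = ThreadPoolExecutor(
            max_workers=1, thread_name_prefix=f"engine-{name}"
        )

    def run(self, fn, *args, **kwargs):
        """Run fn on this engine's thread and block until it returns."""
        return self._executor.submit(fn, *args, **kwargs).result()

    def close(self) -> None:
        """Wait for pending work, then stop the engine's thread."""
        self._executor.shutdown(wait=True)


def current_thread_name() -> str:
    """Report which thread the caller is running on."""
    return threading.current_thread().name
```

Two workers created this way never share a thread, so any thread-bound state created by one engine's calls stays invisible to the other.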

MTP, oQ, TurboQuant, and model compatibility

Safe row-wise MTP decoding is enabled for aligned batches, with fallback for unsafe late-join batches.
Qwen3.6 MXFP4 mixed norm conventions and MTP preservation are handled more safely. by @scubamount
TurboQuant now supports batched KV-cache compression and fixes batch merge edge cases. by @popfido
DFlash/MTP transition restores Qwen GQA attention hooks.
LFM text MoE models are now classified correctly as LLM during model discovery instead of as mlx-audio STS. by @samfenwick
Step 3.7 Flash support is patched through the mlx-lm compatibility path.

API and integrations

Guided grammar is now exposed as a model setting and maps into the existing structured-output grammar path. by @MrNiceRicee
Anthropic cache-control accounting and model context length reporting were fixed. by @richgoodson
tool_choice: "none" is respected for MCP tools. by @lvsijian8
Tool call function names are trimmed while preserving type validation behavior. by @palvaleri
Wildcard bind addresses such as 0.0.0.0 are normalized to usable local client addresses. by @monroewilliams
Top-level omlx imports are lazy-loaded to improve startup compatibility, including NumPy 2.x environments. by @fparrav
Claude Code compatibility was updated for newer request behavior. by @lx1229
CLI shutdown handles KeyboardInterrupt cleanly. by @fry69
Integration launch context was unified across external tool integrations.
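The wildcard bind address item is easiest to see with an example: `0.0.0.0` is valid for a server to listen on, but a client needs a concrete address to connect to. The sketch below shows one way to do that mapping with the standard `ipaddress` module. The function name `client_host` and the choice of loopback as the target are assumptions, not the actual oMLX code.

```python
import ipaddress


def client_host(bind_host: str) -> str:
    """Return an address a local client can connect to for a given bind address."""
    host = bind_host.strip()
    if host in ("", "*"):
        return "127.0.0.1"
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return host  # a hostname such as "localhost"; leave it unchanged
    if addr.is_unspecified:
        # 0.0.0.0 and :: mean "all interfaces"; use loopback for local clients.
        return "127.0.0.1" if addr.version == 4 else "::1"
    return host
```

With this mapping, a server bound to `0.0.0.0` on some port would be advertised to local integrations as `127.0.0.1` on the same port, while explicit addresses and hostnames pass through untouched.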

Admin UI and macOS UI

Downloads now include a model card sheet with metadata, files, and tags. by @popfido
Local Models sorting is now case-insensitive ascending. by @MwC-Trexx
SwiftUI model lists now also sort case-insensitively.
Active Models layout works better on narrow screens. by @samfenwick
Model settings table headers are aligned. by @ilukashin
Server/app settings apply behavior and live port display were cleaned up. by @popfido
Light mode settings contrast was restored.
Mac app CLI launch shim and CLI wrapper signing were restored.
Admin custom-tier memory text is synced with server behavior.
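For the case-insensitive sorting items above, the difference from a plain sort is that uppercase names no longer group ahead of lowercase ones. A minimal sketch of the idea follows; `sort_model_names` is a hypothetical helper, and the tie-break on original spelling is an assumption.

```python
def sort_model_names(names: list[str]) -> list[str]:
    """Sort ascending, ignoring case; ties fall back to the original spelling."""
    return sorted(names, key=lambda n: (n.casefold(), n))
```

A plain `sorted()` on `["qwen3", "Llama-3", "gemma", "Qwen2"]` places `Llama-3` and `Qwen2` first because uppercase letters sort before lowercase ones, whereas the case-insensitive key orders the list as `gemma`, `Llama-3`, `Qwen2`, `qwen3`.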

Packaging, CI, and tests

The venvstacks driver is pinned/detected more reproducibly. by @popfido
The mlx-framework venvstacks layer was renamed to mlx-base. by @popfido
A CI workflow and broader unit-test coverage were added. by @Mearman, @cfbraun, @fry69
Python 3.14 was added to the CI matrix. by @fry69
Formula automation and release URL substitution were corrected.
The paroquant dev dependency was bumped to 0.1.15.

New Contributors

Thank you to everyone making their first contribution in this release:

@cfbraun, @chenqianhe, @jcalvert, @MwC-Trexx, @azhangd, @scubamount, @sdiamanEXUS, @ilukashin, @tylerliu, @MrNiceRicee, @lx1229, @palvaleri, @monroewilliams.