jundot/omlx v0.2.11


Download the DMG that matches your macOS version (Sequoia or Tahoe).
If you're on an M5 Mac, you must use the macos26-tahoe DMG to enable the M5 Neural Accelerator.

Critical Bug Fixes

Agent session stalls in long tool-calling sessions (#205)

  • fixed OpenCode and Claude Code losing tool call structure over multiple rounds, causing the model to silently stop generating
  • fixed Codex sessions losing conversation history after a few rounds due to incomplete previous_response_id chain restoration
  • added persistent response state storage so Codex sessions survive server restarts
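The persistent response state can be pictured as a small on-disk store keyed by response id, with each entry linking back through `previous_response_id`. The sketch below is illustrative only, assuming a JSON file and hypothetical class/field names; it is not omlx's actual implementation.

```python
import json
import os
import tempfile

class ResponseStore:
    """Hypothetical store mapping response ids to conversation snapshots."""

    def __init__(self, path):
        self.path = path
        self._state = {}
        if os.path.exists(path):
            with open(path) as f:
                self._state = json.load(f)  # survive a server restart

    def save(self, response_id, previous_response_id, messages):
        self._state[response_id] = {
            "previous_response_id": previous_response_id,
            "messages": messages,
        }
        # Write atomically so a crash mid-write doesn't corrupt the store.
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(self.path) or ".")
        with os.fdopen(fd, "w") as f:
            json.dump(self._state, f)
        os.replace(tmp, self.path)

    def restore_chain(self, response_id):
        """Walk previous_response_id links back to the start of the session."""
        messages = []
        while response_id is not None:
            entry = self._state.get(response_id)
            if entry is None:
                break
            messages = entry["messages"] + messages
            response_id = entry["previous_response_id"]
        return messages
```

Restoring the full chain rather than only the latest entry is what keeps a multi-round Codex session's history intact.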

Multiple model directories lost on server restart

  • fixed the macOS app overwriting multi-directory settings on every launch, keeping only the first directory

New Features

Model fallback to default (#207)

  • added model_fallback setting. When enabled, if a client requests a model that isn't available, the server falls back to the default model instead of returning an error. Useful for setups where clients hardcode model names.
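The resolution logic might look like the following minimal sketch; the function signature is an assumption for illustration, only the `model_fallback` setting name comes from the release notes.

```python
def resolve_model(requested, available, default, model_fallback=False):
    """Pick the model to serve for a request (illustrative, not omlx's API)."""
    if requested in available:
        return requested
    if model_fallback:
        # Serve the default instead of returning an error, so clients
        # with hardcoded model names keep working.
        return default
    raise ValueError(f"model {requested!r} not found")
```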

Status endpoint for statusline integration (#163)

  • added GET /api/status endpoint returning server state, loaded model info, and resource usage in a compact format. Designed for editor statusline plugins and monitoring scripts.
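A statusline plugin would poll the endpoint and render a one-line summary. The payload shape below is a guess for illustration; the actual field names may differ.

```python
import json

# Hypothetical /api/status response; real field names may differ.
sample = json.loads("""
{
  "state": "ready",
  "model": {"name": "qwen3-8b", "context_length": 131072},
  "memory_gb": 9.4
}
""")

def statusline(status):
    """Render a compact one-line summary for an editor statusline."""
    model = status.get("model") or {}
    return f"{status['state']} | {model.get('name', '-')} | {status.get('memory_gb', 0):.1f} GB"
```

In practice the payload would come from `GET /api/status` over HTTP rather than a literal string.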

Benchmark prompt length options

  • added 131072 and 200000 token prompt lengths to the benchmark tool for testing long-context performance.

Build info and OS-aware update links

  • added build number display in the about dialog with macOS codename tagging
  • added OS-aware DMG selection in the auto-update flow so Sequoia and Tahoe users get the right build automatically

Bug Fixes

Missing tiktoken dependency in bundled app (#213)

  • fixed models requiring tiktoken (e.g. certain Qwen variants) failing to load in the DMG build because tiktoken wasn't included in the bundled Python environment.

System messages breaking strict chat templates

  • fixed multiple system/developer messages scattered through conversation history causing failures with models that expect a single system message at the front. System messages are now consolidated.
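Consolidation can be sketched as merging all system/developer content into one leading system message, which is what strict single-system-message templates expect. This is an illustrative version, not omlx's actual code.

```python
def consolidate_system(messages):
    """Merge scattered system/developer messages into one leading system message."""
    system_parts = [m["content"] for m in messages
                    if m["role"] in ("system", "developer")]
    rest = [m for m in messages
            if m["role"] not in ("system", "developer")]
    if not system_parts:
        return rest
    merged = {"role": "system", "content": "\n\n".join(system_parts)}
    return [merged] + rest
```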

Usage response missing standard token count fields (#194)

  • fixed input_tokens and output_tokens aliases missing from the usage response object. Clients like OpenClaw that expect these fields (instead of prompt_tokens/completion_tokens) can now track context usage properly.
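The fix amounts to emitting both naming conventions in the usage object, so clients expecting either pair find what they need. A minimal sketch, with a hypothetical helper name:

```python
def build_usage(prompt_tokens, completion_tokens):
    """Build a usage object carrying both field-name conventions."""
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
        # Aliases for clients that expect Responses-style names.
        "input_tokens": prompt_tokens,
        "output_tokens": completion_tokens,
    }
```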

full changelog: v0.2.10...v0.2.11
