Download the DMG that matches your macOS version (sequoia or tahoe).
If you're on an M5 Mac, you must use themacos26-tahoeDMG for M5 Neural Accelerator.
Critical Bug Fixes
Agent session stalls in long tool-calling sessions (#205)
- fixed OpenCode and Claude Code losing tool call structure over multiple rounds, causing the model to silently stop generating
- fixed Codex sessions losing conversation history after a few rounds due to incomplete
previous_response_idchain restoration - added persistent response state storage so Codex sessions survive server restarts
Multiple model directories lost on server restart
- fixed the macOS app overwriting multi-directory settings on every launch, keeping only the first directory
New Features
Model fallback to default (#207)
- added
model_fallbacksetting. When enabled, if a client requests a model that isn't available, the server falls back to the default model instead of returning an error. Useful for setups where clients hardcode model names.
Status endpoint for statusline integration (#163)
- added
GET /api/statusendpoint returning server state, loaded model info, and resource usage in a compact format. Designed for editor statusline plugins and monitoring scripts.
Benchmark prompt length options
- added 131072 and 200000 token prompt lengths to the benchmark tool for testing long-context performance.
Build info and OS-aware update links
- added build number display in the about dialog with macOS codename tagging
- added OS-aware DMG selection in the auto-update flow so sequoia and tahoe users get the right build automatically
Bug Fixes
Missing tiktoken dependency in bundled app (#213)
- fixed models requiring tiktoken (e.g. certain Qwen variants) failing to load in the DMG build because tiktoken wasn't included in the bundled python environment.
System messages breaking strict chat templates
- fixed multiple system/developer messages scattered through conversation history causing failures with models that expect a single system message at the front. System messages are now consolidated.
Usage response missing standard token count fields (#194)
- fixed
input_tokensandoutput_tokensaliases missing from the usage response object. Clients like OpenClaw that expect these fields (instead ofprompt_tokens/completion_tokens) can now track context usage properly.
full changelog: v0.2.10...v0.2.11