This is the 0.3.10 release, a follow-up to 0.3.9 focused on stability and post-release bug fixes. For the new-feature lineup, see the 0.3.9 release notes. Thanks to everyone who filed issues and sent fixes since 0.3.9 shipped. If you hit a bug, please open an issue.
Bug Fixes
- OOM under sustained load: `OMLX_MAX_PROCESS_MEMORY` wasn't actually being enforced on batched / VLM engines, and finished requests piled up KV caches in an unbounded SSD-write queue. The two together could push memory past the cap and get the server killed. Both are fixed, and inflight load now scales with memory pressure (#1383).
- OpenClaw / Codex getting empty replies: tag-free output from non-reasoning Qwen / Llama models was misclassified as thinking and dropped. It's now treated as content (#1348).
- Native MTP crash on Qwen3.6 / Qwopus3.6 derivatives: MTP-quantized variants crashed with `speculative_call() got unexpected keyword argument 'n_confirmed'` after a dflash hot-swap. MTP patches now self-heal on every engine start and the dflash hook stays wrapped for the session (#1388).
- Late aborts during engine teardown: cancelling right as the engine unloaded printed a traceback and could stick a request slot. Now absorbed cleanly (#1389, thanks @glasses666).
- DFlash dropping images from multimodal requests: content was flattened before the image check, so image + text requests went down the text-only path. Multimodal content is now detected first and routed to the VLM fallback (#1344, thanks @ivaniguarans).
- `omlx launch` breaking Codex on the macOS DMG: the bundled-Python env vars (`PYTHONHOME`/`PYTHONPATH`) leaked into the launched agent and confused its venv. Env scrub now runs for every launch target, not just `claude` (#1350).
- oQ-quantized VLMs loading as text-only: `processor_config.json` wasn't being copied to the output, so the artifact lost its vision capability. Now copied through (#1386, thanks @a4501150).
- Tool-calling fixes from @Mearman: `tools: []` was treated as `tools: None`, ignoring clients that explicitly disable tools. And the thinking-model tool-call extractor dropped real tool calls when the model added a note after `</think>`, because the filter was text-shape based instead of name-matching (#1392, #1393).
- Anthropic streaming tool-call index off: when a streaming response had thinking followed by a tool call, the tool block's `index` was wrong and broke client-side assembly. Indices are now sequential (#1356, thanks @lvsijian8).
- HF download cancel not stopping Xet repos: cancel was a no-op on Xet-backed downloads. Xet is now disabled in the downloader so cancel works again.
- ModelScope recommended cards missing params / size: the recommended row showed those fields for HF cards but not ModelScope, so they couldn't be compared. Both fields are now fetched for ModelScope too (#1351, thanks @popfido).
- Dashboard charts filling with empty points after idle: idle dashboards kept appending null points to the timeline. Series now expire after an idle TTL (#1349, thanks @imi4u36d).
- Browse Models name column too narrow: long names were chopped with no way to see the rest. The column is now wider, with a hover tooltip for the full name (#1369).
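
For anyone hitting the bundled-Python leak in their own launcher, the env scrub described in the #1350 entry can be pictured roughly like this. It is a minimal sketch, not the project's code: `scrubbed_env` and `BUNDLED_PYTHON_VARS` are hypothetical names, and only the two variables named in the entry are assumed to need removal.

```python
import os

# Variables named in the release note; the real list may differ (assumption).
BUNDLED_PYTHON_VARS = ("PYTHONHOME", "PYTHONPATH")


def scrubbed_env(base=None):
    """Return a copy of the environment without bundled-Python variables.

    `base` defaults to the current process environment. The input mapping
    is never modified.
    """
    env = dict(os.environ if base is None else base)
    for name in BUNDLED_PYTHON_VARS:
        env.pop(name, None)  # ignore variables that are not set
    return env
```

The result can be passed as `env=` to `subprocess.run` so the child process resolves its own interpreter and venv.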
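
The `tools: []` versus `tools: None` distinction from #1392 comes down to not using truthiness on the field. A minimal sketch, assuming a hypothetical `resolve_tools` helper and a hypothetical server-side default list (neither is taken from the project):

```python
from typing import Optional


def resolve_tools(requested: Optional[list], defaults: list) -> list:
    """Pick the tool list for one request.

    None means the client omitted `tools`, so the (hypothetical) defaults
    apply. An empty list means the client explicitly disabled tools and
    must be respected. Writing `if not requested:` would merge the two
    cases, which is the class of bug described above.
    """
    if requested is None:
        return list(defaults)
    return list(requested)
```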
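
For the Anthropic streaming fix (#1356), "sequential" means each content block gets the next index in the order it opens, whatever its type. The sketch below is illustrative only: `BlockIndexer` is a hypothetical name, and the event is trimmed to the fields relevant here (a real `tool_use` block also carries fields such as an id and a name).

```python
class BlockIndexer:
    """Hands out content-block indices in the order blocks are opened."""

    def __init__(self):
        self._next = 0

    def open_block(self, block_type: str) -> dict:
        # One shared counter for all block types, so a tool call that
        # follows a thinking block gets index 1 rather than restarting at 0.
        event = {
            "type": "content_block_start",
            "index": self._next,
            "content_block": {"type": block_type},
        }
        self._next += 1
        return event
```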
New Contributors
Thank you to everyone making their first contribution in 0.3.10: