jundot/omlx v0.3.5.dev1 on GitHub

Dev release with Gemma 4 native tool calling, UI improvements, and several bug fixes. This is a dev build and may contain bugs. If you run into any issues, please open an issue.

Highlights

Gemma 4 native tool calling

Bumped mlx-lm to dcbf6e3 and mlx-vlm to 23e1dff. mlx-lm now ships a native Gemma 4 tool call parser (<|tool_call> / <tool_call|>) and multi-token think/tool token support. omlx's existing parse_tool_calls() picks up the new parser automatically — no server-side special-casing needed.

Removed the Gemma 4 multi-image vision monkey patch since mlx-vlm handles different-sized images natively now. The batched decode patch stays for now (mlx-vlm still uses cache.state based KV sharing).

Auto theme mode

System appearance sync with auto/light/dark theme picker in the admin UI. by @Stv-X (#621, #624)

New Features

Reproducible generation via seed parameter (#640)
6-bit and 8-bit TurboQuant options (#594)
Skip admin auth when skip_api_key_verification is enabled on localhost (#587)
Unify max_num_seqs and completion_batch_size into max_concurrent_requests

Bug Fixes

Fix TurboQuant SSD cache reconstruction crash (#577)
Fix oQ: skip quantizing modules without to_quantized() (#625)
Fix oQ: unwrap tuple layer outputs in sensitivity measurement (#627)
Fix dark mode heading visibility and mobile UX in chat page (#586)
Fix navbar theme picker width clipping
Fix Homebrew formula sha256 for v0.3.4 (#589)
Fix trigger formula update on release publish instead of tag push
Bundle spacy en_core_web_sm for Kokoro TTS in DMG build (#590)
Add missing TTS/STT/STS deps to audio extra (#590)

Dependencies

Bump mlx-lm to dcbf6e3 (Gemma 4 tool call parser, multi-token think/tool)
Bump mlx-vlm to 23e1dff (Gemma 4 multi-image fix, nested tool parser)
Add regex dependency

New Contributors

@Stv-X — Auto theme mode and settings UI (#621, #624)

Full changelog: v0.3.4...v0.3.5.dev1