github jundot/omlx v0.3.5.dev1

6 hours ago

Dev release with Gemma 4 native tool calling, UI improvements, and several bug fixes. This is a dev build and may contain bugs. If you run into any issues, please open an issue.

Highlights

Gemma 4 native tool calling

Bumped mlx-lm to dcbf6e3 and mlx-vlm to 23e1dff. mlx-lm now ships a native Gemma 4 tool call parser (<|tool_call> / <tool_call|>) and multi-token think/tool token support. omlx's existing parse_tool_calls() picks up the new parser automatically — no server-side special-casing needed.

Removed the Gemma 4 multi-image vision monkey patch since mlx-vlm handles different-sized images natively now. The batched decode patch stays for now (mlx-vlm still uses cache.state based KV sharing).

Auto theme mode

System appearance sync with auto/light/dark theme picker in the admin UI. by @Stv-X (#621, #624)

New Features

  • Reproducible generation via seed parameter (#640)
  • 6-bit and 8-bit TurboQuant options (#594)
  • Skip admin auth when skip_api_key_verification is enabled on localhost (#587)
  • Unify max_num_seqs and completion_batch_size into max_concurrent_requests

Bug Fixes

  • Fix TurboQuant SSD cache reconstruction crash (#577)
  • Fix oQ: skip quantizing modules without to_quantized() (#625)
  • Fix oQ: unwrap tuple layer outputs in sensitivity measurement (#627)
  • Fix dark mode heading visibility and mobile UX in chat page (#586)
  • Fix navbar theme picker width clipping
  • Fix Homebrew formula sha256 for v0.3.4 (#589)
  • Fix trigger formula update on release publish instead of tag push
  • Bundle spacy en_core_web_sm for Kokoro TTS in DMG build (#590)
  • Add missing TTS/STT/STS deps to audio extra (#590)

Dependencies

  • Bump mlx-lm to dcbf6e3 (Gemma 4 tool call parser, multi-token think/tool)
  • Bump mlx-vlm to 23e1dff (Gemma 4 multi-image fix, nested tool parser)
  • Add regex dependency

New Contributors

Full changelog: v0.3.4...v0.3.5.dev1

Don't miss a new omlx release

NewReleases is sending notifications on new releases.