oMLX v0.1.1 (jundot/omlx on GitHub)

one month ago

What's New

Two-level model directory scanning (#1)

Support for organization folder layouts (e.g., mlx-community/llama-3b/)
Flat and two-level directories can coexist in the same model directory
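As an illustration of how flat and organization-style layouts might be discovered side by side, here is a minimal sketch. It is not oMLX's actual scanner: the function name `scan_model_dirs` and the rule that a folder counts as a model when it contains `config.json` are assumptions made for the example.

```python
from pathlib import Path


def scan_model_dirs(root, marker="config.json"):
    """Return model IDs found one or two levels below `root`.

    Assumption: a directory is a model if it contains `marker`.
    """
    root = Path(root)
    found = []
    for child in sorted(p for p in root.iterdir() if p.is_dir()):
        if (child / marker).is_file():
            # Flat layout: <root>/<model>/
            found.append(child.name)
            continue
        # Otherwise treat `child` as an organization folder:
        # <root>/<org>/<model>/
        for sub in sorted(p for p in child.iterdir() if p.is_dir()):
            if (sub / marker).is_file():
                found.append(f"{child.name}/{sub.name}")
    return found
```

With this rule, a directory holding both `flat-model/` and `mlx-community/llama-3b/` would yield both entries, and folders without the marker file are skipped.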

Streaming tool call parsing (#2)

Stream tool calls in OpenAI-compatible format
XML fallback parser for GLM/Qwen/Llama models without native tool call support
Content buffering prevents duplicate tool call output
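The idea of a buffered fallback parser can be sketched as follows. This is a simplified illustration, not the parser shipped in this release: the `<tool_call>…</tool_call>` markup with a JSON body, the class name `ToolCallStreamParser`, and the `call_N` IDs are all assumptions. The emitted dictionary follows the general shape of an OpenAI-style streamed tool call entry (index, id, type, function name, arguments as a JSON string).

```python
import json

# Assumed markup for models that emit tool calls as tagged text.
OPEN_TAG, CLOSE_TAG = "<tool_call>", "</tool_call>"


def _held_back(text, tag):
    """Length of the longest tail of `text` that is a proper prefix of `tag`."""
    for k in range(min(len(text), len(tag) - 1), 0, -1):
        if text.endswith(tag[:k]):
            return k
    return 0


class ToolCallStreamParser:
    def __init__(self):
        self.buffer = ""
        self.index = 0

    def feed(self, chunk):
        """Return ("content", str) and ("tool_call", dict) events for a chunk."""
        self.buffer += chunk
        events = []
        while True:
            start = self.buffer.find(OPEN_TAG)
            if start == -1:
                # Hold back a tail that might be the start of an opening tag,
                # so tool call markup is never emitted as plain content.
                cut = len(self.buffer) - _held_back(self.buffer, OPEN_TAG)
                text, self.buffer = self.buffer[:cut], self.buffer[cut:]
                if text:
                    events.append(("content", text))
                return events
            if start > 0:
                events.append(("content", self.buffer[:start]))
                self.buffer = self.buffer[start:]
            end = self.buffer.find(CLOSE_TAG)
            if end == -1:
                return events  # wait for the rest of the tool call
            # Malformed JSON raises here; a real parser would handle that.
            call = json.loads(self.buffer[len(OPEN_TAG):end])
            self.buffer = self.buffer[end + len(CLOSE_TAG):]
            events.append(("tool_call", {
                "index": self.index,
                "id": f"call_{self.index}",  # hypothetical ID scheme
                "type": "function",
                "function": {
                    "name": call["name"],
                    "arguments": json.dumps(call.get("arguments", {})),
                },
            }))
            self.index += 1
```

Because text inside an open tag stays in the buffer until the closing tag arrives, a tool call is reported once as a structured event and never also as content, which is one way to avoid the duplicate output mentioned above.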

Client disconnect detection (#3)

Streaming responses now detect client disconnects via ASGI
Proper cleanup of async generators and pending tasks on disconnect
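In ASGI, a server signals that the client has gone away by delivering an `http.disconnect` message through the `receive` callable. The sketch below shows one way to race that message against the next streamed chunk and clean up afterwards. It is an assumption-laden illustration rather than oMLX's code: the function name `stream_until_disconnect` is hypothetical, and it assumes `http.response.start` has already been sent.

```python
import asyncio

_DONE = object()  # sentinel: the generator has no more chunks


async def _next_or_done(agen):
    try:
        return await agen.__anext__()
    except StopAsyncIteration:
        return _DONE


async def stream_until_disconnect(receive, send, chunks):
    """Send body chunks until the generator ends or the client disconnects.

    Returns the number of chunks sent.
    """
    async def wait_for_disconnect():
        while True:
            message = await receive()
            if message["type"] == "http.disconnect":
                return

    watcher = asyncio.ensure_future(wait_for_disconnect())
    sent = 0
    try:
        while True:
            pending = asyncio.ensure_future(_next_or_done(chunks))
            done, _ = await asyncio.wait(
                {watcher, pending}, return_when=asyncio.FIRST_COMPLETED
            )
            if watcher in done:
                # Client is gone: cancel the in-flight chunk and wait for the
                # cancellation to finish before closing the generator.
                pending.cancel()
                await asyncio.gather(pending, return_exceptions=True)
                break
            chunk = pending.result()
            if chunk is _DONE:
                break
            await send(
                {"type": "http.response.body", "body": chunk, "more_body": True}
            )
            sent += 1
    finally:
        watcher.cancel()
        await chunks.aclose()  # runs the generator's own cleanup, if any
    return sent
```

Racing the two tasks means a disconnect is noticed even while the generator is still waiting on its next chunk, rather than only between chunks.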

KV cache headroom & manual model unload (#4)

25% KV cache headroom during model loading for better multi-model memory management
Manual model unload via POST /v1/models/{model_id}/unload and admin panel
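A small sketch of both ideas, under stated assumptions. The release notes give the endpoint path but not the server address, so `base_url` below is a placeholder. The notes also do not say how the 25% headroom is computed, so `bytes_with_headroom` simply assumes it is reserved on top of the model's weight size. Both function names are hypothetical.

```python
import urllib.parse
import urllib.request


def bytes_with_headroom(model_bytes, headroom=0.25):
    """Memory to reserve for a model, assuming headroom is a fraction of its size."""
    return int(model_bytes * (1 + headroom))


def build_unload_request(model_id, base_url="http://localhost:8000"):
    """Build (but do not send) a POST to /v1/models/{model_id}/unload.

    Assumptions: `base_url` is a placeholder, the request body is empty,
    and the model ID is percent-encoded into a single path segment.
    """
    path = f"/v1/models/{urllib.parse.quote(model_id, safe='')}/unload"
    return urllib.request.Request(base_url + path, data=b"", method="POST")
```

To actually issue the request, pass the result to `urllib.request.urlopen(...)` against a running server. Under the headroom assumption above, a model with 8 GiB of weights would be budgeted 10 GiB.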

New Contributors

@thornad made their first contribution in #1, #2, #3, #4

Thanks to @thornad for all four PRs in this release!
