github jundot/omlx v0.2.4

latest releases: v0.3.8.dev1, v0.3.7, v0.3.7rc2...
one month ago

What's New

Features

  • Skip API key verification (localhost): when the server is bound to localhost, you can now disable API key verification for all API endpoints from global settings. makes local-only workflows frictionless, no more dummy keys needed. the option automatically resets when switching to a public host. (#92)
  • Model alias: set a custom API-visible name for any model via the model settings modal. /v1/models returns the alias instead of the directory name, and requests accept both the alias and the original name. useful when switching between inference providers without reconfiguring clients. (#92)
  • Version display: the CLI now shows the version in the startup banner, and the admin navbar displays the running version. (#90)

Bug Fixes

  • Loaded model lost after re-discovery: deleting a model or changing settings triggered model re-discovery, which dropped already-loaded engines from the pool. loaded models now preserve their runtime state across re-discovery. (#89)
  • Text-only VLM quant misdetection: text-only quantizations of natively multimodal models (e.g. Qwen 3.5 122B converted via mlx_lm.convert) were misdetected as VLM, causing a failed load attempt on every restart. now correctly classified as LLM when vision_config is absent. (#84)
  • SSD cache utilization over 100%: cache utilization could exceed 100% when available disk space shrank after initial calculation. now clamped properly.
  • Reasoning model output token caching: output tokens from reasoning models (with <think> tags) were being cached unnecessarily. now skipped to avoid polluting the prefix cache.

UI Improvements

  • Model settings modal reordered: alias / model type / ctx window / max tokens / temperature / top p / top k / rep. penalty / ttl / load defaults
  • Alias badge shown next to model name in both model settings list and model manager

New Contributors

  • @rsnow made their first contribution in #84

Thanks to @rsnow for the contribution!

Don't miss a new omlx release

NewReleases is sending notifications on new releases.