What's New
Features
- HuggingFace mirror endpoint support: configure a custom HF mirror endpoint for regions with restricted access to huggingface.co. The setting applies to model downloads, search, and all Hub API calls. (#116)
- Dashboard tab persistence: selected dashboard tabs are now persisted in URL query params, so refreshing the page or sharing a link keeps your current view. (#129)
- Extended metrics reference: batch size, speedup ratio, and per-request prefill TPS added to the metrics reference panel. (#101)
- mlx-lm upgraded to v0.31.1: updated to commit 4a21ffd for latest model support and bug fixes.
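
As a rough illustration of the mirror endpoint feature, the sketch below shows one way an endpoint could be chosen and used to build a file URL. The helper names `resolve_endpoint` and `file_url` and the mirror host are hypothetical and not taken from this project. The `HF_ENDPOINT` environment variable is the one `huggingface_hub` reads, and the `/resolve/<revision>/<filename>` layout is the Hub's download URL scheme.

```python
import os

DEFAULT_ENDPOINT = "https://huggingface.co"


def resolve_endpoint(configured=None, env=None):
    """Pick the Hub endpoint: explicit setting, then HF_ENDPOINT, then the default.

    Hypothetical helper; the app's real configuration path may differ.
    """
    env = os.environ if env is None else env
    endpoint = configured or env.get("HF_ENDPOINT") or DEFAULT_ENDPOINT
    # Strip trailing slashes so URL joins below do not produce "//".
    return endpoint.rstrip("/")


def file_url(endpoint, repo_id, filename, revision="main"):
    """Build a download URL for one file in a model repository."""
    return f"{endpoint}/{repo_id}/resolve/{revision}/{filename}"


if __name__ == "__main__":
    # "hf-mirror.example" is a placeholder host, not a real mirror.
    endpoint = resolve_endpoint("https://hf-mirror.example/")
    print(file_url(endpoint, "some-org/some-model", "config.json"))
```

Resolving the endpoint in one place means downloads, search, and other Hub calls can share the same base URL instead of each hardcoding huggingface.co.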
Bug Fixes
- Streaming with tool calls: content is now streamed token-by-token even when tools are present, instead of buffering the entire response. (#103)
- Model alias settings lookup: per-model settings (temperature, max tokens, etc.) now correctly resolve model aliases before lookup. (#117)
- Cache corruption infinite loop: cache corruption during prefill no longer causes an infinite retry loop. The corrupted cache is cleared and prefill restarts cleanly.
- Requests dict leak on cache failure: fail_all_requests no longer triggers a full cache reset, and properly cleans up the requests dictionary.
- HuggingFace API timeouts: added timeouts to all HuggingFace Hub API calls to prevent the server from freezing when HF is unreachable. (#124)
- Qwen3/Gemma3 misidentified as embedding models: LLMs with certain architectures were incorrectly classified as embedding models. (#130)
- macOS 15.0+ requirement enforced: MLX >= 0.29.2 requires macOS 15.0 (Sequoia). The app now checks and enforces this at startup. (#125)
- i18n language setting not persisting: language setting selected before server init was lost after initialization. (#119)
- Anthropic tool-call filtering: added fallback safety for edge cases in Anthropic adapter tool-call handling.
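
To make the model alias fix listed above more concrete, here is a minimal sketch of resolving an alias to its canonical model name before looking up per-model settings. All names (`resolve_alias`, `settings_for`, the model IDs, and the setting keys) are hypothetical and chosen for illustration; the project's actual data structures are not shown in these notes.

```python
def resolve_alias(name, aliases):
    """Follow alias links until a canonical model name is reached.

    The `seen` set guards against accidental alias cycles.
    """
    seen = set()
    while name in aliases and name not in seen:
        seen.add(name)
        name = aliases[name]
    return name


def settings_for(name, aliases, per_model, defaults):
    """Return defaults overlaid with settings stored under the canonical name."""
    canonical = resolve_alias(name, aliases)
    return {**defaults, **per_model.get(canonical, {})}


if __name__ == "__main__":
    # Illustrative data only.
    aliases = {"fast": "example-model-4b"}
    per_model = {"example-model-4b": {"temperature": 0.2}}
    defaults = {"temperature": 0.7, "max_tokens": 1024}
    print(settings_for("fast", aliases, per_model, defaults))
```

Without the alias resolution step, a request for "fast" would miss the settings stored under the canonical name and silently fall back to the defaults, which is the kind of behavior the fix addresses.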
Documentation
- Multilingual README: added Chinese, Korean, and Japanese translations.
New Contributors
- @TipKnuckle made their first contribution in #103
- @jonsnowljs made their first contribution in #129
Thanks to @TipKnuckle, @jonsnowljs, and @rsnow for their contributions!