Highlights
Per-model chat_template_kwargs support
You can now configure chat_template_kwargs per model directly from the admin settings panel. This means you can freely adjust reasoning effort for gpt-oss models, toggle enable_thinking true/false for the new Qwen 3.5 models, and more — all without touching config files or restarting the server. Get the most out of your MLX models with just a few clicks.
What's New
Features
- Add per-model and per-request
chat_template_kwargssupport - Show
chat_template_kwargsbadges in model list - Add Homebrew formula for CLI installation (#38)
