Breaking Changes
Beautiful Benchmark Tool
One-click performance benchmarking right from the Admin Panel — easily measure the performance of any model you want to use!
What's New
Benchmark
- Add built-in benchmark tool to the admin panel with prefill (PP) and text generation (TG) TPS metrics
- Support partial prefix cache hit measurement for realistic benchmarking
- Add text export for benchmark results
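For readers unfamiliar with the PP and TG metrics, here is a minimal sketch of how such figures are commonly derived from token counts and timings. It is not the tool's actual code: the `BenchmarkRun` fields and both function names are hypothetical, and counting only uncached prompt tokens toward prefill TPS on a partial prefix cache hit is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkRun:
    """Hypothetical record of one benchmark run (not the tool's real schema)."""
    prompt_tokens: int      # total tokens in the prompt
    cached_tokens: int      # prompt tokens served from the prefix cache
    generated_tokens: int   # tokens produced during decoding
    prefill_seconds: float  # wall time spent processing the prompt
    decode_seconds: float   # wall time spent generating tokens


def prefill_tps(run: BenchmarkRun) -> float:
    """PP tokens/s, counting only tokens that were actually computed (assumption)."""
    computed = run.prompt_tokens - run.cached_tokens
    if run.prefill_seconds <= 0:
        return 0.0
    return computed / run.prefill_seconds


def generation_tps(run: BenchmarkRun) -> float:
    """TG tokens/s over the decode phase."""
    if run.decode_seconds <= 0:
        return 0.0
    return run.generated_tokens / run.decode_seconds


# Example: half of a 1024-token prompt is already in the prefix cache.
run = BenchmarkRun(prompt_tokens=1024, cached_tokens=512,
                   generated_tokens=128, prefill_seconds=0.5,
                   decode_seconds=4.0)
print(f"PP: {prefill_tps(run):.1f} tok/s, TG: {generation_tps(run):.1f} tok/s")
```

Reporting the two phases separately matters because prefill is typically compute-bound while generation is typically memory-bandwidth-bound, so a single combined number hides which phase limits a given model.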
Admin panel
- Split monolithic dashboard into tab-based partials for better maintainability
- Vendor all CDN dependencies for fully offline admin panel support
Performance
- Optimize scheduler hot path for improved TPS throughput
- Add async background write for SSD paged cache to reduce write latency
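The background-write idea can be illustrated with a small sketch, assuming a single worker thread fed by a queue. This is not the project's implementation: `BackgroundWriter`, its methods, and the file layout are hypothetical names chosen for illustration.

```python
import queue
import threading
from pathlib import Path


class BackgroundWriter:
    """Hypothetical sketch: hand cache blocks to a worker thread so the
    caller does not wait on disk I/O."""

    def __init__(self) -> None:
        self._queue: queue.Queue = queue.Queue()
        self.errors: list[OSError] = []  # write failures, for later inspection
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, path, data: bytes) -> None:
        """Enqueue a write and return immediately."""
        self._queue.put((Path(path), data))

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            try:
                if item is None:  # sentinel: stop the worker
                    return
                path, data = item
                path.parent.mkdir(parents=True, exist_ok=True)
                path.write_bytes(data)
            except OSError as exc:
                self.errors.append(exc)
            finally:
                self._queue.task_done()

    def close(self) -> None:
        """Flush all pending writes, then stop the worker."""
        self._queue.put(None)
        self._queue.join()
        self._worker.join()
```

In this sketch `submit` only costs a queue insertion, so the latency the caller sees no longer includes the disk write; the trade-off is that data still in the queue is lost if the process exits before `close` is called.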
Memory management
- Add process-level memory enforcement (Memory Limit Total) to prevent system-wide OOM
Multiple model directories
- Support multiple model directories for organizing models across different paths
SSD cache
- Fix CacheList sub_meta_states sanitization for correct KVCache reconstruction
Engine
- Upgrade mlx-lm to commit 179da77
