What's Changed
- several optimizations for bringing down GPU usage (#738) by @RaajeevChandran
- perf: two-phase prefill for hybrid SSM/transformer models (Qwen3.5) (#737) by @vernonstinebaker
- perf: tiered KV cache store, O(1) streaming decode, eval fix (#736) by @vernonstinebaker
- perf: memoize preflight capability search results per session (#734) by @vernonstinebaker
- perf: sort alwaysLoadedSpecs() for deterministic tool ordering (#733) by @vernonstinebaker
- chore: bump swift-jinja 2.3.2 → 2.3.3 (#732) by @vernonstinebaker
- fix(tests): update PreflightCapabilitySearch topKValues expectations after #730 (#731) by @vernonstinebaker
Full Changelog: 0.15.6...0.15.7