Token Optimizer v5.11.2
New: Cache Health watchdog
See exactly what prompt-cache expiry costs you. The watchdog reads your own session history, finds the moments your cache expired (long pauses → the API re-charges you to rebuild the prefix), and prices the waste at each session's real model rates. Cache economics are provider-aware — Anthropic explicit TTL, OpenAI/DeepSeek automatic discounts, Gemini explicit caching — so every model your sessions touch gets the right math and a verified, provider-specific remedy. Claude Code sessions are modeled with the 1-hour cache the platform actually holds.
- Dashboard: new Cache Health panel (opportunity-tier — never mixed into realized savings)
- CLI:
measure.py cache-report [--days N] [--json] [--fresh] [--verbose]with per-session drill-down - Wired for Claude Code and Codex; platforms without per-turn cache visibility are listed as explicit coverage gaps
Compression: more coverage, on by default
First-read skeleton compression expands to six cohorts, validated against real session history (thousands of replayed sessions) instead of long shadow phases. A runtime quality tripwire watches every active cohort's live edit-after-read rate and auto-demotes anything that degrades — with sticky hysteresis, schema-validated state, and a clear notice when it acts. Agent-result compression ships measure-only: its measured harm proxy is above our safety gate, so it reports its opportunity without touching your output.
Improved
- Per-session cache-waste drill-down for verifying your own numbers
- Faster dashboard renders via smarter result caching
- Honest
cohortsCLI for inspecting and re-promoting compression cohorts