0.25.0 (2026-06-12)
Features
- add differential network capture harness (#761) (11ab5f8)
- add light mode for dashboard (#834) (c425893)
- add OAuth2 client-credentials upstream-auth proxy extension (#778) (#784) (eb2e50f)
- add Vertex AI proxy routing (#793) (3c77e52)
- cli: comprehensive help text, validation, and exception handling improvements (#640) (028efab)
- compression safety rails: error-output protection, pipeline circuit breaker, library inflation guard (#851) (c0cadcc)
- dashboard: per-model savings breakdown and expected-vs-actual cost on historical charts (#807) (34dafe6)
- detect re-served tool results as over-compression waste signal (#854) (5f1d88a)
- evals: add zero-cost tool schema compaction integrity eval (#817) (53a08c6)
- gated Markdown-KV compaction formatter (serialization-aware output) (#859) (06b2625)
- kompress: warn on unrecognized HEADROOM_KOMPRESS_BACKEND + document backend selection (#204) (6367d0b)
- memory: add opt-in Apple-GPU (MPS) embedding runtime (#766) (c71592d)
- net-cost cache mutation formula on CompressionPolicy (#856 P1) (#857) (d5f5802)
- plugins: Hermes agent headroom_retrieve plugin (#824) (058bced)
- probe-based retention scoring of recorded compression events (#862) (c2106cb)
- proxy: add CLI opt-outs for CCR injection (compression-only mode) (#823) (693d9d2)
- proxy: attribute savings history rollups per provider (#791) (0b8b8d9)
- proxy: log compressed messages alongside original request (#261) (2269e40)
- proxy: per-project savings breakdown on the dashboard (claude, codex, aider, copilot, cursor) (#803) (914a60a)
- support Python 3.14+ via pyo3 abi3 stable ABI (#516) (19eac8e)
- switch Kompress default to kompress-v2-base with weight-only int8 ONNX (#799) (74392b2)
- transforms: attribute read_lifecycle + smart_crush tags (#249) (8f37426)
Bug Fixes
- anthropic: CCR exception must be re-raised, not silently swallowed (#838) (8db5efc)
- ccr: key Rust search/diff/log markers with explicit_hash (#852) (bfcb07d)
- ccr: make retrieval TTL configurable (#715) (2533f77)
- ccr: skip CCR when model calls headroom_retrieve alongside user tools (#839) (30078f8)
- ccr: use shared compression store (#875) (249af6c)
- ci: correct comments, timeouts, and pip reliability in native e2e workflows (#878) (b716c8c)
- ci: pin cosign-installer to v3 (v4 does not exist) (#774) (199d693)
- codex: respect CODEX_HOME for wrap config (#731) (96abf38)
- content_router: guard against empty compression output causing Anthropic 400 (#771) (2f9ff07)
- copilot: use responses API for subscription reasoning models (#647) (84ac332)
- correct preserved-entry index mapping in Gemini content round-trip (#836) (0ffe2b6)
- dashboard: stable 'Proxy $ Saved' hero tile under --workers > 1 (#481) (fd73b88)
- don't inject empty tools:[] when client omitted the tools field (#772) (574bbae)
- harden Copilot API auth token handling (#557) (6b0c09f)
- health: readyz verifies upstream connectivity, not just process liveness (#744) (5dfb446)
- init: guard persistent task startup (#616) (9252d85)
- init: normalize Windows hook paths to forward slashes (#788) (6ea6e31)
- init: suppress hook recovery output (#760) (b439599)
- learn: claude-cli streams output with idle timeout (#373) (9bff575)
- make headroom wrap readiness probe timeout configurable for slow ML imports (#581) (163677b)
- parser: detect waste signals in Anthropic tool_result content blocks (#815) (929698a)
- proxy: F4: trust X-Forwarded-* only behind allow-listed gateway (d10bd5f)
- proxy: lazy-import server to avoid fastapi crash (#442) (93c6937)
- proxy: make CCR multi-worker warning conditional on backend (#770) (d76a729)
- proxy: make Kompress eager preload cache-only so a cold cache can't block startup (#783) (841663d)
- proxy: restore Codex usage headers on WS and streaming SSE transports (#577) (#794) (0ce68de)
- schema compaction must not drop property names that match DROP_KEYS (#785) (ae2122f)
- security: block DNS-rebinding on /debug/* and /stats/reset via Host-header allowlist (#605) (b4b5025)
- ssl: upstream httpx client inherits SSL_CERT_FILE, REQUESTS_CA_BUNDLE, NODE_EXTRA_CA_CERTS (#745) (e50fbb3)
- suppress LiteLLM provider banner before import (#874) (f9384ef)
- transforms: use thread-local tree-sitter parsers to prevent pyo3 Unsendable panic (#604) (2ad300a)
- wrap: track shared proxy clients with markers (#877) (05bd56b)
Code Refactoring
- extract litellm model resolution to shared utility (ec7d006)