v2.5.0: multi-agent burst no longer kills the proxy
Incident (2026-03-22): ClawTeam spawning 5+ Opus agents caused a cascading failure. The v2.4.0 circuit breaker (3 consecutive timeouts → open) tripped within seconds and blocked all requests globally, including unrelated agents and new sessions. With fallbacks: [] configured, there was no alternate model to route to, so every message returned "LLM request timed out."
What changed
Sliding-window circuit breaker
The old breaker counted consecutive failures — 3 in a row and the entire model was blocked. With multiple agents hitting the proxy concurrently, this threshold was reached almost instantly.
The new breaker counts failures in a 5-minute sliding window (configurable). Default threshold raised from 3 to 6. A burst of concurrent timeouts no longer cascades into a global outage.
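The difference between consecutive counting and window counting can be sketched as follows. This is a minimal illustration, not the proxy's implementation: the class name `SlidingWindowBreaker` and its methods are hypothetical, and only the defaults (6 failures, 300-second window) come from these notes. It covers failure counting only and leaves out the cooldown and half-open states.

```python
from collections import deque


class SlidingWindowBreaker:
    """Hypothetical sketch: open when `threshold` failures fall inside the window."""

    def __init__(self, threshold=6, window_s=300.0):
        self.threshold = threshold
        self.window_s = window_s
        self.failures = deque()  # timestamps (seconds) of recent failures

    def _prune(self, now):
        # Drop failures that have aged out of the sliding window.
        while self.failures and now - self.failures[0] >= self.window_s:
            self.failures.popleft()

    def record_failure(self, now):
        self._prune(now)
        self.failures.append(now)

    def is_open(self, now):
        self._prune(now)
        return len(self.failures) >= self.threshold


breaker = SlidingWindowBreaker()
for t in range(5):
    breaker.record_failure(now=float(t))
print(breaker.is_open(now=5.0))    # False: 5 failures, threshold is 6
breaker.record_failure(now=5.0)
print(breaker.is_open(now=5.0))    # True: 6 failures inside the window
print(breaker.is_open(now=301.0))  # False: the two oldest failures have expired
```

Under the old rule, the first three of those timeouts would already have opened the breaker.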
Graduated backoff
The cooldown now doubles each time the breaker re-opens (120s → 240s → capped at 300s). This prevents the open/half-open oscillation loop that kept the breaker permanently tripped. The cooldown resets to its base value on the first successful request.
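The cooldown schedule amounts to a capped doubling. A small sketch, assuming a hypothetical helper `cooldown_seconds` and a reopen counter that starts at 0 for the first open (the 120s base and 300s cap are the values from these notes):

```python
def cooldown_seconds(reopen_count, base_s=120, cap_s=300):
    """Cooldown for the Nth re-open; reopen_count=0 is the first time the breaker opens."""
    return min(base_s * 2 ** reopen_count, cap_s)


print([cooldown_seconds(n) for n in range(4)])  # [120, 240, 300, 300]
```

The third step would be 480s uncapped, which is why the sequence flattens at 300s. On a successful request the counter would return to 0.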
Multi-probe half-open
Half-open state now allows 2 concurrent probe requests (was 1). Recovery is faster because we don't gate everything on a single test request.
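One way to picture the probe limit is a small counter guarding the half-open state. This is a sketch under assumptions: `HalfOpenGate`, `try_acquire`, and `release` are hypothetical names, and the proxy may track probe slots differently.

```python
import threading


class HalfOpenGate:
    """Hypothetical sketch: admit at most `max_probes` concurrent probe requests."""

    def __init__(self, max_probes=2):
        self.max_probes = max_probes
        self.in_flight = 0
        self._lock = threading.Lock()

    def try_acquire(self):
        # Returns True if this request may proceed as a probe.
        with self._lock:
            if self.in_flight >= self.max_probes:
                return False
            self.in_flight += 1
            return True

    def release(self):
        # Call when a probe finishes, whether it succeeded or failed.
        with self._lock:
            self.in_flight = max(0, self.in_flight - 1)


gate = HalfOpenGate()
print([gate.try_acquire() for _ in range(3)])  # [True, True, False]
gate.release()
print(gate.try_acquire())  # True: a slot was freed
```

With a single slot, one slow probe holds up recovery for every caller. A second slot lets another request test the upstream in parallel.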
Increased timeout defaults
The new defaults are sized for real agent workloads with large system prompts (30K+ chars):
| Parameter | Old | New |
|---|---|---|
| Overall timeout | 120s | 300s |
| Opus first-byte | 60s base | 90s base |
| Sonnet first-byte | 45s base | 60s base |
| Max concurrent | 5 | 8 |
| Breaker threshold | 3 consecutive | 6 in 5min window |
| Breaker cooldown | 60s fixed | 120s graduated |
Health endpoint improvements
/health now exposes per-model breaker state: window failures, cooldown, reopen count, probe slots. Status shows "degraded" when any breaker is open.
New env vars
| Variable | Default | Description |
|---|---|---|
| CLAUDE_BREAKER_WINDOW | 300000 | Sliding window duration (ms) |
| CLAUDE_BREAKER_HALF_OPEN_MAX | 2 | Max concurrent probes in half-open |
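For illustration, the two variables above could be read with their documented defaults like this. The helper `read_int_env` is hypothetical, and falling back to the default on an empty or non-numeric value is an assumption rather than documented proxy behavior.

```python
import os


def read_int_env(name, default):
    """Read an integer env var; fall back to `default` if unset or invalid (assumed policy)."""
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        return int(raw)
    except ValueError:
        return default


# Variable names and defaults are taken from the table above.
BREAKER_WINDOW_MS = read_int_env("CLAUDE_BREAKER_WINDOW", 300000)
BREAKER_HALF_OPEN_MAX = read_int_env("CLAUDE_BREAKER_HALF_OPEN_MAX", 2)
```

Note that the window is given in milliseconds, so a 10-minute window would be 600000.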
Upgrade
cd ~/.openclaw/projects/claude-proxy
git pull
# Restart your proxy (launchctl, systemd, or manual)

New defaults take effect as soon as the proxy restarts. If you have custom env vars in your plist/service, review the updated defaults above.
Full Changelog: v2.4.0...v2.5.0