What's New
Smarter Rate Limiting
- Model-specific rate limits: Rate limits now apply per-model instead of globally. If one model is exhausted, you can still use other models on the same account.
- Better status visibility: /account-limits?format=table now shows (x/y) limited, indicating exactly how many models are rate-limited per account.
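To illustrate the per-model behavior described above, here is a minimal sketch of tracking limits per model rather than per account. The Account class, its field names, the model names, and the "ok" status string are all hypothetical and are not taken from the project's code; only the (x/y) limited format comes from these notes.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Account:
    """Hypothetical account record; names are illustrative only."""
    name: str
    models: List[str]
    # Maps model name -> Unix time at which its rate limit resets.
    limited_until: Dict[str, float] = field(default_factory=dict)

    def mark_limited(self, model: str, reset_at: float) -> None:
        """Record that one model is exhausted until reset_at."""
        self.limited_until[model] = reset_at

    def is_available(self, model: str, now: Optional[float] = None) -> bool:
        """A model is usable once its own reset time has passed."""
        now = time.time() if now is None else now
        return self.limited_until.get(model, 0.0) <= now

    def status(self, now: Optional[float] = None) -> str:
        """Summarize how many of this account's models are limited."""
        now = time.time() if now is None else now
        limited = sum(1 for m in self.models if not self.is_available(m, now))
        if limited == 0:
            return "ok"  # assumed label for an account with no limited models
        return f"({limited}/{len(self.models)}) limited"
```

With this shape, exhausting one model leaves the others on the same account selectable, which is the behavior the first bullet describes.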
Improved Reliability
- Network error handling: Transient network errors (timeouts, connection resets) now trigger automatic retry with the next account instead of failing immediately.
- Thinking budget validation: Automatically adjusts max_tokens when it's less than thinking_budget to prevent API errors.
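The two reliability changes above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: the flat request dict, the HEADROOM_TOKENS value, and the function names are hypothetical, and the notes do not say what value max_tokens is adjusted to.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")

# Assumed headroom added above the thinking budget; the real value is not stated.
HEADROOM_TOKENS = 1024


def ensure_max_tokens(request: dict) -> dict:
    """Return a copy of request whose max_tokens is not below thinking_budget."""
    adjusted = dict(request)
    budget = adjusted.get("thinking_budget")
    if budget is not None and adjusted.get("max_tokens", 0) < budget:
        adjusted["max_tokens"] = budget + HEADROOM_TOKENS
    return adjusted


def call_with_failover(accounts: Iterable[str], send: Callable[[str], T]) -> T:
    """Try each account in order, moving on when a transient network error occurs."""
    last_error: Optional[Exception] = None
    for account in accounts:
        try:
            return send(account)
        except (TimeoutError, ConnectionResetError) as exc:
            last_error = exc  # transient failure: fall through to the next account
    raise RuntimeError("all accounts failed with network errors") from last_error
```

Only timeouts and connection resets are caught here, so other errors still surface immediately instead of being retried.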
Enhanced Monitoring
- Detailed health endpoint: /health now returns per-account model quotas with remaining percentages and reset times.
- Clearer error messages: Rate limit errors now include the affected model name for easier debugging.
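As a rough picture of what these monitoring changes imply, the sketch below builds a per-model quota entry and a rate-limit message that names the model. The field names, message wording, and function names are hypothetical; the notes do not specify the actual /health response shape or error text.

```python
from datetime import datetime, timezone


def _iso_utc(timestamp: float) -> str:
    """Render a Unix timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


def quota_entry(remaining_fraction: float, reset_at: float) -> dict:
    """Build one per-model quota entry (hypothetical field names)."""
    return {
        "remaining_percent": round(remaining_fraction * 100),
        "resets_at": _iso_utc(reset_at),
    }


def rate_limit_message(account: str, model: str, reset_at: float) -> str:
    """Format a rate-limit error that names the affected model."""
    return (
        f"Rate limit reached for model '{model}' on account '{account}'; "
        f"resets at {_iso_utc(reset_at)}"
    )
```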
CLI Improvements
- Exit option: Added (e)xit option to the accounts CLI menu.
- Single account flow: Account addition now handles one account at a time to avoid OAuth state conflicts.
- No more hanging: The CLI now exits properly after completion.
Other
- Startup banner now shows active modes (e.g., debug mode)
- Only Claude and Gemini models are returned from /v1/models
Full Changelog: v1.2.3...v1.2.4