What's Changed
- [ROB-3242] Eval fix by @Avi-Robusta in #1740
- Document behavior_controls API parameter for prompt customization by @aantn in #1743
- Weekly Benchmark Results 2026-03-11_21-08 by @github-actions[bot] in #1749
- Add large Confluence page evaluation test case by @aantn in #1738
- Add max_prompt_tokens_per_call tracking to LLM cost reporting by @aantn in #1755
- fix(llm): load MODEL env fallback and improve no-model guidance by @pavangudiwada in #1723
- Fix overly-big-toolcalls handling by @aantn in #1756
- Allow returning to toolset selection after configuration by @aantn in #1758
- Add refactoring plan for unifying call() and call_stream() by @aantn in #1763
- Weekly Benchmark Results 2026-03-14_20-45 by @github-actions[bot] in #1766
- Weekly Benchmark Results 2026-03-15_04-11 by @github-actions[bot] in #1772
- Refactor call() to unify with call_stream() and other improvements by @aantn in #1765
- Benchmark summary by @Avi-Robusta in #1776
- Update OpenAI icon from simple-openai to fontawesome-brands-openai by @aantn in #1777
- Support multi-round approval workflows with iteration offset by @aantn in #1774
- fix: enrich PagerDuty issues with description and alert body details by @yakir-shriker in #1780
- Implement schema resolution for JSON Schema references and compound types by @mouchar in #1713
- Rename grafana-dashboard tag to grafana by @aantn in #1786
- fix: forward --model parameter in investigate ticket command by @yakir-shriker in #1779
- Add tool_results_dir parameter to ToolCallingLLM in custom_llm example by @aantn in #1781
- Add new integrations and update integration documentation by @aantn in #1788
- Add support for evals-model-* labels in regression workflow by @aantn in #1784
- Refactor config class handling to support multiple config classes by @naomi-robusta in #1741
- Document CLASSIFIER_MODEL env var and OpenRouter requirements by @aantn in #1792
- holmes-mcp docs update by @Avi-Robusta in #1793
- Soften runbook fetching requirements and reduce enforcement language by @aantn in #1787
- Clarify multi-account AWS setup and agent deployment options by @aantn in #1795
- Strip newlines from sanitized parameters to prevent shell syntax errors by @aantn in #1773
- Enable strict tool calling universally with per-tool compatibility checks by @aantn in #1790
- Remove message truncation logic, fail fast on context overflow by @aantn in #1797
- [ROB-3260] Fix Anthropic image token counting by @Avi-Robusta in #1642
- Document tool approval behavior and update API reference by @aantn in #1800
- Add new k8s evals for kubernetes_jq_query by @aantn in #1759
- Add PR label support and improve eval parameter naming by @aantn in #1799
- Add MCP Confluence image attachment test case (eval 233) by @aantn in #1796
- ROB-3472: Add datasource-catalog.json by @alonelish in #1778
- Patch CVE-2025-68121 by @moshemorad in #1667

New Contributors
- @yakir-shriker made their first contribution in #1780
- @mouchar made their first contribution in #1713
- @alonelish made their first contribution in #1778

Full Changelog: 0.22.0-alpha...0.22.0-alpha.1