Highlights
- O3 detector failure recovery. `ADEngine.run()` now signals partial-detector failure via `next_action.action='recover_detector_failure'`, listing the failed detectors and a planner-suggested replacement set. The new `iterate(state, {'action': 'recover'})` action substitutes failed slots while preserving successful ones (no silent auto-substitution); pass `{'detectors': [...]}` to override the suggestion. The `iterate()` phase guard accepts both `'detected'` and `'analyzed'` for `'recover'` so the agent can substitute immediately after `run()`. Other actions still require `'analyzed'`.
- O5 contamination diagnostics. New read-only `engine.contamination_diagnostics(state, threshold_sweep=...)` reports the contamination value the run actually used, the consensus flagged rate, score percentiles (50/75/90/95/99), and an optional sweep showing what fraction would be flagged at each candidate contamination value. No state mutation; the agent uses these numbers to choose a value before iterating.
- O8 hindsight validate. New `engine.validate(state, y)` returns label-based metrics (precision, recall, F1, ROC AUC, AP) for the consensus, every successful detector, and the analyzer-selected best detector, plus a `consensus_helped` flag and FP/FN row indices. Pure functional; ROC AUC and AP return `None` when `y` has only one class instead of raising.
- O9 enriched explain_findings. `explain_findings` accepts `feature_names` and threads it through to `feature_contributions`, which now returns enriched dicts with `feature`, `name`, `value`, `mean`, `z_score`, and `direction` (high/low). Backward compatible: existing `feature` and `z_score` keys are preserved.
- TA1 benchmark-rank `compare_detectors`. When `names` is omitted, `compare_detectors` consults benchmark rankings instead of returning catalog-order slices: ADBench `overall_top_5` for tabular, per-detector `benchmark_rank` for time series via TSB-AD. Modalities without a ranking fall back to catalog order.
- TA2 plan-level contamination. `plan_detection` now always exposes effective contamination in `plan['params']` so the MCP `plan_detection->build_detector` chain emits a code snippet that names the value the agent will run with.
- MCP-B1 stateless Tier-B tools. The MCP server registers three new tools (`run_detection`, `analyze_results`, `explain_findings`), bringing the registered tool count to ten and letting an agent close the plan -> run -> analyze -> explain loop without local glue. Round-tripping uses JSON; numpy arrays move as lists and are rebuilt for the engine call. Stateful `investigate`/`iterate` MCP tools remain deferred.
- od-expert skill. `workflow.md` autonomous-loop step 4 documents the recovery branch; Trigger 4 separates "low separation" (try a different mix) from "low stability" (adjust contamination); Trigger 2 points the agent at `contamination_diagnostics`.
- README narrowed. Agentic claims now match what the surface delivers today; the layers paragraph lists all ten MCP tools grouped by tier.
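The slot-preserving substitution described for the `'recover'` action can be pictured with a small standalone sketch. This is not PyOD's implementation: `substitute_failed` and its parameters are hypothetical names, and the detector names are placeholders. It only illustrates the stated behavior, where successful detectors keep their slots, failed ones are replaced in order, and an explicit override takes precedence over the suggestion.

```python
def substitute_failed(detectors, failed, suggested, override=None):
    """Return a new detector list with failed slots replaced in order.

    detectors: detector names in their original slot order.
    failed:    names of detectors that failed during the run.
    suggested: planner-suggested replacements (hypothetical input).
    override:  optional explicit replacements; wins over `suggested`.
    """
    failed_set = set(failed)
    # Detectors that succeeded keep their slots untouched.
    kept = [d for d in detectors if d not in failed_set]
    pool = override if override is not None else suggested
    # Skip candidates that already run successfully or that just failed.
    candidates = [c for c in pool if c not in kept and c not in failed_set]

    failed_slots = sum(1 for d in detectors if d in failed_set)
    if len(candidates) < failed_slots:
        raise ValueError("not enough replacement detectors for the failed slots")

    replacements = iter(candidates)
    return [next(replacements) if d in failed_set else d for d in detectors]


# Placeholder detector names, chosen only for illustration.
recovered = substitute_failed(
    detectors=["IForest", "LOF", "ECOD"],
    failed=["LOF"],
    suggested=["KNN", "HBOS"],
)
print(recovered)
```

Raising when too few replacements are available mirrors the "no silent auto-substitution" wording: the caller has to decide what fills each failed slot.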
Backward compatibility
No breaking API changes. New methods are additive (validate, contamination_diagnostics, the 'recover' iterate action). explain_findings and feature_contributions add fields without removing old ones. compare_detectors ordering changes when names is omitted (catalog order to benchmark rank), which is intended behavior for an agent-facing default.
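To make the additive-fields claim concrete, here is a minimal sketch of an enriched contribution dict and a consumer that reads only the two pre-existing keys. It is an illustration under assumptions, not the library's code: `contribution` and `legacy_consumer` are hypothetical helpers, `feature` is assumed to be a column index, and the z-score here uses the population standard deviation.

```python
from statistics import mean, pstdev


def contribution(index, value, column, name=None):
    """Build an enriched contribution dict for one feature of one row."""
    mu = mean(column)
    sigma = pstdev(column)
    # Guard against constant columns, where the z-score is undefined.
    z = (value - mu) / sigma if sigma > 0 else 0.0
    return {
        "feature": index,  # pre-existing key (assumed to be a column index)
        "z_score": z,      # pre-existing key
        "name": name if name is not None else f"feature_{index}",
        "value": value,
        "mean": mu,
        "direction": "high" if z >= 0 else "low",
    }


def legacy_consumer(contrib):
    """Code written before the enrichment: touches only the old keys."""
    return contrib["feature"], round(contrib["z_score"], 2)


column = [1.0, 2.0, 3.0, 4.0, 10.0]
enriched = contribution(2, 10.0, column, name="latency_ms")
print(legacy_consumer(enriched))
print(enriched["name"], enriched["direction"])
```

Because the old keys keep their names and meaning, a consumer like `legacy_consumer` continues to work while newer callers can read `name`, `value`, `mean`, and `direction`.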
Tests
80 new tests across test_recover.py, test_validate.py, test_contamination_diagnostics.py, test_ad_engine_compare.py, expanded test_ad_engine.py, and expanded test_mcp_server_import.py. Test infrastructure: test_readme_rst.py falls back to OptionParser on docutils < 0.19; test_thresholds.py skips the class when pythresh's resource loading is broken (Python 3.9 + pythresh 1.1.0 wheel lacks pythresh/models/__init__.py); test_linear_block checks output shape rather than exact equality with torch.zeros(2, 1).
Pull requests bundled
- #678 -- Release v3.4.0
The audit-cycle commits (37e8a6a..d4e10d0) landed directly on development as squash merges from local fix/* branches under the implement-review loop with Codex.
Install
pip install --upgrade pyod
PyPI: https://pypi.org/project/pyod/3.4.0/
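After upgrading, one way to confirm which version is installed is to read the distribution metadata. The helper name `installed_version` is illustrative; `importlib.metadata` is part of the standard library from Python 3.8 onward.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the installed pyod version, or None when pyod is absent.
print(installed_version("pyod"))
```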
Source
- Hindsight observations: `docs/superpowers/research/2026-05-09-agentic-hindsight-observations.md`
- MCP test notes: `docs/superpowers/research/2026-05-09-mcp-spark-test-notes.md`