PyOD 3.2.1 — od-expert skill correctness + demo redesign
Point release fixing correctness regressions shipped in the v3.2.0 od-expert skill, with a redesigned interactive demo and a pytest safety net to prevent this class of regression going forward. No breaking API changes.
What's fixed
- 17+ API reference errors in the v3.2.0 skill content: `state.plan[...]` → `state.plans`, `state.scores` → `state.consensus['scores']`, `state.best_detector` → `state.analysis['best_detector']`, and a phantom `engine.start(X, y=labels)` supervised path (there is no `y` parameter on `ADEngine.start`). Labelled data should use the classic `XGBOD.fit(X, y)` / `predict(X)` path; this is now what the skill and docs recommend.
- Invented profile keys removed. `state.profile['estimated_contamination']` and `state.profile['encoder']` were cited in v3.2.0 prose but do not exist on `InvestigationState`. The skill now tells the agent to observe `state.analysis['consensus_analysis']['anomaly_ratio']` post-run instead.
- Hardcoded `state.plans[:3]` lists re-probed against a live `ADEngine` and corrected across `references/tabular.md`, `references/time_series.md`, `references/graph.md`, and `references/text_image.md`. The decision tables in those files are now explicitly labeled as expert heuristics, not predictions of `engine.plan` output; the agent is pointed at `state.plans` for the live selection.
- Cardio walkthrough reproducibility. `references/workflow.md` now seeds `np.random.seed(42)`, excludes the trailing label column (`X = df.values[:, :-1]`), and reports the numbers that one-pass code actually produces (172/1831 flagged at the default contamination of 0.1, validation precision 85/172 at recall 85/176).
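
For readers who want the Cardio walkthrough counts as rates, the arithmetic can be reproduced in a few lines. This is only a sketch over the counts quoted in these notes; the variable names are illustrative and nothing here calls PyOD.

```python
# Counts quoted in the Cardio walkthrough item of these release notes.
n_samples = 1831           # rows in the dataset
n_flagged = 172            # rows flagged as anomalous
true_positives = 85        # flagged rows that are labelled outliers
n_labelled_outliers = 176  # denominator of the reported recall

flag_rate = n_flagged / n_samples
precision = true_positives / n_flagged
recall = true_positives / n_labelled_outliers

print(f"flag rate: {flag_rate:.3f}")  # 0.094
print(f"precision: {precision:.3f}")  # 0.494
print(f"recall:    {recall:.3f}")     # 0.483
```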
What's new
- `pyod/test/test_skill_api_refs.py`: a pytest safety net that walks `ADEngine` + `InvestigationState` via live dry runs (tabular / time series / text) and validates every `state.X` / `state.X['a']['b']` / `engine.X(...)` reference in the skill content. Catches invalid keyword arguments through `inspect.signature`. Ships a synthetic negative test that fabricates all five regression shapes (bad attr, bad nested key, bad kwarg, missing method, missing prose attr) and asserts the scanner flags them.
- Redesigned interactive demo at `examples/agentic_demo.html`: now uses a diabetes screening dataset (`examples/data/pima.csv`, 768 patients, 8 features) with dark "od-expert decisions" callouts alongside the agent's turns showing modality triage, top-10 pitfall checks, the 11 adaptive escalation triggers, and the resulting plan. Two-column CSS grid with sticky callouts and overflow guards for narrow viewports. (Previously `Pima Indians Diabetes`; renamed to `diabetes screening` in user-visible text.)
- `scripts/render_agentic_demo.py`: Playwright headless-Chromium script that regenerates `docs/figs/agentic-demo.png` from the HTML demo source. Re-runnable any time the HTML changes so the readthedocs figure stays in sync.
- Expanded docs at `docs/examples/agentic.rst` with a new "What the skill encodes" section documenting the master decision tree, top-10 pitfalls, 11 escalation triggers, on-demand reference files, KB-derived detector list, and CI safety nets.
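
The keyword-argument check mentioned for the safety net can be sketched with the standard library alone. This is a minimal illustration of the idea, not the code in `pyod/test/test_skill_api_refs.py`: `FakeEngine`, `CALL_RE`, and `find_bad_calls` are hypothetical names, and the stand-in `start(self, X)` signature is an assumption that only mirrors the fact that `start` takes no `y`.

```python
import ast
import inspect
import re


class FakeEngine:
    """Hypothetical stand-in for an engine class; only its signatures matter."""

    def start(self, X):
        return {"plans": []}


# Matches simple calls such as `engine.start(X, y=labels)` (no nested parentheses).
CALL_RE = re.compile(r"\bengine\.(\w+)\(([^()]*)\)")


def find_bad_calls(text, engine_cls):
    """Return (call, reason) pairs for engine calls the class cannot accept."""
    problems = []
    for match in CALL_RE.finditer(text):
        method_name, arg_src = match.group(1), match.group(2)
        method = getattr(engine_cls, method_name, None)
        if method is None:
            problems.append((match.group(0), "missing method"))
            continue
        try:
            # Parse the argument list so positional and keyword args are separated.
            call = ast.parse(f"f({arg_src})", mode="eval").body
        except SyntaxError:
            continue  # not valid Python; skipped in this sketch
        positional = [None] * len(call.args)
        keywords = {kw.arg: None for kw in call.keywords if kw.arg is not None}
        try:
            # The leading None stands in for `self` on the unbound method.
            inspect.signature(method).bind(None, *positional, **keywords)
        except TypeError as exc:
            problems.append((match.group(0), str(exc)))
    return problems


sample = "Use engine.start(X); engine.start(X, y=labels) is not a valid call."
for call, reason in find_bad_calls(sample, FakeEngine):
    print(call, "->", reason)
```

Binding dummy values against the signature never executes the method, so a check of this shape stays cheap even when the real call would be expensive.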
Review notes
- 4 rounds of `/implement-review` with Codex on this batch; 9 findings total (3 High + 4 Medium + 2 Low), all resolved.
- Highlight: the new safety net was designed around the exact regression shapes Codex's Round 1 review surfaced, plus Round 2's finding that the first version of the test missed nested dict-key chains and impossible call signatures.
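
To show what a nested dict-key chain check involves, here is a small standard-library sketch. It is not the shipped scanner: `find_bad_refs`, the regexes, and the `fake_state` object with its placeholder values are all hypothetical, built only from attribute names mentioned in these notes.

```python
import re
from types import SimpleNamespace

# `state.attr` followed by zero or more ['key'] subscripts.
REF_RE = re.compile(r"""\bstate\.(\w+)((?:\[['"][^'"\]]+['"]\])*)""")
KEY_RE = re.compile(r"""\[['"]([^'"\]]+)['"]\]""")


def find_bad_refs(text, state):
    """Return (reference, reason) pairs for references that do not resolve."""
    problems = []
    for match in REF_RE.finditer(text):
        attr, subscripts = match.group(1), match.group(2)
        if not hasattr(state, attr):
            problems.append((match.group(0), f"no attribute {attr!r}"))
            continue
        node = getattr(state, attr)
        # Walk the whole chain, not just the first key.
        for key in KEY_RE.findall(subscripts):
            if not isinstance(node, dict) or key not in node:
                problems.append((match.group(0), f"no key {key!r}"))
                break
            node = node[key]
    return problems


# Hypothetical state object; the values are placeholders, not real output.
fake_state = SimpleNamespace(
    plans=[],
    consensus={"scores": []},
    analysis={"best_detector": None, "consensus_analysis": {"anomaly_ratio": 0.0}},
    profile={},
)

sample = "Read state.analysis['consensus_analysis']['anomaly_ratio'], not state.scores."
print(find_bad_refs(sample, fake_state))
```

A check that stops at the first subscript would accept `state.analysis['consensus_analysis']['missing']`; walking every key is what catches that.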
Install
pip install --upgrade pyod
pyod install skill # refresh ~/.claude/skills/od-expert/ with the fixed content
pyod info
v3.0.0 / v3.1.0 / v3.2.0 user code keeps working unchanged.