PyOD 3.2.1 — od-expert skill correctness + demo redesign

Point release fixing correctness regressions shipped in the v3.2.0 od-expert skill, with a redesigned interactive demo and a pytest safety net to prevent this class of regression going forward. No breaking API changes.

What's fixed

17+ API reference errors corrected in the v3.2.0 skill content: state.plan[...] → state.plans, state.scores → state.consensus['scores'], state.best_detector → state.analysis['best_detector'], plus removal of a phantom engine.start(X, y=labels) supervised path (ADEngine.start has no y parameter). Labelled data should go through the classic XGBOD.fit(X, y) / predict(X) path, which is what the skill and docs now recommend.
Invented profile keys removed. state.profile['estimated_contamination'] and state.profile['encoder'] were cited in v3.2.0 prose but do not exist on InvestigationState. The skill now tells the agent to observe state.analysis['consensus_analysis']['anomaly_ratio'] post-run instead.
Hardcoded state.plans[:3] lists re-probed against a live ADEngine and corrected across references/tabular.md, references/time_series.md, references/graph.md, and references/text_image.md. The decision tables in those files are now explicitly labeled as expert heuristics, not predictions of engine.plan output; the agent is pointed at state.plans for the live selection.
Cardio walkthrough reproducibility. references/workflow.md now calls np.random.seed(42), excludes the trailing label column (X = df.values[:, :-1]), and reports the numbers that a single pass of the code actually produces (172/1831 flagged at the default contamination of 0.1, validation precision 85/172 at recall 85/176).
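As a quick sanity check on the Cardio figures quoted above, the ratios can be reduced to decimals with a few lines of Python. This is only an arithmetic sketch: it assumes the usual reading of the fractions (85 true positives among 172 flagged rows, and 176 labelled anomalies in total), and the variable names are illustrative rather than taken from references/workflow.md.

```python
from fractions import Fraction

# Counts quoted in the release notes for the Cardio walkthrough.
flagged, total_rows = 172, 1831            # rows flagged / rows scored
true_positives, labelled_anomalies = 85, 176

# Exact ratios first, so rounding only happens when printing.
flag_rate = Fraction(flagged, total_rows)
precision = Fraction(true_positives, flagged)
recall = Fraction(true_positives, labelled_anomalies)

# F1 is not quoted in the notes; it is derived here as the harmonic
# mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

for name, value in [("flag rate", flag_rate), ("precision", precision),
                    ("recall", recall), ("F1", f1)]:
    print(f"{name}: {value} = {float(value):.3f}")
```

Under these assumptions the flagged share comes out a little below the 0.1 contamination setting, and precision and recall both sit just under one half.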

What's new

pyod/test/test_skill_api_refs.py — a pytest safety net that walks ADEngine + InvestigationState via live dry runs (tabular / time series / text) and validates every state.X / state.X['a']['b'] / engine.X(...) reference in the skill content. Catches invalid keyword arguments through inspect.signature. Ships a synthetic negative test that fabricates all five regression shapes (bad attr, bad nested key, bad kwarg, missing method, missing prose attr) and asserts the scanner flags them.
Redesigned interactive demo at examples/agentic_demo.html. It now uses a diabetes screening dataset (examples/data/pima.csv, 768 patients, 8 features), with dark "od-expert decisions" callouts alongside the agent's turns showing modality triage, top-10 pitfall checks, the 11 adaptive escalation triggers, and the resulting plan. The layout is a two-column CSS grid with sticky callouts and overflow guards for narrow viewports. (The dataset was previously labeled Pima Indians Diabetes; user-visible text now says diabetes screening.)
scripts/render_agentic_demo.py — Playwright headless-Chromium script that regenerates docs/figs/agentic-demo.png from the HTML demo source. Re-runnable any time the HTML changes so the readthedocs figure stays in sync.
Expanded docs at docs/examples/agentic.rst with a new "What the skill encodes" section documenting the master decision tree, top-10 pitfalls, 11 escalation triggers, on-demand reference files, KB-derived detector list, and CI safety nets.
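The idea behind the test_skill_api_refs.py safety net described above can be sketched in plain Python. The example below is not the shipped test. FakeState and FakeEngine are hypothetical stand-ins for InvestigationState and ADEngine, and the regular expressions and helper names are assumptions made for illustration. It shows two of the checks the notes mention: resolving state.X / state.X['a']['b'] references against a live object, and rejecting keyword arguments via inspect.signature.

```python
import inspect
import re


class FakeState:
    """Hypothetical stand-in for InvestigationState (illustration only)."""

    def __init__(self):
        self.plans = ["detector_a", "detector_b"]
        self.analysis = {"consensus_analysis": {"anomaly_ratio": 0.09}}


class FakeEngine:
    """Hypothetical stand-in for ADEngine; start() takes no y parameter."""

    def start(self, X):
        return FakeState()


# Matches references such as state.plans or state.analysis['a']['b'].
STATE_REF = re.compile(r"\bstate\.([A-Za-z_]\w*)((?:\[['\"][^'\"]+['\"]\])*)")
KEY = re.compile(r"\[['\"]([^'\"]+)['\"]\]")


def check_state_refs(text, state):
    """Return the state references in text that do not resolve on state."""
    bad = []
    for match in STATE_REF.finditer(text):
        attr, chain = match.group(1), match.group(2)
        if not hasattr(state, attr):
            bad.append(match.group(0))
            continue
        node = getattr(state, attr)
        # Walk each quoted key in the chain through nested dicts.
        for key in KEY.findall(chain):
            if not isinstance(node, dict) or key not in node:
                bad.append(match.group(0))
                break
            node = node[key]
    return bad


def invalid_kwargs(func, kwargs):
    """Return the keyword names that func's signature cannot accept."""
    params = inspect.signature(func).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []  # a **kwargs parameter accepts any keyword
    accepted = {
        name
        for name, p in params.items()
        if p.kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                      inspect.Parameter.KEYWORD_ONLY)
    }
    return sorted(k for k in kwargs if k not in accepted)


engine = FakeEngine()
state = engine.start(X=[[0.0], [1.0]])
prose = ("Read state.plans and "
         "state.analysis['consensus_analysis']['anomaly_ratio'], "
         "not state.plan[0] or state.profile['encoder'].")
bad_refs = check_state_refs(prose, state)
bad_kwargs = invalid_kwargs(engine.start, {"X": None, "y": None})
```

Validating against an object produced by a real dry run, rather than a hand-maintained list of allowed names, means the check follows the API as it changes. The stand-in classes here only play that role for the sketch.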

Review notes

4 rounds of /implement-review with Codex on this batch; 9 findings total (3 High + 4 Medium + 2 Low), all resolved.
Highlight: the new safety net was designed around the exact regression shapes Codex's Round 1 review surfaced, plus Round 2's finding that the first version of the test missed nested dict-key chains and impossible call signatures.

Install

pip install --upgrade pyod
pyod install skill    # refresh ~/.claude/skills/od-expert/ with the fixed content
pyod info

v3.0.0 / v3.1.0 / v3.2.0 user code keeps working unchanged.

yzhao062/pyod v3.2.1 on GitHub
