yzhao062/pyod v3.5.4 on GitHub

Summary

v3.5.4 bundles two things: the KB-tools API for agent-driven and LLM-driven detector routing (staged as v3.5.3 but never published to PyPI), and a claims-honesty remediation pass that aligns the v3 agentic-layer docs, skill, CLI, and examples with what an internal audit could verify. Both halves were reviewed across multiple implement-review rounds with Codex.

KB-Tools API (Surfaces 1 and 2)

ADEngine.get_kb_for_routing(profile, top_k=3, constraints=None) returns a structured KB snapshot of every shipped detector (strengths, weaknesses, best_for, avoid_when, complexity, benchmark rank, modality match), filtered by constraints.exclude_detectors / constraints.data_type_strict and sorted by modality-specific benchmark rank.
ADEngine.make_plan(detector_choices, justifications=None, params=None) validates a caller-chosen ordered detector list against the KB and returns a DetectionPlan consumable by build_detector / run.
ADEngine.plan_detection(profile, *, llm_client=callable, top_k=3, llm_strict=None) accepts a user-supplied (prompt: str) -> str callable wrapping any LLM SDK. The engine builds the routing prompt, invokes the callable, parses the response, and returns the same DetectionPlan. On call or parse failure it falls back to rule routing with a RuntimeWarning; llm_strict=True (or PYOD3_LLM_STRICT=1) re-raises instead.
New pyod/utils/_llm.py: LLMCallable Protocol, RoutingParseError, build_routing_prompt, parse_routing_response.
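
The call-then-fall-back behavior described for plan_detection can be sketched in a few lines. This is a minimal, self-contained illustration of the pattern, not PyOD's implementation: rule_route, parse_response, plan_with_fallback, the detector list, and the comma-separated response format are all hypothetical stand-ins. Having an explicit llm_strict argument take precedence over the PYOD3_LLM_STRICT environment variable is also an assumption.

```python
import os
import warnings
from typing import Callable, List, Optional

# Hypothetical detector names standing in for KB entries.
KNOWN = ["IForest", "ECOD", "KNN", "LOF"]


def rule_route(profile: dict, top_k: int) -> List[str]:
    """Placeholder rule-based router: returns a fixed ranking, truncated."""
    return KNOWN[:top_k]


def parse_response(text: str, top_k: int) -> List[str]:
    """Parse a comma-separated detector list; raise ValueError if unusable."""
    names = [n.strip() for n in text.split(",") if n.strip()]
    if not names or any(n not in KNOWN for n in names):
        raise ValueError(f"unparseable routing response: {text!r}")
    return names[:top_k]


def plan_with_fallback(
    profile: dict,
    llm_client: Callable[[str], str],
    top_k: int = 3,
    llm_strict: Optional[bool] = None,
) -> List[str]:
    # Assumption: an explicit argument wins; otherwise consult the env var.
    if llm_strict is None:
        llm_strict = os.environ.get("PYOD3_LLM_STRICT") == "1"
    prompt = f"Pick up to {top_k} detectors from {KNOWN} for profile {profile}."
    try:
        return parse_response(llm_client(prompt), top_k)
    except Exception as exc:
        if llm_strict:
            raise  # strict mode: surface the call or parse failure
        warnings.warn(
            f"LLM routing failed ({exc}); falling back to rule routing",
            RuntimeWarning,
        )
        return rule_route(profile, top_k)
```

Any function that takes a prompt string and returns a string can be passed as llm_client, so the same sketch works with a stub in tests and with a wrapper around a real SDK in production.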

Claims-Honesty Remediation

Detector count corrected to 60 across the CLI (pyod info), skill prose, pyproject.toml, and docs: the one planned, non-buildable entry no longer counts toward the buildable total.
The separation quality metric is reframed as a descriptive, label-free diagnostic computed from the run's own predicted labels. In ADEngine consensus the computation is circular (labels come from majority vote, scores are rank-averaged), so it is no longer presented as independent correctness evidence.
Determinism guarantee made precise. Shallow detectors are all seeded or deterministic by construction (verified: 0 nondeterministic cases); the random_state docstring states the verified guarantee.
Wording softened: "intelligent" to "lifecycle"; trust and quality language downgraded to diagnostics; no overclaiming of expert-quality output.
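
To see why a separation score computed from a run's own predicted labels cannot serve as independent evidence, consider the sketch below. It uses a deliberately simplified, hypothetical definition of separation (gap between the mean score of flagged and unflagged points, in units of the overall standard deviation) and a single score vector rather than ADEngine's majority vote and rank averaging. The data is pure Gaussian noise with no real anomalies.

```python
import random
import statistics


def separation(scores, labels):
    """Hypothetical separation: (mean flagged - mean unflagged) / overall std."""
    flagged = [s for s, lab in zip(scores, labels) if lab == 1]
    unflagged = [s for s, lab in zip(scores, labels) if lab == 0]
    gap = statistics.mean(flagged) - statistics.mean(unflagged)
    return gap / statistics.pstdev(scores)


rng = random.Random(0)
scores = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # noise only

# Circular case: labels are derived from the same scores (top 10% flagged).
cutoff = sorted(scores)[int(0.9 * len(scores))]
own_labels = [1 if s >= cutoff else 0 for s in scores]
sep_circular = separation(scores, own_labels)

# Reference case: labels assigned independently of the scores.
indep_labels = [1 if rng.random() < 0.1 else 0 for _ in scores]
sep_independent = separation(scores, indep_labels)

print(f"separation with own labels:         {sep_circular:.2f}")
print(f"separation with independent labels: {sep_independent:.2f}")
```

With labels thresholded from the scores themselves, the separation is large by construction even though the data contains no anomalies. With independently assigned labels it stays near zero. That is the sense in which the metric describes the run rather than validating it.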

API Compatibility

Every v3.5.2 caller pattern produces identical output. The new top_k, llm_client, and llm_strict parameters are keyword-only with backward-compatible defaults. LLMCallable is a Protocol, not an inheritance-required base class. No breaking changes.
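
The practical consequence of LLMCallable being a Protocol is that callers do not subclass anything. The sketch below illustrates structural typing with a hypothetical PromptCallable protocol and a hypothetical route function; it does not import or reproduce PyOD's own definitions.

```python
from typing import Protocol


class PromptCallable(Protocol):
    """Hypothetical stand-in for a (prompt: str) -> str protocol."""

    def __call__(self, prompt: str) -> str: ...


def echo_client(prompt: str) -> str:
    # A plain function already has the required call signature.
    return "ECOD"


class WrappedSDK:
    """A class wrapping some SDK; note that it inherits from nothing special."""

    def __call__(self, prompt: str) -> str:
        return "IForest"


def route(client: PromptCallable) -> str:
    # Static type checkers accept any object whose __call__ matches the protocol.
    return client("choose a detector")


print(route(echo_client))
print(route(WrappedSDK()))
```

Both the plain function and the wrapper instance satisfy the protocol by shape alone, which is why existing callables can be passed in without modification.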

Tests

53 new tests for the KB-tools surfaces (schema, filters, ordering, KB validation, top_k clamping, LLM stub and fallback paths, strict-mode precedence), plus count-pinning tests (test_pyod_info_excludes_planned_detectors, test_skill_count_prose_matches_kb). All existing ADEngine tests continue to pass.
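
A count-pinning test of the kind named above might look roughly like the following. This is a sketch under stated assumptions: the KB structure, the status field, the prose string, and the helper names are all hypothetical and do not reflect PyOD's actual knowledge base layout or test code.

```python
import re

# Hypothetical miniature knowledge base; real entries and fields may differ.
KB = {
    "IForest": {"status": "shipped"},
    "ECOD": {"status": "shipped"},
    "FutureDetector": {"status": "planned"},  # must not be counted
}

# Hypothetical documentation sentence whose number should track the KB.
SKILL_PROSE = "This sketch ships 2 detectors."


def buildable_count(kb):
    """Count entries that are not marked as planned."""
    return sum(1 for entry in kb.values() if entry["status"] != "planned")


def count_in_prose(text):
    """Extract the first '<N> detectors' number from a prose string."""
    match = re.search(r"(\d+)\s+detectors", text)
    if match is None:
        raise AssertionError("no detector count found in prose")
    return int(match.group(1))


def test_prose_count_matches_kb():
    # Fails as soon as the prose and the KB drift apart.
    assert count_in_prose(SKILL_PROSE) == buildable_count(KB)
```

Pinning the number in a test means that adding a detector, or promoting a planned one, forces the prose to be updated in the same change.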

Install

pip install --upgrade pyod

PyPI goes from 3.5.2 directly to 3.5.4; 3.5.3 was staged but never published.