github Priivacy-ai/spec-kitty v3.2.0a5

latest release: v3.2.0a6
pre-release, 6 hours ago

Fixed

  • CLI auth now consumes the server Tranche 2 contract end to end: logout posts refresh tokens to /oauth/revoke, local credential cleanup failures are reported truthfully, refresh handles benign 409 replay without resubmitting a spent token, and auth doctor --server checks /api/v1/session-status with safe re-authentication guidance (#902).
  • Fix spec-kitty upgrade silently leaving projects in PROJECT_MIGRATION_NEEDED state by stamping schema_version after metadata save (#705, WP01).
  • spec-kitty init in a non-git directory now prints an actionable "run git init" message (#636, WP05).
  • Suppress misleading "shutdown / final-sync" red error lines after a successful spec-kitty agent mission create --json payload (#735, WP06).
  • Deduplicate "Not authenticated, skipping sync" / "token refresh failed" diagnostics to at most once per CLI invocation (#717, WP06).
  • Fix read_events() raising KeyError('wp_id') on DecisionPointOpened / DecisionPointResolved events that share status.events.jsonl with lane-transition events. Restores finalize-tasks / materialize / dashboard for any mission that uses the Decision Moment Protocol (#830, WP08).
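
The read_events() fix in the last item can be pictured with a minimal sketch. It assumes that lane-transition events carry a wp_id field while Decision Moment events do not. The function name read_lane_events, the event_type field, and the sample payloads are hypothetical and are not the project's actual reader.

```python
import json
from typing import Iterable, Iterator


def read_lane_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield only events that carry a wp_id, skipping other event kinds.

    Hypothetical sketch: field names are assumptions, not the real schema.
    """
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines in the JSONL file
        event = json.loads(raw)
        # Use .get() rather than event["wp_id"] so that events without the
        # key (for example decision-point events) do not raise KeyError.
        if event.get("wp_id") is None:
            continue
        yield event


sample = [
    '{"event_type": "LaneTransition", "wp_id": "WP01", "to_lane": "doing"}',
    '{"event_type": "DecisionPointOpened", "decision_id": "hypothetical-id"}',
    '{"event_type": "LaneTransition", "wp_id": "WP02", "to_lane": "done"}',
]
lane_events = list(read_lane_events(sample))
```

Filtering on the presence of the key, rather than on a list of known event names, keeps the reader tolerant of event types added to the shared log later.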

Changed

  • Loosen .python-version from a hard 3.13 pin to 3.11 (the floor declared by pyproject.toml) and restore mypy --strict cleanliness on mission_step_contracts/executor.py (#805, WP03).

Removed

  • Retire the deprecated /spec-kitty.checklist command surface from every supported agent's rendered output. The canonical requirements checklist at kitty-specs/<mission>/checklists/requirements.md is unaffected (#815, supersedes #635, WP04).

Internal

  • Add regression tests confirming --feature aliases stay hidden from --help while remaining accepted (#790, WP07).
  • Add regression test confirming spec-kitty agent decision command shape stays consistent across docs / help / skill snapshots (#774, WP07).
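
The "hidden but still accepted" property tested for --feature can be illustrated with the standard library's argparse. This is a swapped-in technique chosen for a dependency-free sketch and is not the project's CLI framework. The --mission flag name and the demo program name are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="demo")
    # Assumed canonical flag; the real option name may differ.
    parser.add_argument("--mission", dest="mission", help="Mission slug")
    # Hidden alias: still parsed, but omitted from usage and help text.
    parser.add_argument("--feature", dest="mission", help=argparse.SUPPRESS)
    return parser


parser = build_parser()
help_text = parser.format_help()
parsed = parser.parse_args(["--feature", "my-mission"])
```

A regression test in this style asserts two things: the alias string is absent from the help text, and parsing the alias still populates the canonical destination.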

Added

  • Frontend Freddy agent profile — browser-side implementer specialising in HTML/CSS/JavaScript/TypeScript, component frameworks (React, Vue, Svelte), WCAG 2.1 accessibility, Core Web Vitals performance, and frontend testing (vitest, Playwright). Specialises from implementer-ivan. Self-review protocol enforces lint, type-check, unit/component tests, e2e smoke, axe accessibility gate, and bundle budget. Avoidance boundary explicitly names Node Norris's server-side domain.
  • Node Norris agent profile — server-side Node.js implementer specialising in HTTP APIs (Express/Fastify/NestJS), async/Promise discipline, streaming, npm security (npm audit), and integration testing (supertest). Specialises from implementer-ivan. Avoidance boundary explicitly names Frontend Freddy's browser-rendering domain. The two profiles are mutually exclusive by design.
  • BDD paradigm (behaviour-driven-development) — encodes BDD as a three-phase collaboration practice: Discovery (Three Amigos conversations), Formulation (Given/When/Then specifications), and Automation (executable living documentation). References DIRECTIVE_034 and DIRECTIVE_037.
  • BDD Scenario Lifecycle procedure (bdd-scenario-lifecycle) — covers the Formulation → Automation → Maintenance phases that follow an Example Mapping Workshop. Toolchain-agnostic (Cucumber-JVM, Cucumber-JS, Behave, SpecFlow). Encodes four anti-patterns: imperative Gherkin, rubber-stamp scenarios, shared mutable state, and orphaned step definitions.
  • New tactics:
    • reference-architectural-patterns — structured selection of named reference patterns (Layered, Hexagonal, Event-Driven, CQRS, Microservices, Modular Monolith) scored against coupling, scalability, and operational complexity constraints.
    • development-bdd — architecture-level BDD tactic for expressing observable behavioral contracts at system boundaries before implementation; distinct from the existing behavior-driven-development technique tactic.
    • bug-fixing-checklist — language-agnostic test-first defect resolution: write a reproduction test before touching production code.
    • test-readability-clarity-check — dual-perspective reconstruction check: read only tests, reconstruct system understanding, compare against spec to surface documentation gaps.
    • code-documentation-analysis — brownfield boundary discovery by extracting and clustering domain terminology from code and documentation artifacts. Contributes foundational analysis tactics toward the brownfield investigation skill described in #666.
    • terminology-extraction-mapping — systematic extraction and relationship mapping of domain terms across multiple sources to produce a maintainable glossary. Complementary artifact to the bounded-context linguistic discovery approach targeted by #666.
  • Tactic directory normalization — shipped tactics reorganised into four category subdirectories: testing/ (15 tactics), analysis/ (14), communication/ (7), architecture/ (14). Cross-cutting tactics remain in the shipped/ root. The existing rglob loader requires no changes.
  • tasks-finalize command skill — added to CANONICAL_COMMANDS in the agent skills pipeline and deployed to .agents/skills/spec-kitty.tasks-finalize/. Closes the gap where this command was missing from Codex/Vibe skill packages.
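
The note that the rglob loader needs no changes after the directory normalization follows from how recursive globbing works. A minimal sketch, assuming a .tactic.yaml file suffix and a hypothetical discover_tactics helper. The placement of each sample tactic in a category directory is also an assumption.

```python
import tempfile
from pathlib import Path

SUFFIX = ".tactic.yaml"  # assumed file suffix, for illustration only


def discover_tactics(root: Path) -> list[str]:
    """Return tactic ids found anywhere below root, sorted by name."""
    return sorted(p.name.removesuffix(SUFFIX) for p in root.rglob(f"*{SUFFIX}"))


with tempfile.TemporaryDirectory() as tmp:
    shipped = Path(tmp) / "shipped"
    (shipped / "testing").mkdir(parents=True)
    (shipped / "architecture").mkdir()
    # One tactic per category directory, plus one left in the shipped/ root.
    (shipped / "testing" / f"bug-fixing-checklist{SUFFIX}").write_text("id: a\n")
    (shipped / "architecture" / f"development-bdd{SUFFIX}").write_text("id: b\n")
    (shipped / f"cross-cutting-example{SUFFIX}").write_text("id: c\n")
    found = discover_tactics(shipped)
```

Because Path.rglob descends into every subdirectory, moving files one level down changes their paths but not the set of files discovered.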

Changed

  • Profile enrichment — four existing profiles updated with additive tactic and paradigm references:
    • implementer-ivan: bug-fixing-checklist tactic reference (propagates to all specialist profiles via resolve_profile() union merge).
    • reviewer-renata: test-readability-clarity-check and bdd-scenario-lifecycle tactic references; behaviour-driven-development paradigm in context sources.
    • architect-alphonso: development-bdd tactic reference; BDD paradigm, example-mapping-workshop, and bdd-scenario-lifecycle in additional context sources.
    • java-jenny: behavior-driven-development and bdd-scenario-lifecycle tactic references; bdd-scenarios self-review step (Cucumber-JVM + Serenity BDD gate).
  • behavior-driven-development tactic enriched — extended notes with a toolchain landscape section (Cucumber family, Playwright, Selenium, Serenity BDD, custom DSLs; source: patterns.sddevelopment.be/primers/toolchain-and-automation/bdd); three new failure_modes (rubber-stamp scenarios, shared mutable state between scenarios, orphaned step definitions); cross-references to the new BDD paradigm and procedure.
  • tactic-references union-merged in resolve_profile(): tactic-references added to _LIST_FIELDS in src/doctrine/agent_profiles/repository.py. Specialist profiles now inherit base-profile tactic references via _union_merge at resolution time rather than overriding them.
  • Tactic compliance test extended: test_tactic_compliance.py ARTIFACT_DIRS now includes procedure and paradigm types, enabling cross-type reference validation for tactics that reference procedures or paradigms.
  • Shared package boundary cutover (mission shared-package-boundary-cutover-01KQ22DS) — spec-kitty-runtime is no longer a dependency of spec-kitty-cli. The CLI now owns its own runtime internally under src/specify_cli/next/_internal_runtime/; spec-kitty next works from a clean install of spec-kitty-cli alone. spec-kitty-events and spec-kitty-tracker are external PyPI dependencies consumed via their public import surfaces (spec_kitty_events, spec_kitty_tracker). The vendored events tree under src/specify_cli/spec_kitty_events/ has been removed (~23 kLoC). Developers who relied on editable cross-package overrides should consult docs/development/local-overrides.md; operators upgrading from a pre-cutover release should consult docs/migration/shared-package-boundary-cutover.md. Decision rationale recorded in ADR 2026-04-25-1.
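
The union-merge behaviour described for tactic-references in the list above can be sketched as follows. This is a simplified illustration, not the code in repository.py: LIST_FIELDS, union_merge, and resolve are hypothetical stand-ins, and the ordering (base entries first) is an assumption.

```python
from typing import Any

LIST_FIELDS = {"tactic-references"}  # illustrative subset of list-valued fields


def union_merge(base: list[str], child: list[str]) -> list[str]:
    """Order-preserving union: base entries first, then new child entries."""
    merged = list(base)
    for item in child:
        if item not in merged:
            merged.append(item)
    return merged


def resolve(base: dict[str, Any], child: dict[str, Any]) -> dict[str, Any]:
    """Resolve a specialist profile against the base it specialises from."""
    resolved = dict(base)
    for key, value in child.items():
        if key in LIST_FIELDS:
            # List fields accumulate instead of being overridden.
            resolved[key] = union_merge(base.get(key, []), value)
        else:
            resolved[key] = value  # scalar fields: child overrides base
    return resolved


ivan = {"id": "implementer-ivan", "tactic-references": ["bug-fixing-checklist"]}
jenny = {
    "id": "java-jenny",
    "tactic-references": ["behavior-driven-development", "bdd-scenario-lifecycle"],
}
resolved = resolve(ivan, jenny)
```

With an override merge, the specialist would lose bug-fixing-checklist. With a union merge, it keeps the base reference and adds its own.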

Removed

  • constraints.txt — the file existed solely to paper over a transitive pin conflict with the retired spec-kitty-runtime package and is no longer needed.

Fixed

  • spec-kitty agent config list/status now checks global command roots for slash-command agents instead of reporting missing project-local command directories after init.
  • spec-kitty agent config add/sync --create-missing no longer recreates retired project-local command directories for globally managed slash-command agents.
  • spec-kitty agent config remove/sync now removes only the managed command surface for project-local agent directories, preserving unrelated files such as .github/workflows/.

Added — Documentation mission composition rewrite (#502, #461, Phase 6 WP6.4)

  • Documentation mission now runs on the StepContractExecutor composition substrate, mirroring research (#504) and software-dev (#503). The runtime resolves the new composed step contracts ahead of the legacy mission.yaml workflow via the existing _resolve_runtime_template_in_root precedence — no loader changes were required.
  • New runtime sidecar templates: src/specify_cli/missions/documentation/mission-runtime.yaml and src/doctrine/missions/documentation/mission-runtime.yaml.
  • Six shipped step contracts under src/doctrine/mission_step_contracts/shipped/documentation-{discover,audit,design,generate,validate,publish}.step-contract.yaml.
  • Six action doctrine bundles under src/doctrine/missions/documentation/actions/{discover,audit,design,generate,validate,publish}/ (governance guidelines + directive/tactic indices).
  • DRG action nodes and edges for action:documentation/{discover,audit,design,generate,validate,publish} in src/doctrine/graph.yaml.
  • Composition wiring in src/specify_cli/next/runtime_bridge.py: _COMPOSED_ACTIONS_BY_MISSION["documentation"] and a fail-closed guard branch in _check_composed_action_guard() raising a structured error for unknown documentation actions. src/specify_cli/mission_step_contracts/executor.py adds six _ACTION_PROFILE_DEFAULTS entries (researcher-robbie for discover/audit, architect-alphonso for design, implementer-ivan for generate, reviewer-renata for validate/publish).
  • Real-runtime integration walk at tests/integration/test_documentation_runtime_walk.py proving SC-001 / SC-003 / SC-004 from a freshly initialized temp repo.
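
The fail-closed guard and the per-action profile defaults listed above can be sketched together. The action names and profile assignments come from these notes. The error class, the table names without a leading underscore, and the default_profile helper are hypothetical, and rejecting unknown missions is an assumption.

```python
class UnknownComposedActionError(Exception):
    """Hypothetical structured error carrying the rejected mission/action."""

    def __init__(self, mission: str, action: str, allowed: frozenset) -> None:
        self.mission = mission
        self.action = action
        self.allowed = allowed
        super().__init__(
            f"unknown action {action!r} for mission {mission!r}; "
            f"expected one of {sorted(allowed)}"
        )


COMPOSED_ACTIONS_BY_MISSION = {
    "documentation": frozenset(
        {"discover", "audit", "design", "generate", "validate", "publish"}
    ),
}

ACTION_PROFILE_DEFAULTS = {
    ("documentation", "discover"): "researcher-robbie",
    ("documentation", "audit"): "researcher-robbie",
    ("documentation", "design"): "architect-alphonso",
    ("documentation", "generate"): "implementer-ivan",
    ("documentation", "validate"): "reviewer-renata",
    ("documentation", "publish"): "reviewer-renata",
}


def default_profile(mission: str, action: str) -> str:
    """Return the default profile, refusing anything not explicitly listed."""
    allowed = COMPOSED_ACTIONS_BY_MISSION.get(mission, frozenset())
    if action not in allowed:
        # Fail closed: no silent fallback to a legacy path or a generic profile.
        raise UnknownComposedActionError(mission, action, allowed)
    return ACTION_PROFILE_DEFAULTS[(mission, action)]


design_profile = default_profile("documentation", "design")
```

Raising on an unlisted action means a typo in a step contract surfaces as an explicit error rather than as a run under the wrong profile.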

Backward compatibility

  • The legacy src/specify_cli/missions/documentation/mission.yaml and src/doctrine/missions/documentation/mission.yaml files remain on disk for backward reference. Existing documentation-mission projects that authored against the legacy workflow continue to work; runtime template resolution prefers the new mission-runtime.yaml ahead of the legacy file via the existing precedence in _resolve_runtime_template_in_root (no loader changes in this PR).
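
The precedence described above, where mission-runtime.yaml wins over the legacy mission.yaml when both exist, can be sketched as an ordered candidate lookup. This is an illustration under stated assumptions and not the body of _resolve_runtime_template_in_root; the resolve_runtime_template helper is hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Highest precedence first: the runtime sidecar, then the legacy workflow file.
CANDIDATES = ("mission-runtime.yaml", "mission.yaml")


def resolve_runtime_template(mission_dir: Path) -> Optional[Path]:
    """Return the first candidate file that exists, or None if none do."""
    for name in CANDIDATES:
        candidate = mission_dir / name
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    mission_dir = Path(tmp)
    (mission_dir / "mission.yaml").write_text("name: documentation\n")
    legacy_only = resolve_runtime_template(mission_dir).name
    # Adding the sidecar changes the result without touching the legacy file.
    (mission_dir / "mission-runtime.yaml").write_text("name: documentation\n")
    with_runtime = resolve_runtime_template(mission_dir).name
```

Keeping the legacy file on disk costs nothing under this scheme, since it is consulted only when the sidecar is absent.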

Added

  • Upgrade compatibility planner: spec-kitty upgrade now separates CLI
    update guidance from current-project schema compatibility. New flags
    --cli, --project, --yes, and --no-nag support CLI-only guidance,
    project-only migrations, non-interactive confirmation, and explicit nag
    suppression. spec-kitty upgrade --dry-run --json emits the stable
    compatibility-plan contract for automation.
  • Host-surface parity matrix at docs/host-surface-parity.md — authoritative record of how each of the 15 supported host surfaces teaches the advise/ask/do governance-injection contract. Closes the remaining #496 host-surface breadth rollout.
  • Mode of work runtime derivation — every advise, ask, do invocation now records its mode_of_work (advisory, task_execution, mission_step, query) on the started event. Derivation is from the CLI entry command.
  • Correlation links: spec-kitty profile-invocation complete accepts --artifact <path> (repeatable) and --commit <sha> (singular); each appends an additive event to the invocation JSONL for single-file request→artifact/commit correlation.
  • SaaS read-model policy at src/specify_cli/invocation/projection_policy.py — typed module mapping (mode, event) to projection rules. Documented in docs/trail-model.md.
  • Tier 2 SaaS projection decision — decisively documented as deferred in docs/trail-model.md. Tier 2 evidence stays local-only in 3.2.x.
  • README Governance layer subsection — entry point for operators discovering the advise/ask/do surface.
  • Decision Moment Ledger (V1) — new spec-kitty agent decision subgroup with five
    subcommands: open, resolve, defer, cancel, verify. Mints ULID decision_ids
    at interview ask-time, writes paper trail under kitty-specs/<mission>/decisions/
    (index.json + DM-<id>.md), and appends DecisionPointOpened(interview) /
    DecisionPointResolved(interview) events to status.events.jsonl. Local-only;
    no SaaS sync required.
  • Charter integration: spec-kitty charter interview now calls decision open
    before each question and the appropriate terminal command after each answer.
    answers.yaml behavior is unchanged.
  • Specify + Plan template updates: specify.md and plan.md source templates
    gain a Decision Moment Protocol section instructing the LLM to call decision
    subcommands at ask/resolution time and write <!-- decision_id: <id> --> anchors
    for deferred decisions.
  • decision verify gate — scans spec.md / plan.md for
    [NEEDS CLARIFICATION: ...] <!-- decision_id: <id> --> sentinels and
    cross-checks against the decisions index. Exits non-zero on drift
    (DEFERRED_WITHOUT_MARKER, MARKER_WITHOUT_DECISION, STALE_MARKER).
  • Widen Mode (#758): spec-kitty agent decision widen + resolve --from-widen
    lifecycle. Writes widen-pending.jsonl, emits DecisionPointWidened events,
    integrates with charter/specify/plan widen affordances. Surfaces decision
    write-back errors explicitly instead of silently suppressing them.
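
The decision verify gate in the list above cross-checks sentinels in spec.md / plan.md against the decisions index. A minimal sketch of that kind of check follows. The sentinel shape and the three drift codes come from these notes. The meaning assigned to each code, the index shape (id to status), the find_drift helper, and the short placeholder ids (the real ids are ULIDs) are all assumptions.

```python
import re

# Matches: [NEEDS CLARIFICATION: ...] <!-- decision_id: <id> -->
MARKER_RE = re.compile(
    r"\[NEEDS CLARIFICATION:[^\]]*\]\s*<!--\s*decision_id:\s*(\S+)\s*-->"
)


def find_drift(text: str, index: dict[str, str]) -> list[tuple[str, str]]:
    """Compare markers found in text with an id -> status index."""
    marked = set(MARKER_RE.findall(text))
    findings: list[tuple[str, str]] = []
    for decision_id in sorted(marked):
        status = index.get(decision_id)
        if status is None:
            # Assumed meaning: marker points at a decision that was never opened.
            findings.append(("MARKER_WITHOUT_DECISION", decision_id))
        elif status != "deferred":
            # Assumed meaning: decision was closed but the marker was left behind.
            findings.append(("STALE_MARKER", decision_id))
    for decision_id, status in sorted(index.items()):
        if status == "deferred" and decision_id not in marked:
            # Assumed meaning: deferred decision has no anchor in the document.
            findings.append(("DEFERRED_WITHOUT_MARKER", decision_id))
    return findings


spec_text = """
Auth model [NEEDS CLARIFICATION: SSO or password?] <!-- decision_id: D1 -->
Retention [NEEDS CLARIFICATION: how long?] <!-- decision_id: D2 -->
Region [NEEDS CLARIFICATION: EU only?] <!-- decision_id: D9 -->
"""
index = {"D1": "deferred", "D2": "resolved", "D3": "deferred"}
findings = find_drift(spec_text, index)
exit_code = 1 if findings else 0  # non-zero on any drift
```

Checking in both directions (markers against the index, then the index against markers) is what lets a single pass report all three drift kinds.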

Changed

  • Project schema compatibility is now enforced by the centralized compat
    planner. Out-of-date CLI notices are passive and throttled; incompatible
    project schemas block unsafe commands with exit codes 4, 5, or 6 and exact
    remediation guidance.
  • spec-kitty profile-invocation complete --evidence is now mode-gated: rejected on advisory / query invocations with InvalidModeForEvidenceError. Rejection occurs before any write; the invocation stays open.
  • _propagate_one consults the new projection policy after the sync-gate and authentication lookup. Existing task_execution / mission_step projection behaviour is preserved exactly.
  • Dashboard user-visible wording: the mission selector, current-mission header, overview heading, analysis heading, and empty-state prompt now read "Mission Run" / "mission" instead of "Feature". Backend identifiers (CSS classes, HTML IDs, cookie keys, API route segments, JSON field names) are unchanged.
  • spec-kitty-events bumped to ==4.0.0 — vendored copy at
    src/specify_cli/spec_kitty_events/ refreshed. Introduces
    DecisionPointOpenedInterviewPayload, DecisionPointResolvedInterviewPayload,
    OriginSurface.PLANNING_INTERVIEW (origin_surface: planning_interview),
    OriginFlow enum (values specify, plan), DecisionPointWidened, and
    TerminalOutcome enum.
  • [tool.uv.sources] redirects spec-kitty-events to ../spec-kitty-events/
    in editable mode for monorepo development. Dev-only; ignored by pip / PyPI.
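
The mode gate on --evidence described in this list can be sketched as a check that runs before any mutation. The error name and the four mode names come from these notes. The record shape and the complete_invocation signature below are hypothetical. Records without a mode_of_work are treated as accepting evidence, matching the migration notes.

```python
from typing import Optional


class InvalidModeForEvidenceError(Exception):
    """Stand-in for the error named in the release notes."""


EVIDENCE_REJECTING_MODES = {"advisory", "query"}


def complete_invocation(record: dict, evidence: Optional[str] = None) -> dict:
    """Close an invocation record, attaching evidence only where allowed."""
    mode = record.get("mode_of_work")  # None for records that predate the field
    # Validate first, so a rejection leaves the record untouched and still open.
    if evidence is not None and mode in EVIDENCE_REJECTING_MODES:
        raise InvalidModeForEvidenceError(f"--evidence not allowed for {mode!r}")
    record["status"] = "completed"
    if evidence is not None:
        record["evidence"] = evidence
    return record


advisory = {"mode_of_work": "advisory", "status": "open"}
try:
    complete_invocation(advisory, evidence="notes.md")
    rejected = False
except InvalidModeForEvidenceError:
    rejected = True

legacy = complete_invocation({"status": "open"}, evidence="notes.md")
```

Placing the check ahead of every write is what makes "the invocation stays open" hold after a rejection.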

Deferred

  • spec-kitty explain (issue #534) remains deferred to Phase 5 pending DRG glossary addressability (#499, #759).

Out of scope (tracked separately)

  • SaaS sync projection for widened decisions — tracked in spec-kitty-saas#110, #111.
  • Tasks-phase interview support — future mission.

Migration notes

No operator action required for routine upgrade. The trail model is additive:

  • Pre-mission invocation records (no mode_of_work) continue to accept --evidence and project under legacy task_execution rules.
  • Existing SaaS dashboards see no change for task_execution / mission_step traffic.
  • New advisory events now appear in the SaaS timeline as minimal entries without body — this is a deliberate behaviour change documented in the SaaS Read-Model Policy table.
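
The three migration notes above can be read as lookups in a (mode, event) policy table. The sketch below is not the contents of projection_policy.py. ProjectionRule, the table entries, and the local-only fallback for unlisted pairs are assumptions. Only the legacy fallback to task_execution rules and the body-less advisory entries are taken from these notes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProjectionRule:
    project: bool       # send anything to the read model at all?
    include_body: bool  # include the event body, or only a minimal entry?


FULL = ProjectionRule(project=True, include_body=True)
MINIMAL = ProjectionRule(project=True, include_body=False)
LOCAL_ONLY = ProjectionRule(project=False, include_body=False)  # assumed default

POLICY = {
    ("task_execution", "started"): FULL,
    ("mission_step", "started"): FULL,
    ("advisory", "started"): MINIMAL,
}


def rule_for(mode: Optional[str], event: str) -> ProjectionRule:
    """Look up the rule; records without a mode use task_execution rules."""
    effective_mode = mode if mode is not None else "task_execution"
    return POLICY.get((effective_mode, event), LOCAL_ONLY)


legacy_rule = rule_for(None, "started")
advisory_rule = rule_for("advisory", "started")
```

Expressing the policy as data keeps the behaviour change for advisory events visible in one table row rather than spread across conditionals.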

Added (Phase 4 trail follow-on)

  • docs/trail-model.md: Formal operator documentation for the Phase 4 trail contract,
    mode-of-work taxonomy, tier promotion rules, SaaS projection policy, intake positioning,
    and explain deferral (WP04).
  • "Governance context injection" section in .agents/skills/spec-kitty.advise/SKILL.md
    for Codex/Vibe hosts, enabling Tier 1 trail recording without host-side SaaS auth (WP03).
  • "Standalone invocations (outside missions)" section in
    src/doctrine/skills/spec-kitty-runtime-next/SKILL.md for Claude Code and gstack hosts,
    covering when to open an invocation record outside the mission workflow (WP04).
  • End-to-end invocation integration tests in
    tests/specify_cli/invocation/test_invocation_e2e.py covering Tier 1 JSONL write,
    complete-event append, local-only list read, and sync-gate suppression (WP05).

Fixed

  • propagator.py (_propagate_one): Invocation events are now suppressed when
    effective_sync_enabled = False, even when the user is authenticated. Previously,
    sync-disabled checkouts could still emit SaaS events if a WebSocket client was
    connected (WP01).
  • executor.complete_invocation now calls promote_to_evidence() when the
    --evidence flag is supplied, enabling correct Tier 2 artifact promotion (WP03).
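
The sync-gate fix above comes down to gate ordering: the sync setting is checked before authentication, and an open connection does not override it. A minimal sketch under that reading follows. The propagate_one signature and the send callback are hypothetical and not the code in propagator.py.

```python
from typing import Callable


def propagate_one(
    event: dict,
    *,
    effective_sync_enabled: bool,
    authenticated: bool,
    send: Callable[[dict], None],
) -> bool:
    """Send the event only if every gate passes; report whether it was sent."""
    # Gate 1: sync disabled means nothing leaves the checkout, regardless of
    # authentication state or any connected client.
    if not effective_sync_enabled:
        return False
    # Gate 2: without credentials there is nowhere to send the event.
    if not authenticated:
        return False
    send(event)
    return True


sent: list = []
suppressed = propagate_one(
    {"type": "started"},
    effective_sync_enabled=False,
    authenticated=True,
    send=sent.append,
)
delivered = propagate_one(
    {"type": "started"},
    effective_sync_enabled=True,
    authenticated=True,
    send=sent.append,
)
```

Returning early on the sync gate means the send callback is never reached for sync-disabled checkouts, which is the behaviour the fix describes.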

Changed

  • Issue #496: Priority-surface slice complete in 3.2.x (Claude Code via
    spec-kitty-runtime-next doctrine skill, Codex CLI via SKILL.md governance context
    injection). Remaining 9 surfaces tracked in #496 for a follow-on patch or Phase 5.
  • Issue #534: spec-kitty explain explicitly deferred to Phase 5 (requires DRG
    glossary addressability, issue #499). A partial implementation without glossary
    citations would be misleading.
