github uditgoenka/autoresearch v2.1.0
v2.1.0 — Complete Rebuild: Modular Architecture

one day ago


Everything in this release has been rewritten from scratch: the codebase, the documentation, the guides, and the distribution system.


Why a Complete Rewrite?

The v2.0.x monolithic SKILL.md (813 lines, ~100K tokens) loaded on every single invocation — even for a quick /autoresearch:plan. This was unsustainable:

  • Token waste: ~100K tokens per invocation regardless of which command you used
  • Slow cold starts: LLMs had to parse 813 lines before doing anything useful
  • Tangled protocols: 13 reference files with overlapping, hard-to-maintain instructions
  • Fragile distribution: Separate sync-opencode.sh and sync-codex.sh scripts that frequently drifted

v2.1.0 solves all of this with a ground-up architectural rebuild.


Architecture: Monolith → Modular

Before (v2.0.x)

SKILL.md (813 lines, ~100K tokens) ← loaded EVERY invocation
├── 13 reference files (overlapping workflows)
├── autoresearch-command-spec.json (JSON command registry)
├── autoresearch_cli.py (Python wrapper)
├── sync-opencode.sh (manual sync)
└── sync-codex.sh (manual sync)

After (v2.1.0)

SKILL.md (41 lines, ~2K tokens) ← thin routing table only
├── 12 self-contained command files (~100 lines each, ~5-8K tokens)
├── 3 focused reference files (loaded only when needed)
└── scripts/transform.sh (single multi-platform transform)

Result: ~95% token reduction per invocation. Only the command you invoke gets loaded. A /autoresearch:plan call loads ~5K tokens instead of ~100K.
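
The headline figure can be checked with a few lines of arithmetic. This is a rough sketch that uses only the approximate token counts quoted above; whether the routing table is counted alongside the command file is an assumption made here, not something the release states.

```python
# Approximate figures quoted in the release notes.
OLD_TOKENS = 100_000             # v2.0.x monolithic SKILL.md
ROUTER_TOKENS = 2_000            # v2.1.0 SKILL.md routing table
COMMAND_TOKENS = (5_000, 8_000)  # quoted range for a single command file


def reduction(old: int, new: int) -> float:
    """Return the fractional token saving when `new` replaces `old`."""
    return (old - new) / old


for command_tokens in COMMAND_TOKENS:
    # Assumption: the routing table and one command file are both loaded.
    total = ROUTER_TOKENS + command_tokens
    print(f"{total:>6} tokens loaded -> {reduction(OLD_TOKENS, total):.0%} saved")
```

Counting only a ~5K command file gives the quoted ~95%; adding the ~2K routing table or a larger command file puts the saving closer to 90-93%.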


12 Commands (was 11 — evals is new)

Every command has been rewritten from scratch with bounded defaults, universal flags, and chain handoff support.

| Command | Type | Default | Purpose |
|---|---|---|---|
| /autoresearch | Loop | 25 iterations | Core metric-driven improvement loop |
| /autoresearch:plan | One-shot | | Goal → config wizard (generates loop config) |
| /autoresearch:debug | Loop | 15 iterations | Hypothesis-driven bug investigation |
| /autoresearch:fix | Loop | 20 iterations | Error-count reduction (error → zero) |
| /autoresearch:security | Loop | 15 iterations | STRIDE + OWASP security audit with auto-fix |
| /autoresearch:ship | Linear | 8 phases | Pre-merge quality pipeline |
| /autoresearch:scenario | Loop | 20 iterations | 12-dimension edge case exploration |
| /autoresearch:predict | One-shot | | 5-persona swarm analysis + debate |
| /autoresearch:learn | Loop | 10 iterations | Autonomous documentation engine |
| /autoresearch:reason | Loop | 8 iterations | Adversarial refinement (judge + critic) |
| /autoresearch:probe | Loop | 15 rounds | Requirement interrogation (8 personas) |
| /autoresearch:evals | One-shot (NEW) | | TSV analysis: trends, plateaus, anomalies |

New: /autoresearch:evals

One-shot analysis of any *-results.tsv file. Dynamically detects TSV columns, identifies trends, plateaus, velocity changes, regressions, and diminishing returns. Adaptive checkpoints at floor(max_iterations/3) provide mid-loop feedback during long runs. Backward compatible with v2.0.x TSV format.

Can also be invoked inline on any looping command via --evals or --evals-interval N flags.
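
To make the kind of analysis described here concrete, the following is a minimal sketch of dynamic column detection and plateau detection over a TSV. It is not the evals implementation: the column names (`iteration`, `metric`, `status`), the sample rows, and the `tolerance` parameter are all hypothetical.

```python
import csv
import io


def numeric_columns(rows: list[dict[str, str]]) -> list[str]:
    """Return the column names whose every value parses as a float."""
    def is_number(value: str) -> bool:
        try:
            float(value)
            return True
        except (TypeError, ValueError):
            return False

    if not rows:
        return []
    return [name for name in rows[0] if all(is_number(r[name]) for r in rows)]


def plateau_length(values: list[float], tolerance: float = 1e-9) -> int:
    """Count trailing steps whose change stays within `tolerance`."""
    steps = 0
    for prev, cur in zip(reversed(values[:-1]), reversed(values[1:])):
        if abs(cur - prev) > tolerance:
            break
        steps += 1
    return steps


# Hypothetical results file; real *-results.tsv columns may differ.
SAMPLE_TSV = (
    "iteration\tmetric\tstatus\n"
    "1\t0.50\tkeep\n"
    "2\t0.62\tkeep\n"
    "3\t0.70\tkeep\n"
    "4\t0.70\tdiscard\n"
    "5\t0.70\tdiscard\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE_TSV), delimiter="\t"))
columns = numeric_columns(rows)
metric = [float(r["metric"]) for r in rows]
print("numeric columns:", columns)
print("trailing flat steps:", plateau_length(metric))
```

Detecting columns from the header row rather than hard-coding them is what would let one analyzer read both older and newer TSV layouts.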


Bounded Defaults

Every looping command now has a sensible default iteration count. No more "runs forever unless you stop it":

  • Iterations: N — explicit cap
  • Iterations: unlimited — opt-in infinite mode (you must ask for it)
  • Default values are tuned per command: core loop (25) needs more iterations than reason (8)
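
The resolution rule above can be sketched as follows. The default values come from the command table in this release; the function name, the regular expression, and the idea of scanning free text for `Iterations:` are assumptions made for illustration.

```python
import re
from typing import Optional

# Defaults taken from the command table (probe counts rounds, not iterations).
DEFAULT_ITERATIONS = {
    "/autoresearch": 25,
    "/autoresearch:debug": 15,
    "/autoresearch:fix": 20,
    "/autoresearch:security": 15,
    "/autoresearch:scenario": 20,
    "/autoresearch:learn": 10,
    "/autoresearch:reason": 8,
    "/autoresearch:probe": 15,
}

_ITERATIONS = re.compile(r"Iterations:\s*(unlimited|\d+)", re.IGNORECASE)


def resolve_iterations(command: str, prompt: str) -> Optional[int]:
    """Return the iteration cap, or None when unlimited mode was requested."""
    match = _ITERATIONS.search(prompt)
    if match is None:
        return DEFAULT_ITERATIONS[command]  # bounded default
    value = match.group(1).lower()
    return None if value == "unlimited" else int(value)


print(resolve_iterations("/autoresearch:reason", "tighten the argument"))
print(resolve_iterations("/autoresearch", "Iterations: 40"))
print(resolve_iterations("/autoresearch:fix", "Iterations: unlimited"))
```

Returning `None` only on an explicit `unlimited` keeps infinite mode strictly opt-in.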

Chain Handoff via handoff.json

Commands produce structured handoff.json files that downstream commands consume automatically:

/autoresearch:predict --chain debug,fix,ship

This runs predict → passes findings to debug → passes root causes to fix → passes changes to ship. Zero manual context transfer between commands.
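
The handoff flow can be pictured with a small sketch. The release does not document the handoff.json schema, so the keys used here (`findings`, `root_causes`, `changes`, `shipped`) and the stub stages are hypothetical; only the file name and the chain order come from the text above.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

Handoff = dict
Stage = Callable[[Handoff], Handoff]


def parse_chain(flag_value: str) -> list[str]:
    """Split the value of --chain ("debug,fix,ship") into ordered targets."""
    return [name.strip() for name in flag_value.split(",") if name.strip()]


def run_chain(first: str, targets: list[str],
              stages: dict[str, Stage], workdir: Path) -> Handoff:
    """Run `first`, then each target, passing handoff.json between them."""
    handoff_path = workdir / "handoff.json"
    handoff: Handoff = {}
    for name in [first, *targets]:
        handoff = stages[name](handoff)                  # consume upstream output
        handoff_path.write_text(json.dumps(handoff, indent=2))
        handoff = json.loads(handoff_path.read_text())   # downstream reads the file
    return handoff


# Stub stages standing in for the real commands (hypothetical payloads).
stages: dict[str, Stage] = {
    "predict": lambda h: {"source": "predict", "findings": ["hypothetical finding"]},
    "debug": lambda h: {"source": "debug", "root_causes": h["findings"]},
    "fix": lambda h: {"source": "fix", "changes": h["root_causes"]},
    "ship": lambda h: {"source": "ship", "shipped": h["changes"]},
}

with tempfile.TemporaryDirectory() as tmp:
    result = run_chain("predict", parse_chain("debug,fix,ship"), stages, Path(tmp))
print(result)
```

Round-tripping through the file on every step mirrors the idea that each command reads only what the previous one wrote, with no shared in-memory state.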


Universal Flags (All Commands)

| Flag | Purpose |
|---|---|
| Iterations: N | Override default iteration count |
| Iterations: unlimited | Remove iteration cap |
| --evals | Run evals checkpoint at floor(N/3) |
| --evals-interval N | Custom evals checkpoint interval |
| --chain <targets> | Chain to downstream command(s) on completion |
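
As a rough illustration of how the two evals flags might interact, the sketch below computes checkpoint iterations. It assumes a checkpoint recurs every `floor(N/3)` iterations (or every `--evals-interval` iterations when given); the release only states the `floor(N/3)` figure, so the recurrence and the function name are assumptions.

```python
from typing import Optional


def checkpoint_iterations(max_iterations: int,
                          interval: Optional[int] = None) -> list[int]:
    """Return the iterations at which an evals checkpoint would run."""
    # Default interval is floor(N/3); never let it drop below 1.
    step = interval if interval is not None else max(1, max_iterations // 3)
    if step < 1:
        raise ValueError("interval must be a positive integer")
    return list(range(step, max_iterations + 1, step))


print(checkpoint_iterations(25))      # default interval for the core loop
print(checkpoint_iterations(8))       # default interval for reason
print(checkpoint_iterations(25, 10))  # custom --evals-interval 10
```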

Multi-Platform Support

All three platforms are supported from a single canonical source:

| Platform | Syntax | Distribution |
|---|---|---|
| Claude Code | /autoresearch:debug | .claude/commands/ + .claude/skills/ |
| OpenCode | /autoresearch_debug | .opencode/commands/ + .opencode/skills/ |
| Codex | $autoresearch debug | plugins/autoresearch/ + .agents/skills/ |

Single transform script: scripts/transform.sh replaces the old sync-opencode.sh + sync-codex.sh pair. One command generates all platform distributions.
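
The per-platform syntax differences are mechanical, which is what makes a single transform feasible. The sketch below rewrites a canonical Claude Code command name into the other two syntaxes. It is not the logic of scripts/transform.sh; the function name, the platform keys, and the handling of the bare `/autoresearch` command are assumptions.

```python
def to_platform(command: str, platform: str) -> str:
    """Rewrite a canonical command such as "/autoresearch:debug" for a platform."""
    if not command.startswith("/autoresearch"):
        raise ValueError(f"not an autoresearch command: {command!r}")
    _, _, sub = command.partition(":")  # sub is "" for the core command
    if platform == "claude":
        return command
    if platform == "opencode":
        return f"/autoresearch_{sub}" if sub else "/autoresearch"
    if platform == "codex":
        return f"$autoresearch {sub}" if sub else "$autoresearch"
    raise ValueError(f"unknown platform: {platform!r}")


for target in ("claude", "opencode", "codex"):
    print(target, "->", to_platform("/autoresearch:debug", target))
```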


Documentation — Completely Rewritten

Every documentation file has been rewritten from scratch for v2.1.0:

README.md (rewritten)

  • v2.1.0 badge, 12-command table with defaults
  • Updated architecture tree, FAQ, install instructions
  • Evals section, bounded defaults explanation

docs/ (7 files — all rewritten)

  • system-architecture.md — Mermaid component + data flow diagrams for v2.1.0 modular architecture
  • codebase-summary.md — updated file inventory, key decisions table
  • code-standards.md — self-contained command file pattern, naming conventions
  • project-overview-pdr.md — product requirements for v2.1.0
  • development-roadmap.md — historical milestones through v2.1.0
  • changelog.md — full version history
  • project-changelog.md — detailed change log for v2.1.0

guide/ (18 files — all rewritten or new)

  • README.md — v2.1.0 guide index with 12 commands
  • getting-started.md — all 3 platforms, bounded defaults, chain handoff
  • Individual command guides (12 files) — each rewritten with flags tables, examples, chain patterns
  • autoresearch-evals.md: brand-new guide for the evals command
  • chains-and-combinations.md — handoff.json protocol, all 12 commands
  • examples-by-domain.md — 13 domains with v2.1.0 syntax
  • advanced-patterns.md — guards, CI/CD, MCP, transform.sh
  • autoresearch-codex.md — v2.1.0 Codex distribution (no more JSON spec)

Other docs (rewritten)

  • CONTRIBUTING.md — v2.1.0 repo structure, new command pattern, transform.sh
  • COMPARISON.md — 12 commands, evals feature, updated architecture comparison

Files Removed

| File | Reason |
|---|---|
| plugins/autoresearch/resources/autoresearch-command-spec.json | Command contracts now live in individual command files |
| plugins/autoresearch/scripts/autoresearch_cli.py | Python wrapper CLI no longer needed |
| plugins/autoresearch/scripts/install_local_plugin.py | Replaced by scripts/install.sh |
| scripts/sync-opencode.sh | Replaced by scripts/transform.sh |
| scripts/sync-codex.sh | Replaced by scripts/transform.sh |
| 13 old reference files in claude-plugin/ | Replaced by 3 focused reference files |

Reference Files: 13 → 3

The old 13 workflow reference files (autonomous-loop-protocol, core-principles, debug-workflow, fix-workflow, learn-workflow, plan-workflow, predict-workflow, probe-workflow, reason-workflow, results-logging, scenario-workflow, security-workflow, ship-workflow) have been replaced by 3 focused files that are only loaded when their specific command needs them:

| File | Used by |
|---|---|
| references/security-checklist.md | /autoresearch:security |
| references/predict-personas.md | /autoresearch:predict |
| references/reason-judge-protocol.md | /autoresearch:reason |

All other protocol is embedded directly in the self-contained command files — no external loading needed.
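
The loading rule amounts to a small lookup: one command file, plus a reference file only for the three commands that need one. In the sketch below the reference paths come from the list above, while the `commands/<name>.md` layout and the function name are hypothetical.

```python
# Reference files named in this release, keyed by subcommand.
REFERENCE_FILES = {
    "security": "references/security-checklist.md",
    "predict": "references/predict-personas.md",
    "reason": "references/reason-judge-protocol.md",
}


def files_to_load(subcommand: str, command_dir: str = "commands") -> list[str]:
    """Return the files a single invocation would need to read."""
    # Hypothetical layout: one self-contained file per command.
    paths = [f"{command_dir}/{subcommand}.md"]
    reference = REFERENCE_FILES.get(subcommand)
    if reference is not None:
        paths.append(reference)  # loaded only when this command needs it
    return paths


print(files_to_load("plan"))
print(files_to_load("security"))
```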


Install

npx skills add uditgoenka/autoresearch

Or manual install via scripts/install.sh.


Migration from v2.0.x

No action needed — the plugin system handles the update automatically. Your existing TSV result files are backward compatible with the new evals command.

If you have custom scripts referencing old files (autoresearch-command-spec.json, sync-opencode.sh, sync-codex.sh), update them to use scripts/transform.sh and the individual command files.
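
One way to find such references is a simple text scan. The sketch below is an illustrative helper rather than part of the release: the file names and their replacements come from the removed-files list, and the function name and example string are hypothetical.

```python
# Removed files and what the release notes say replaces them.
REPLACEMENTS = {
    "autoresearch-command-spec.json": "the individual command files",
    "install_local_plugin.py": "scripts/install.sh",
    "sync-opencode.sh": "scripts/transform.sh",
    "sync-codex.sh": "scripts/transform.sh",
}


def stale_references(script_text: str) -> dict[str, str]:
    """Map each removed file mentioned in `script_text` to its replacement."""
    return {old: new for old, new in REPLACEMENTS.items() if old in script_text}


# Hypothetical line from a custom build script.
example = "bash scripts/sync-codex.sh && cat autoresearch-command-spec.json"
for old, new in stale_references(example).items():
    print(f"{old}: now use {new}")
```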


Full Changelog

v2.0.04...v2.1.0
