github uditgoenka/autoresearch v1.9.11
v1.9.11 — Crash Recovery, Metric-Valued Guards, Plateau Detection & Validation Hardening

latest releases: v2.1.2, v2.1.1, v2.1.0...
one month ago

What's New

Metric Extraction Validation (PR #63)

  • Mandatory numeric validation — extracted values must match ^-?[0-9]+\.?[0-9]*$ before any decision logic runs
  • metric-error status — new iteration status for non-numeric extraction failures
  • Two-consecutive-error halt — stops the loop (even unbounded) when the verify pipeline is confirmed broken
  • Diagnostic output — shows raw verify output on failure so the problem is visible
  • Whitespace trim — strips leading/trailing whitespace before validation
  • macOS grep compatibility — uses grep -oE instead of grep -oP throughout

Plateau Detection (PR #64)

  • Unbounded mode safety net — tracks best_metric and iterations_since_best to detect when the loop burns tokens without real progress
  • Configurable patiencePlateau-Patience: N (default 15), or Plateau-Patience: off for overnight runs
  • Three-option pause UI — stop / continue with reset patience / change strategy
  • Measured-only counting — skips no-op, crash, metric-error, hook-blocked iterations
  • Bounded mode unaffected — iteration limit already bounds the run

Session Crash Recovery (PR #65)

  • Three-state detection — dirty tree (uncommitted edits), unverified experiment (committed but not verified), clean state
  • Automatic recovery on resume — discards uncommitted changes, reverts unverified experiments, resumes normally from clean state
  • Phase 4 safety — cleans up staged changes if git commit itself fails (disk full, hook timeout, permissions)
  • Simplicity override removed — subjective "complexity" and "simpler" judgments eliminated from Phase 6; the metric alone decides

Metric-Valued Guards (PR #66)

  • Threshold-based guardsGuard-Direction + Guard-Threshold params for numeric tolerance instead of binary pass/fail
  • Backward compatible — without the new params, guards operate in pass/fail mode as before
  • guard-metric column — new TSV column tracks guard-metric values for drift visibility over time
  • Plan wizard integration — line-count guard and metric-valued guard options folded into Phase 4.5

Plan Wizard Improvements (PRs #65 + #66)

  • Improved dry-run validation — shows raw output, extracted value, numeric check, and troubleshooting table
  • Common failure fixes table — documents fixes for 85.2%, 342ms, empty output, wrong awk field, multi-line output
  • Guard options expanded — line-count bloat guard + metric-valued threshold guard added to Phase 4.5

Breaking Changes

None. All changes are backward compatible. Existing configs work without modification.

Full Changelog

  • 140eca2 fix: add metric extraction validation to prevent garbage-data decisions
  • be54359 fix: add whitespace trim and macOS grep compatibility
  • ff3eea7 feat: add plateau detection to unbounded mode
  • 0c67f58 fix: add best_iteration tracking and broaden halt condition references
  • 1b17ad0 feat: add session crash recovery, metric-valued guards, remove simplicity override

Don't miss a new autoresearch release

NewReleases is sending notifications on new releases.