github nyldn/claude-octopus v4.7.0
v4.7.0 - Adversarial Cross-Model Review

3 months ago

🤼 Crossfire: Adversarial Cross-Model Review

This release introduces two new commands, grapple and squeeze, that use adversarial AI-vs-AI review to catch more issues than single-model review. Different models have different blind spots, so Crossfire forces them to critique each other.

New Commands

grapple - Adversarial Debate

Codex vs Gemini wrestling match until consensus:

./scripts/orchestrate.sh grapple "implement password reset API"
./scripts/orchestrate.sh grapple --principles security "implement JWT auth"

Flow:

  1. Round 1: Both models propose solutions independently
  2. Round 2: Cross-critique (Gemini critiques Codex, Codex critiques Gemini)
  3. Round 3: Synthesis determines winner and final implementation
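
The three rounds above can be sketched as plain shell control flow. This is a minimal illustration, not the logic in orchestrate.sh: ask_model is a hypothetical stub that records a call instead of contacting Codex or Gemini, and the "synthesizer" role in Round 3 is an assumption, since these notes do not say which model performs the synthesis.

```shell
#!/bin/sh
# Hypothetical stub: a real implementation would call a model CLI or API here.
# It counts calls and stores a placeholder reply in LAST_REPLY.
CALLS=0
ask_model() {
    CALLS=$((CALLS + 1))
    LAST_REPLY="<$1: $2>"
}

grapple_sketch() {
    task=$1

    # Round 1: each model proposes a solution without seeing the other's.
    ask_model codex "proposal for '$task'";  codex_plan=$LAST_REPLY
    ask_model gemini "proposal for '$task'"; gemini_plan=$LAST_REPLY

    # Round 2: each model critiques the other's proposal.
    ask_model gemini "critique of $codex_plan"; gemini_critique=$LAST_REPLY
    ask_model codex "critique of $gemini_plan"; codex_critique=$LAST_REPLY

    # Round 3: one synthesis call weighs both critiques (role name is assumed).
    ask_model synthesizer "verdict on $gemini_critique and $codex_critique"
    printf '%s\n' "$LAST_REPLY"
}

grapple_sketch "implement password reset API"
printf 'agent calls: %s\n' "$CALLS"   # prints "agent calls: 5"
```

Under these assumptions the sketch makes five calls (two proposals, two critiques, one synthesis), which lines up with the five agent calls listed for grapple in the cost estimate.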

squeeze - Red Team Security Review

Blue Team defends, Red Team attacks:

./scripts/orchestrate.sh squeeze "review auth.ts for vulnerabilities"

Flow:

  1. Blue Team (Codex): Implements secure solution
  2. Red Team (Gemini): Finds vulnerabilities with exploit proofs
  3. Remediation: Fixes all found issues
  4. Validation: Verifies all vulnerabilities are fixed
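
The four phases above can be read as a pipeline in which findings flow from the Red Team through remediation into validation. The sketch below is only an illustration of that shape: every function is a hypothetical stub, the finding IDs are made up, and the "open"/"fixed" status words are an assumed convention rather than anything orchestrate.sh is known to emit.

```shell
#!/bin/sh
# All four phases are hypothetical stubs; real phases would invoke Codex and Gemini.

blue_team() {
    # Phase 1: produce an implementation (here, just a label).
    printf 'implementation of %s\n' "$1"
}

red_team() {
    # Phase 2: attack the implementation in $1 and report findings, one per line.
    # The argument is ignored by this stub and the IDs are placeholders.
    printf '%s\n' 'finding-1 open' 'finding-2 open'
}

remediate() {
    # Phase 3: mark every finding read from stdin as fixed.
    sed 's/ open$/ fixed/'
}

validate() {
    # Phase 4: succeed only if no finding on stdin is still open.
    ! grep -q ' open$'
}

squeeze_sketch() {
    impl=$(blue_team "$1")
    red_team "$impl" | remediate | validate
}

if squeeze_sketch "review auth.ts for vulnerabilities"; then
    echo "all findings fixed"
else
    echo "open findings remain"
fi
```

The point of the final phase is that remediation is not trusted on its own: the run only counts as successful when validation sees no open findings.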

Constitutional Principles

Grapple supports domain-specific critique principles via --principles:

Principle        Focus
general          Overall code quality (default)
security         OWASP Top 10, secure coding
performance      N+1 queries, caching, async I/O
maintainability  Clean code, testability, SOLID
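
One way to picture how a --principles value could select a critique focus is a simple lookup. The helper below is hypothetical (principle_focus is not a function from orchestrate.sh), and rejecting unknown values with an error is an assumption; only the four names, their focus text, and the general default come from the table above.

```shell
#!/bin/sh
# Map a --principles value to its focus text.
# Hypothetical helper: the real script may store or phrase these differently.
principle_focus() {
    case ${1:-general} in
        general)         echo "Overall code quality" ;;
        security)        echo "OWASP Top 10, secure coding" ;;
        performance)     echo "N+1 queries, caching, async I/O" ;;
        maintainability) echo "Clean code, testability, SOLID" ;;
        *)               echo "unknown principle: $1" >&2; return 1 ;;
    esac
}

principle_focus security   # prints "OWASP Top 10, secure coding"
principle_focus            # no argument falls back to general
```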

Auto-Routing

The auto command now detects crossfire intents:

./scripts/orchestrate.sh auto "security audit the auth module"  # → squeeze
./scripts/orchestrate.sh auto "have both models debate the design"  # → grapple
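
These notes do not describe how intents are detected, so the following is only a sketch of one plausible approach: case-insensitive keyword matching. The function name route_intent and every keyword pattern are assumptions chosen to reproduce the two examples above, not the rules used by the auto command.

```shell
#!/bin/sh
# Hypothetical keyword router; the patterns are illustrative guesses.
route_intent() {
    # Lowercase the prompt so matching is case-insensitive.
    prompt=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case $prompt in
        *"security audit"*|*"red team"*|*vulnerabilit*) echo squeeze ;;
        *debate*|*"both models"*)                       echo grapple ;;
        *)                                              echo other ;;
    esac
}

route_intent "security audit the auth module"       # prints "squeeze"
route_intent "have both models debate the design"   # prints "grapple"
```

Because case takes the first matching branch, a prompt that mentions both a security audit and a debate would route to squeeze in this sketch; a real router would need an explicit rule for such overlaps.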

Cost Estimate

Command  Agent Calls  Estimated Cost
grapple  5            ~$0.15-0.30
squeeze  4            ~$0.12-0.25

Both are more expensive than a single-agent review but catch 2-3x more issues.
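
To compare the two commands on a per-call basis, the estimates above can be divided by the number of agent calls. The helper name per_call is hypothetical; the inputs are the figures from the cost estimate, and the result is plain arithmetic rather than measured pricing.

```shell
#!/bin/sh
# Divide an estimated cost range by the number of agent calls.
per_call() {
    # $1 = agent calls, $2 = low estimate in USD, $3 = high estimate in USD
    awk -v n="$1" -v lo="$2" -v hi="$3" \
        'BEGIN { printf "%.4f-%.4f\n", lo / n, hi / n }'
}

per_call 5 0.15 0.30   # grapple: prints "0.0300-0.0600"
per_call 4 0.12 0.25   # squeeze: prints "0.0300-0.0625"
```

Both commands work out to roughly $0.03-0.06 per agent call, so the difference in total cost comes mainly from grapple making one more call.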


Full Changelog: v4.6.0...v4.7.0
