github manuelschipper/nah v0.6.0

9 hours ago

Added

  • Codex and Codex companion taxonomy — added agent action types plus Phase 2 classification for Codex CLI and Codex companion commands, including read-only metadata, write/state changes, local/remote agent execution, server startup, and bypass-flag escalation (mold-15)
  • Threat-model coverage audit — added nah audit-threat-model CLI subcommand backed by src/nah/audit_threat_model.py, with module-level rule tests, TestContainerDestructiveCoverage, and TestPackageEscalationCoverage so threat-model claims can be mapped back to concrete pytest coverage and the container/package escalation gaps are exercised explicitly. Output formats: markdown (default), json, summary (mold-8)
  • Playwright MCP browser taxonomy expansion — added 6 new action types: browser_read, browser_interact, browser_state, browser_navigate, browser_exec, and browser_file. Bundled classification now covers both mcp__plugin_playwright_playwright__browser_* and mcp__playwright__browser_* tool names, eliminating prompts for the 58 read/interact/state tools while keeping navigate/exec/file tools on explicit ask paths with browser-specific reasons (mold-10)
  • Container + systemd taxonomy expansion — added 6 new action types: container_read, container_write, container_exec, service_read, service_write, and service_destructive. Full-profile docker/podman coverage now includes logs/inspect/stats/build/exec/compose/service flows, systemctl/journalctl no longer fall through to unknown, minimal profile gains read-only container/service coverage, and sensitive path defaults now cover Docker daemon and systemd config/socket paths (mold-2)
  • Unified LLM mode — merged 4 fragmented LLM entry points into 2 clean paths. Path 1 (ask refinement): combined safety+intent prompt runs in main() for ask decisions, uses user-only transcript and CLAUDE.md for context, can only relax ask→allow. Path 2 (content veto): stays in handlers for write/script inspection, hard-capped to ask. Config simplified to llm.mode: off|on (one switch). LLM can never block — only allow or ask. Session state tracks consecutive denials (3→disable). nah log --llm filter, nah test uses unified path. Backward compat: llm.enabled: true still works. Deprecation warning for removed llm.max_decision (nah-5no)
  • Inline code inspection — python3 -c 'print(1)', node -e, ruby -e, perl -e, php -r inline code is now content-scanned instead of blindly prompting. Safe inline → allow, dangerous patterns → ask/block. LLM veto gate fires on clean inline code (same defense-in-depth as script files). LLM prompt now includes inline code for enrichment (nah-koi.1)
  • Shell init file protection — ~/.bashrc, ~/.zshrc, ~/.bash_profile, ~/.zshenv, ~/.bash_aliases, and 8 more shell init files are now guarded as sensitive paths (ask policy). Prevents silent alias injection persistence. Includes .bashrc.d/ and .zshrc.d/ directories (nah-wdd)
  • Safety list hardening — expanded coverage for credential directories (~/.kube, ~/.docker, ~/.config/az, ~/.config/heroku), sensitive basenames (.pgpass, .boto, terraform.tfvars), exec sinks (lua, R, Rscript, make, julia, swift), and decode-to-exec pipe detection (gzip -d, zcat, bzip2 -d, openssl enc, unzip -p, and more) (nah-brq)
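
The two LLM paths described under "Unified LLM mode" can be pictured as two small clamping functions. The sketch below is illustrative only and is not taken from the nah source: the names Decision, refine_ask, and content_veto are hypothetical, and it assumes a deterministic block is never touched by either path.

```python
from enum import IntEnum


class Decision(IntEnum):
    """Hypothetical decision levels, ordered from least to most strict."""
    ALLOW = 0
    ASK = 1
    BLOCK = 2


def refine_ask(deterministic: Decision, llm_says_allow: bool) -> Decision:
    """Path 1 (ask refinement): only an ask may be relaxed, and only to allow."""
    if deterministic is not Decision.ASK:
        # Allow and block decisions are not eligible for refinement.
        return deterministic
    return Decision.ALLOW if llm_says_allow else Decision.ASK


def content_veto(deterministic: Decision, llm_flags_concern: bool) -> Decision:
    """Path 2 (content veto): an LLM concern escalates to ask, never to block."""
    if llm_flags_concern:
        # max() keeps an existing block (assumption) and lifts allow to ask.
        return max(deterministic, Decision.ASK)
    return deterministic


print(refine_ask(Decision.ASK, llm_says_allow=True).name)
print(content_veto(Decision.ALLOW, llm_flags_concern=True).name)
```

In this model neither function can ever produce a block on its own, which mirrors the "LLM can never block" rule above.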

Removed

  • Beads taxonomy — removed beads_safe, beads_write, and beads_destructive action types plus all bd classify entries and bd dolt start/stop/killall process_signal entries. The beads CLI (bd) is superseded by molds; users who classified molds commands under beads types should reclassify under generic types (filesystem_read, filesystem_write, filesystem_delete).

Changed

  • Public docs readiness — refreshed README and site docs for the current guarded tool surface, LLM configuration/mechanics, database target behavior, safety-list defaults, profile counts, and nah test --tool support.
  • LLM reasoning observability — LLM responses now carry both a short prompt-safe reasoning summary and a longer reasoning_long explanation for logs and nah test, while Claude-visible prompts continue to use the compact summary.
  • Write/Edit LLM review mechanics — Write/Edit, MultiEdit, and NotebookEdit LLM handling can now relax eligible project-boundary asks to allow when the edit is narrow, safe, and clearly intended, while still escalating risky deterministic allows to ask and keeping sensitive/config/content-pattern asks human-gated (nah-858)
  • LLM eligibility presets — llm.eligible: strict preserves the old conservative default, default now includes unknown, lang_exec, non-sensitive context, package_uninstall, container_exec, and browser_exec, and all remains the opt-in route for every ask decision. Classified fallback/MCP tools now include stage metadata so taxonomy eligibility applies consistently (nah-856)
  • GitHub Actions now publishes a non-gating threat-model coverage report to the job summary after the main pytest run, so PRs show per-category audit counts without changing the enforcement gate (pytest tests/) (mold-8)
  • Docker and podman read-only inspection commands like ps, images, logs, inspect, and compose read ops now classify as container_read instead of filesystem_read. Default behavior stays allow; logs and nah types now use the container-specific action type.
  • Transcript-derived LLM context now reformats slash-command skill invocations, labels Claude Code skill meta blocks as Skill expansion, deduplicates repeated expansions by skill name, and caps each captured skill body to 2048 chars (mold-3)
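
The skill-expansion handling in the last item (deduplicate by skill name, cap each body at 2048 chars) can be sketched roughly as follows. This is a guess at the shape of the logic, not the project's code: dedupe_and_cap and the (name, body) tuple layout are hypothetical, and keeping the first occurrence of a repeated skill is an assumption.

```python
MAX_SKILL_BODY_CHARS = 2048  # cap stated in the release notes


def dedupe_and_cap(expansions: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Keep one expansion per skill name and truncate each body.

    Assumption: the first occurrence of a skill wins; later repeats are dropped.
    """
    seen: set[str] = set()
    result: list[tuple[str, str]] = []
    for name, body in expansions:
        if name in seen:
            continue  # repeated expansion of the same skill
        seen.add(name)
        result.append((name, body[:MAX_SKILL_BODY_CHARS]))
    return result


sample = [("review", "x" * 5000), ("deploy", "short body"), ("review", "again")]
for name, body in dedupe_and_cap(sample):
    print(name, len(body))
```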

Fixed

  • Codex companion script variables — same-command discovery patterns like CODEX_SCRIPT=$(ls ~/.claude/plugins/cache/openai-codex/codex/*/scripts/codex-companion.mjs | head -1) && node "$CODEX_SCRIPT" ... now classify as Codex companion delegation instead of generic missing-script lang_exec asks (nah-859)
  • Benign export NAME=value assignments — export PATH=/opt/bin:$PATH and similar assignment-only shell stages now classify as benign environment setup instead of unknown, while exec-sink values, substitutions, redirects, and non-assignment export forms still take the stricter existing paths (nah-862)
  • Shell source classification — source <file> and POSIX . <file> now classify as lang_exec and use the existing script path/content inspection path instead of falling through to unknown (nah-860)
  • Subshell group parsing — parenthesized command groups such as cmd || (brew list ...; ls ...) 2>&1 now classify by their inner commands, preserve group redirects, fail closed for grouped pipes, and no longer suggest invalid nah classify (cmd <type> hints (nah-861)
  • Sudo wrapper classification — sudo-wrapped Bash commands now unwrap to the inner action type with a sudo: reason prefix, preserving targeted hints, redirect/content inspection, trust_project passthrough behavior, composition rules, and fail-closed parsing for unsupported or malformed sudo options (mold-12)
  • Heredoc apostrophes inside $() no longer false-block as "unbalanced substitution" — _match_parens and _extract_substitutions now recognize <<EOF heredoc operators (and <<-EOF, <<'EOF', <<"EOF" variants) and skip past their bodies as opaque literal content. A new _strip_heredoc_bodies helper removes heredoc bodies before shlex.split so the inner stage is shlex-friendly even when the body contains unbalanced apostrophes, backticks, or parens. This unblocks the Claude Code git-commit pattern git commit -m "$(cat <<EOF\n…can't…\nEOF\n)" which was previously hard-blocked any time the commit body contained a contraction (mold-9)
  • lang_exec veto silently ignored — when the LLM flagged a script as dangerous, max_decision cap converted block→ask, then the veto check (== block) failed, silently allowing the script. Now escalates to ask unconditionally when the LLM flags concern (nah-5no)
  • LLM decision always empty in logs — _build_llm_meta() never set the llm_decision field, so every log entry had "decision": "" in the llm block. Now populated from the actual LLM response (nah-5no)
  • SSH-style host extraction now covers rsync and ssh-copy-id — rsync user@host:path and rsync host::module now resolve the remote host correctly for network context, and ssh-copy-id is classified as network_outbound with SSH host extraction instead of falling through to unknown or malformed URL parsing (nah-vcz)
  • Heredoc input classification — python3 << 'EOF' ... EOF no longer produces "script not found" errors. Heredoc-fed interpreters are now classified as lang_exec with content scanning via heredoc_literal. Semicolons, pipes, and && inside heredoc bodies no longer cause false stage splits. Works for all interpreters: python3, node, ruby, perl, php (nah-dhs)
  • Shell comment parsing — # comment lines with apostrophes (e.g. # Check if there's a fix) no longer cause shlex parse errors. Layer 1 skips quote tracking inside comments in _split_on_operators; Layer 2 retries shlex.split with comments=True on ValueError. Pure-comment commands correctly classify as empty/allow (nah-2zt)
  • LLM cascade failure no longer overrides deterministic allow — when all LLM providers fail (missing API keys, network errors), Write/Edit now returns the deterministic decision instead of escalating to ask. Previously every edit prompted for confirmation even when content was safe and path was trusted. Cascade metadata preserved in logs for debugging (nah-yt9)
  • LLM observability for write-like tools — LLM metadata (provider, model, latency, reasoning) now always logged for Write/Edit/NotebookEdit/MultiEdit, even when LLM agrees with the deterministic decision or all providers fail. Missing API keys now logged to stderr (nah: LLM: OPENROUTER_API_KEY not set) and to the structured log with provider: (none) and cascade errors. Previously missing keys caused silent 34ms "uncertain — human review needed" with no trace of why
  • String-content transcript messages are no longer dropped, so slash-command invocations and other non-list transcript entries now reach the LLM context formatter (mold-3)
  • LLM transcript tail reads no longer lose all context on giant JSONL lines — _read_transcript_tail() now walks backward from EOF in newline-aligned chunks with a safety cap, so large tool_result lines no longer consume the entire read window and produce (not available) conversation context in LLM prompts (mold-27)
  • Inspectable wrapper execution no longer slips through package_run — uv run, uvx / uv tool run, npx, and npm exec now re-route inspectable local code execution into lang_exec, while make / gmake execution paths also route to lang_exec via Makefile resolution. Read-only make forms remain filesystem_read, and ordinary package-run fallthroughs stay unchanged (nah-vhy)
  • Env-only shell stages no longer default to unknown -> ask — stages made entirely of NAME=value assignments now classify from an allow floor unless an env value is itself an exec sink or a substitution inner is stricter, so benign cases like TOKEN=abc123 and FOO=$(printf ok) no longer prompt spuriously (mold-17)
  • npm create no longer falls through to unknown -> ask — npm create ... is now classified as package_run, matching the existing pnpm create, yarn create, and bun create scaffolding behavior so common forms like npm create vite@latest no longer prompt unnecessarily (mold-4)
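
The "Layer 2" retry from the shell comment parsing fix (nah-2zt) relies on standard-library behavior that is easy to reproduce. The snippet below is a minimal sketch of that idea using only Python's shlex module; split_tolerating_comments is a hypothetical name, and the Layer 1 operator splitting is not modeled here.

```python
import shlex


def split_tolerating_comments(command: str) -> list[str]:
    """Tokenize a shell command, retrying with comment handling on failure.

    By default shlex.split() treats '#' as ordinary text, so an apostrophe
    inside a comment looks like an unterminated quote and raises ValueError.
    """
    try:
        return shlex.split(command)
    except ValueError:
        # Retry with '#' recognized as a comment starter. This can still
        # raise if the unbalanced quote sits outside any comment.
        return shlex.split(command, comments=True)


print(split_tolerating_comments("# Check if there's a fix"))
print(split_tolerating_comments("echo hi # there's more"))
```

A pure-comment command tokenizes to an empty list, which is consistent with the "empty/allow" classification mentioned in the entry.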
