manuelschipper/nah v0.10.0 on GitHub

Removed

The optional LLM layer is reduced to a single classify-unknown job (nah-1010).
Removed the LLM ask-refinement / Layer-2 intent relaxer (the cite-or-ask
ask → allow path, its tiered risk veto, and every per-action relax opt-in),
the visible inline lang_exec LLM review, the transcript-reading prompt context,
and the llm.eligible / llm.deny_limit / llm_risks.py machinery. The optional
LLM (still off by default; llm.mode: on) can no longer relax a known ask,
review inline code, or read your conversation — it only classifies unknowns
(see Added). Claude and Codex share this one path.
Removed the LLM write content-review gate (nah-997) that inspected
Write/Edit/MultiEdit/NotebookEdit and Codex apply_patch payloads as data-at-rest
and could escalate a clean allow to ask. Write-like tools are now guarded by
the deterministic floor only — sensitive-path block, project-boundary, and
destructive-patch checks — which is cheap, clear, and unchanged.
Removed the session taint tracking and provenance features entirely
(src/nah/taint.py, src/nah/provenance.py) along with all runtime wiring
(Claude hook.py, Codex codex_hooks.py/codex_run.py, terminal guard), the
taint/provenance config surface, the LLM provenance-review path, and the
log/message rendering and docs (nah-1009). Both were opt-in and off by default,
so removal changes nothing for users who never enabled them; the deterministic
classifier, LLM classify-unknown path, and the 43 action types are unchanged.
The non-headless Codex PreToolUse hook is now fully inert (its only job was
tracking taint state); enforcement still happens at PermissionRequest.
Removed deterministic secret-looking and credential-path content scanning, along with
secret redaction on LLM prompt/transcript context and local post-tool error summaries.
Secret protection now relies on structural controls such as sensitive paths,
credential-search detection, and explicit secret-store/env reads
rather than guessing token-shaped text in write payloads (nah-1006).
Removed the /nah-demo Claude Code showcase and its curated cases
(src/nah/demo_cases.py, src/nah/data/nah_demo.json, the .claude/commands/nah-demo.md
slash command, and tests/test_nah_demo.py). It was a product demo, not part of the
guard or the regression suite; pytest remains the coverage source and
nah audit-threat-model the coverage report.

Added

Optional LLM classify-unknown (nah-982, nah-994). When the deterministic
classifier returns unknown for a Bash command, the optional LLM (still off by
default; llm.mode: on) maps it to a built-in action type and the kind-tagged
targets it touches. The mapped type re-enters the normal policy machinery and
each surfaced path/host target is re-checked through the same deterministic
floor (sensitive paths, project boundary, known hosts): the LLM extracts, the
floor matches. db/container targets have no floor that can be mirrored
faithfully (the real db/container floors are policy-, cwd-, and exec-specific),
so they stay unverifiable and the mapped type's policy decides: safe reads under
an allow policy clear, while execs under a context policy ask (nah-994). A read
of ~/.ssh is never auto-allowed; an unverifiable target falls back to ask; an
obfuscated unknown can tighten to block. The pass is fail-closed,
process-cached, and command-only (no transcript). Log entries record the
classify pass in entry["llm"] alongside a top-level action_type_source
(deterministic|llm_classify), and nah log gains a --classified filter;
nah test shows the classification and per-target floor verdicts.
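The "LLM extracts, floor matches" split can be pictured with a small sketch. This is not nah's implementation: the Target class, the floor_verdict and decide helpers, and the sample sensitive directories, hosts, and project root are all hypothetical, and the context policy is simplified to "ask".

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Hypothetical floor data; nah's real lists come from its config.
SENSITIVE_DIRS = ("~/.ssh", "~/.aws")
KNOWN_HOSTS = frozenset({"github.com", "pypi.org"})
PROJECT_ROOT = PurePosixPath("/work/repo")


@dataclass(frozen=True)
class Target:
    kind: str   # "path", "host", "db", or "container"
    value: str  # the target text the model surfaced


def floor_verdict(target: Target) -> str:
    """Return "ok", "ask", or "unverifiable" for one surfaced target."""
    if target.kind == "path":
        if any(target.value == d or target.value.startswith(d + "/")
               for d in SENSITIVE_DIRS):
            return "ask"  # sensitive path: never auto-allowed
        path = PurePosixPath(target.value)
        inside = (
            ".." not in path.parts
            and (path == PROJECT_ROOT or PROJECT_ROOT in path.parents)
        )
        return "ok" if inside else "ask"  # project-boundary check
    if target.kind == "host":
        return "ok" if target.value in KNOWN_HOSTS else "ask"
    # db/container: no floor to mirror, so the verdict stays unverifiable.
    return "unverifiable"


def decide(mapped_policy: str, targets: list[Target]) -> str:
    """Combine the mapped type's policy with the per-target verdicts."""
    verdicts = [floor_verdict(t) for t in targets]
    if "ask" in verdicts:
        return "ask"  # the floor overrides whatever type the model chose
    # Nothing tripped the floor, so the mapped type's policy decides.
    # Simplification: anything other than "allow" (e.g. "context") asks.
    return "allow" if mapped_policy == "allow" else "ask"
```

Under these assumptions, decide("allow", [Target("path", "~/.ssh/id_rsa")]) asks even though the mapped type's policy is allow, while a db target under an allow policy clears.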
Flag-aware env_read classification for shell builtins, ps, and caddy fmt (nah-1005).
Follow-up to nah-1004 covering the cases a static prefix table can't express because the
safe and unsafe forms are the same command split by flags:
- bare env (no inner command), bare set, and bare/-p export/declare/typeset →
  env_read (ask), while their assignment, option (set -x), and exec-wrapper
  (env FOO=bar cmd) forms keep their existing classification.
- ps with the BSD environment modifier (ps e, ps eww, ps auxe) → env_read, while
  SysV ps -e/-ef (all processes) and value-flag forms (ps -u <user>,
  ps -o pid,etime) correctly stay filesystem_read; the classifier is
  value-flag-aware to avoid false positives.
- caddy fmt --overwrite → filesystem_write; bare caddy fmt → filesystem_read.
- Removes the now-redundant static export -p/declare -p/typeset -p entries from the
  env_read table (the builtin classifier owns them).
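To illustrate why ps needs flag-aware handling, here is a rough sketch of the distinction. It is not nah's classifier: classify_ps and PS_VALUE_FLAGS are hypothetical names, and the set of value-taking flags is an assumption based on common procps options.

```python
import shlex

# Assumption: SysV-style flags that consume the next token as their value.
PS_VALUE_FLAGS = frozenset({"-u", "-U", "-o", "-p", "-g", "-G", "-t", "-C"})


def classify_ps(command: str) -> str:
    """Classify a ps invocation as env_read or filesystem_read."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] != "ps":
        return "unknown"
    skip_next = False
    for tok in tokens[1:]:
        if skip_next:
            # This token is a flag's value (a user name, a format list),
            # so any "e" inside it is not the environment modifier.
            skip_next = False
            continue
        if tok.startswith("-"):
            # Dash-prefixed options such as -e or -ef do not print
            # the environment.
            skip_next = tok in PS_VALUE_FLAGS
            continue
        # A bare alphabetic cluster is BSD syntax; "e" adds the environment.
        if tok.isalpha() and "e" in tok:
            return "env_read"
    return "filesystem_read"
```

Skipping the value token is the "value-flag-aware" part: without it, ps -o pid,etime or ps -u deploy would be misread as carrying the BSD e modifier.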
service_inspect and env_read action types; service_read narrowed to remote (nah-1004).
service_read was overloaded: its static table was 100% local daemon inspection
(systemctl status, journalctl) while every remote API read (curl GET, gRPC,
GraphQL) was classified dynamically, so its single context policy fit only the
remote half and the audit label ("remote API state") was wrong for the local half.
- service_inspect (policy allow) is the honest home for local service/daemon
  inspection — the systemd entries move here, joined by caddy version/list-modules,
  launchctl list/print, sc query/queryex/qc, rc-status/rc-service -l, and
  service --status-all. It is deliberately kept out of the data-egress
  boundary (local inspection is not network egress).
- env_read (policy ask) is the honest home for commands whose purpose is
  exposing environment or secret values — printenv, caddy environ,
  systemctl show-environment, export -p/declare -p/typeset -p, and secret-store
  reads (vault read/kv get, aws secretsmanager get-secret-value,
  aws ssm get-parameter, gcloud secrets versions access, az keyvault secret show,
  kubectl get/describe secret, pass show, op read/item get, bw get,
  heroku config, doppler secrets, infisical secrets, chamber read/export,
  sops -d). These were previously unknown → ask, which lied in the audit log and
  fired a wasted LLM classify on every invocation. systemctl show-environment moves
  from a silent service_read → allow to an honest env_read → ask. Name-only listers
  (gh secret list, etc.) are intentionally excluded; secret-injecting exec wrappers
  (op run, doppler run, aws-vault exec) stay on the exec path. Flag-dependent
  forms (bare env/set/export, ps env-flags, caddy fmt --overwrite) are
  deferred to a follow-up (nah-1005). Also classifies crontab -l and caddy validate
  as filesystem_read.
talosctl global flag stripping before subcommand classification. Commands such
as talosctl -n <ip> get routes and talosctl --nodes=<ip> dmesg, and other
talosctl commands that carry connection global flags (-n/--nodes,
-e/--endpoints, -c/--cluster, --context, --talosconfig), now have those flags
stripped before the global-table prefix match instead of falling through to
unknown. This mirrors the kubectl/flux idiom and fails closed: unknown or
malformed pre-subcommand flags stay on the unknown ask path, and dangerous
subcommands such as talosctl reboot/talosctl reset still classify as configured.
Closes #86; PR #89 by @srgvg.
flux global flag stripping before subcommand classification. Commands such as
flux -n <ns> get kustomizations and flux --namespace=<ns> list, and other flux
commands that carry kubeconfig-style global flags (-n/--namespace, --context,
--kubeconfig, --timeout, --token, ...), now have those flags stripped before the
global-table prefix match instead of falling through to unknown. This mirrors
the kubectl/talosctl idiom and fails closed: unknown or malformed pre-subcommand
flags stay on the unknown ask path, and destructive subcommands such as
flux delete/flux uninstall still classify as configured. Closes #87; PR #90 by
@srgvg.
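The strip-then-match idiom shared by the talosctl and flux entries could look roughly like the sketch below. The strip_global_flags helper is hypothetical, and the sketch assumes every listed talosctl flag takes a value, either as the next token or after "=".

```python
import shlex
from typing import Optional

# Flags named in the talosctl entry; assumed here to all take a value.
TALOSCTL_VALUE_FLAGS = frozenset({
    "-n", "--nodes", "-e", "--endpoints", "-c", "--cluster",
    "--context", "--talosconfig",
})


def strip_global_flags(tokens: list[str],
                       value_flags: frozenset) -> Optional[list[str]]:
    """Drop known global flags that precede the subcommand.

    Returns the normalized tokens, or None when the prefix is unknown or
    malformed, so the caller stays on the unknown (ask) path.
    """
    if not tokens:
        return None
    i = 1
    while i < len(tokens):
        tok = tokens[i]
        if not tok.startswith("-"):
            # Subcommand reached: keep it and everything after it untouched.
            return [tokens[0]] + tokens[i:]
        name, sep, value = tok.partition("=")
        if name not in value_flags:
            return None  # unknown flag: fail closed
        if sep:
            if not value:
                return None  # "--nodes=" with no value is malformed
            i += 1
        else:
            if i + 1 >= len(tokens) or tokens[i + 1].startswith("-"):
                return None  # flag is missing its value
            i += 2
    return None  # only flags, no subcommand to classify


normalized = strip_global_flags(
    shlex.split("talosctl -n 10.0.0.1 get routes"), TALOSCTL_VALUE_FLAGS
)
```

Here normalized is ["talosctl", "get", "routes"], which a prefix table can then match. A dangerous subcommand survives the stripping unchanged, so talosctl -n 10.0.0.1 reboot still reaches whatever policy is configured for talosctl reboot.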
Codex hook-timeout probe — nah run codex --probe[=DELAY] arms a
debug-only stall in nah's Codex hooks (gated behind NAH_HOOK_PROBE, capped
at 60s, verdict unchanged) so you can observe the timeout Codex actually
enforces. nah run codex --measure-hook-timeout drives Codex with the probe
and reports enforced-vs-configured timeouts, defaulting to PostToolUse (the
only event that both fires and is enforced under headless codex exec).
Documented in the CLI reference.

Changed

Terminal Guard is deterministic-only (nah-985). The interactive bash/zsh
terminal guard has no LLM step. A command you type directly into your shell is
already your own intent, so there is no agent transcript to mine — the guard
classifies to allow / ask / block and an ask is confirmed inline at the
prompt. The shared llm.mode and targets.bash.llm.mode / targets.zsh.llm.mode
knobs are still accepted for backward compatibility but no longer affect terminal
decisions.
Container write taxonomy split by verifiable risk axis (nah-996).
container_write is replaced by container_lifecycle and
container_build. Lifecycle operations that act on named containers
(docker stop api, podman restart worker) are context policy and use
trusted_containers: every flag-free identity must be trusted, while flags,
dynamic names, and compose lifecycle commands ask. Build/image/infra commands
(docker build, docker compose build, docker network create) are
container_build with default allow and no cwd gate; autonomous presets can
tighten it with actions: {container_build: block}. A legacy
container_write entry under actions: fans out to both new types, one under
classify: maps to the more conservative container_lifecycle, and the interactive
allow/deny/classify/forget commands now ask users to choose one of the new types.
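The trusted_containers rule for lifecycle commands can be sketched as follows. This is an illustration only: decide_lifecycle is a hypothetical helper, the check for dynamic names ("$" or a backtick in the token) is an assumption, and the sketch only handles the docker/podman <verb> <names...> shape.

```python
import shlex


def decide_lifecycle(command: str, trusted: frozenset) -> str:
    """Return "allow" or "ask" for a container lifecycle command."""
    tokens = shlex.split(command)
    if len(tokens) < 3 or tokens[0] not in {"docker", "podman"}:
        return "ask"
    if tokens[1] == "compose":
        return "ask"  # compose lifecycle commands always ask
    for name in tokens[2:]:
        if name.startswith("-"):
            return "ask"  # any flag makes the target set unverifiable
        if "$" in name or "`" in name:
            return "ask"  # dynamic name: its value is unknown here
        if name not in trusted:
            return "ask"  # every named container must be trusted
    return "allow"
```

With trusted = frozenset({"api", "worker"}), docker stop api and podman restart worker are allowed, while docker stop api db, docker stop -t 5 api, and docker compose down all ask.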
Database taxonomy gates SQL-exec capability, not SQL intent (nah-995).
Replaces db_read/db_write with db_safe/db_exec: structurally-safe
database surfaces such as dolt log/status/diff/branch and Supabase list/get
tools are db_safe (allow), while tools that can run caller-supplied SQL
are db_exec (context) and continue to use db_targets for target-scoped
allow. The old db_read/db_write config names are accepted as deprecated
aliases and canonicalized with a one-time warning. The previous
sqlite3 -readonly and PGOPTIONS/psql -X read-only special cases are
removed; those invocations now classify as db_exec and ask unless
db_targets allows the target.
Layer 1 classifies into built-in types only (nah-992). The classify-unknown
pass is not offered the user's custom action types — it maps into the built-in
taxonomy only. This stops the model from collapsing a whole unknown compound into
a trusted custom allow type (e.g. a cd repo && molds … && molds wontdo …
block landing on a custom molds_safe → allow). A custom type the model names
anyway is coerced to unknown, so the deterministic ask stands.
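The coercion step amounts to a membership check against the built-in taxonomy. In this sketch coerce_llm_type is a hypothetical name, and BUILTIN_TYPES holds only a handful of the types mentioned in these notes rather than the full list.

```python
# A subset of built-in action types named in these notes; the full
# taxonomy is larger and is not reproduced here.
BUILTIN_TYPES = frozenset({
    "filesystem_read", "filesystem_write", "env_read", "service_inspect",
    "db_safe", "db_exec", "container_lifecycle", "container_build",
})


def coerce_llm_type(named_type: str) -> str:
    """Accept only built-in types from the model; anything else is unknown."""
    return named_type if named_type in BUILTIN_TYPES else "unknown"
```

So a model that answers molds_safe gets "unknown" back, and the deterministic ask for unknown commands stays in force.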
Codex lifecycle commands normalized to nah <command> codex (nah-960).
nah status codex (read-only preflight), a new top-level nah setup codex,
and nah uninstall codex now match the install/status shape used by every
other runtime; nah run codex is unchanged. Breaking: the old
nah codex doctor / nah codex setup / nah codex remove-setup subcommands
are removed (no aliases) and exit nonzero — use nah status codex /
nah setup codex / nah uninstall codex instead. nah status codex also
fixes a silent no-op (it used to parse and exit 0 with no output) and is
strictly read-only: it reports missing or stale rules and exits nonzero
without creating them. nah doctor codex and nah doctor claude now point to
nah status …. The hook-timeout probe moved from nah codex measure-hook-timeout
to the nah run codex --measure-hook-timeout debug mode.

Fixed

nah test dry-runs no longer self-flag on sensitive paths in their arguments
(nah-qb3). A nah test invocation whose arguments named a sensitive path as a
bareword or flag value (e.g. nah test --tool Read ~/.ssh/id_rsa) was flagged by
nah's own hook as a real sensitive access and paused for approval, even though
nah test is a pure dry-run classifier with no filesystem or execution side
effects. The _classify_nah_cli classifier now recognizes nah test and allows
it without scanning its argument tokens for sensitive paths. Output redirections
(caught by the redirect guard) and command/process substitutions (classified
independently upstream) stay guarded, and the exemption is exact-match and
stage-local, so adjacent stages like nah test foo && rm -rf ~/.ssh are unaffected.