github yvgude/lean-ctx v3.9.0

one hour ago

Changed

  • The shell hook is now transparent in plain human terminals: default
    activation is agents-only (GH #699).
    With the old "always" default, the
    hook aliased git/docker/kubectl in every interactive shell — so a human in
    a plain terminal (no agent anywhere) saw lean-ctx allowlist diagnostics for
    their own commands. lean-ctx exists to save agent tokens; the aliases now
    auto-activate only when an agent session is detected (LEAN_CTX_AGENT,
    CURSOR_AGENT — newly recognized across every guard — CLAUDECODE,
    CODEBUDDY, CODEX_CLI_SESSION, GEMINI_SESSION). Set
    shell_activation = "always" (or LEAN_CTX_SHELL_ACTIVATION=always) to
    keep the old behavior, e.g. to feed your own shell usage into
    lean-ctx wrapped; lean-ctx-on still opts a single session in manually.
    The "[CLI] Command would be blocked in MCP mode" allowlist diagnostic is
    also downgraded to debug level for interactive TTY callers — it's agent
    telemetry, not human feedback. Thanks @DerPate for the precise report.
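As a rough illustration of the agents-only activation described above, the check can be sketched as follows. The environment variable names come from this entry; the "agents-only" mode string, the "any non-empty value counts" rule and the env-over-config precedence are assumptions of the sketch, not confirmed lean-ctx behavior.

```python
# Variable names are the ones listed in the release note.
AGENT_ENV_VARS = (
    "LEAN_CTX_AGENT",
    "CURSOR_AGENT",
    "CLAUDECODE",
    "CODEBUDDY",
    "CODEX_CLI_SESSION",
    "GEMINI_SESSION",
)


def agent_session_detected(env):
    """Return True when any known agent marker is set to a non-empty value."""
    # Assumption: any non-empty value counts as "agent present".
    return any(env.get(name) for name in AGENT_ENV_VARS)


def aliases_active(env, shell_activation="agents-only"):
    """Decide whether the git/docker/kubectl aliases should be installed."""
    # Assumption: the environment override wins over the config value.
    mode = env.get("LEAN_CTX_SHELL_ACTIVATION", shell_activation)
    if mode == "always":
        return True
    return agent_session_detected(env)
```

Calling `aliases_active(dict(os.environ))` from a plain terminal would return False under these assumptions, which is the "transparent for humans" behavior the entry describes.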

Added

  • /v1/compress is wire-compatible with LiteLLM's prompt-compression
    guardrail (GH #700).
    LiteLLM ≥ v1.92 can call a compression sidecar during
    pre_call (guardrail: headroom); the response now carries the
    tokens_before / tokens_after / compression_ratio telemetry fields that
    guardrail logs, alongside the existing richer stats block. Point the
    guardrail's api_base at the lean-ctx daemon and every request through a
    LiteLLM gateway is compressed deterministically (prompt-cache-safe, #498) —
    no client change, including Claude Code via ANTHROPIC_BASE_URL. Cookbook:
    docs/guides/compress-sdk.md.
  • Provider-verified savings receipts (GH #701, opt-in
    proxy.counterfactual_metering).
    Until now, wire savings were only estimated (bytes/4 or
    local tokenizer). With metering on, every request the proxy rewrites also
    fires Anthropic's free count_tokens endpoint with the original,
    uncompressed body — concurrently with the real forward, spawned detached so
    it can never delay, mutate or fail the request — and pairs the
    provider-counted "would have billed N" with the same response's actually
    billed usage. Same request, same moment: no traffic-mix confound
    (methodology adopted from pxpipe's counterfactual metering). /status gains
    a verified_savings block and lean-ctx proxy status a Verified: line
    beside the estimate; per-model pairs persist across restarts in
    proxy_usage.json (pre-#701 files load unchanged). Net-negative results are
    reported signed, never clamped. Anthropic only (no free counting endpoint
    elsewhere); probe failures silently degrade the row to the estimate.
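The pairing arithmetic behind a verified-savings figure can be sketched like this. The class and field names are hypothetical and do not reflect the actual proxy_usage.json schema; the one property taken from the entry is that results stay signed rather than clamped at zero.

```python
from dataclasses import dataclass


@dataclass
class MeteredPair:
    would_have_billed: int  # provider-counted tokens for the uncompressed body
    actually_billed: int    # tokens billed for the rewritten request

    @property
    def saved_tokens(self):
        # Signed on purpose: a negative value means the rewrite cost tokens.
        return self.would_have_billed - self.actually_billed


def verified_savings(pairs):
    """Sum signed savings and express them as a share of the counterfactual."""
    baseline = sum(p.would_have_billed for p in pairs)
    saved = sum(p.saved_tokens for p in pairs)
    share = saved / baseline if baseline else 0.0
    return {"saved_tokens": saved, "baseline_tokens": baseline, "share": share}
```

Because both numbers in a pair come from the same request, a net-negative pair simply lowers the total instead of being hidden.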
  • CCR round-trips through LiteLLM's agentic loop (GH #702). A lossy
    /v1/compress rewrite now advertises its retrieval hash in the guardrail's
    regex-locked hash=<24-hex> form, and the new GET /v1/retrieve/{hash}
    endpoint resolves it from the content-addressed tee store
    ({"original_content": …}). LiteLLM (BerriAI/litellm#31681) injects its
    retrieve tool on seeing the marker, validates the hash per call id, and
    replays the model with the verbatim original — compression behind a LiteLLM
    gateway is reversible end-to-end, with zero lean-ctx-specific client code.
    The marker shape is pinned by a contract test so drift fails CI; the hash is
    a pure function of the content, so stubs stay byte-stable (#498). The
    existing local handles (<lc_expand:…>, tee paths, /v1/references/{id})
    are unchanged.
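A minimal sketch of a content-addressed marker and lookup, under stated assumptions: the digest (SHA-256 truncated to 24 lowercase hex characters) and the in-memory `TeeStore` class are illustrative stand-ins, since the entry only fixes the `hash=<24-hex>` shape and the `{"original_content": …}` response form.

```python
import hashlib
import re

# Matches the marker shape named in the entry: hash= followed by 24 hex chars.
MARKER_RE = re.compile(r"hash=([0-9a-f]{24})\b")


def retrieval_hash(content: bytes) -> str:
    # Assumption: SHA-256 truncated to 24 hex chars; the real digest may differ.
    return hashlib.sha256(content).hexdigest()[:24]


def make_marker(content: bytes) -> str:
    return f"hash={retrieval_hash(content)}"


class TeeStore:
    """Hypothetical in-memory stand-in for the content-addressed tee store."""

    def __init__(self):
        self._blobs = {}

    def put(self, content: bytes) -> str:
        digest = retrieval_hash(content)
        self._blobs[digest] = content
        return digest

    def retrieve(self, digest: str):
        content = self._blobs.get(digest)
        if content is None:
            return None
        return {"original_content": content.decode("utf-8")}
```

Since the hash is a pure function of the content, compressing the same input twice yields the same marker, which is what keeps stubs byte-stable.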
  • Persistent per-extension grammar telemetry (GH #690 Phase 2 groundwork).
    The tiering cut needs to know which of the ~27 static tree-sitter grammars
    actually earn their binary bytes, but the only signal was a pair of
    process-lifetime counters with no language dimension (flagged by @getappz).
    core/grammar_usage now records tree-sitter vs regex-fallback hits per file
    extension, persisted across sessions in grammar_usage.json (aggregate
    counters only — no paths or project data). ctx_metrics shows the all-time
    top extensions in its SIGNATURE BACKEND section.

Fixed

  • Multi-window MCP starts can no longer trip the crash-loop backoff
    (GH #694 follow-up — thanks @ITFinesse).
    The crash-loop guard counts
    server starts in a 60s window, but a healthy burst — N editor windows each
    spawning a server, plus the client's own retries while a slow host
    initializes — could cross the threshold with zero crashes. The resulting
    pre-handshake backoff sleep (up to 30s) then caused the very
    "Waiting for server to respond to initialize request" timeouts it exists
    to prevent, wedging the second window. A completed MCP handshake now clears
    the start history (a handshake proves binary + config are healthy; true
    crash loops die before it), so only genuinely crashing servers back off.
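The guard logic described above can be sketched as a sliding window that a successful handshake resets. The threshold value and the in-memory list are assumptions of this sketch; the real guard has to persist its history across server processes.

```python
class StartHistory:
    """Counts server starts in a sliding window; a handshake clears it."""

    def __init__(self, window_seconds=60, threshold=5):
        self.window = window_seconds
        self.threshold = threshold  # hypothetical value, not from the source
        self.starts = []

    def record_start(self, now):
        # Drop starts that fell out of the window, then add the new one.
        self.starts = [t for t in self.starts if now - t < self.window]
        self.starts.append(now)

    def should_back_off(self):
        return len(self.starts) >= self.threshold

    def handshake_completed(self):
        # A completed handshake proves the binary and config are healthy.
        self.starts.clear()
```

With this shape, a burst of healthy window starts stops counting toward the threshold as soon as any one of them completes its handshake.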
  • VS Code Insiders is now a first-class MCP target (GH #694 follow-up —
    thanks @ITFinesse).
    Insiders keeps a fully separate profile dir
    (Code - Insiders/User), so registering lean-ctx in stable's
    Code/User/mcp.json left Insiders with an empty MCP: Open User Configuration — exactly the "server missing in one window" confusion from
    the multi-window report. setup/init now detect and write the dedicated
    Insiders config on all platforms (agent key vscode-insiders), doctor
    lists it as its own MCP location, and uninstall cleans it up.
  • Grammar-addon dylibs refuse to load from world-writable dirs/files
    (GH #690 review point 3, PR #697 — thanks @getappz).
    A group/other-
    writable grammar dir would let any local account swap the dylib between
    hash check and dlopen; the loader now rejects that layout outright.
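For illustration, a POSIX permission check of this kind might look like the sketch below. The function names are hypothetical, and the actual loader may inspect more than the immediate parent directory.

```python
import os
import stat


def mode_is_group_or_other_writable(mode):
    """True when the group or other write bit is set in a st_mode value."""
    return bool(mode & (stat.S_IWGRP | stat.S_IWOTH))


def safe_to_load(dylib_path):
    """Reject a dylib when it or its parent directory is group/other-writable."""
    parent = os.path.dirname(os.path.abspath(dylib_path))
    for path in (parent, dylib_path):
        if mode_is_group_or_other_writable(os.stat(path).st_mode):
            return False
    return True
```

Checking the directory matters as much as the file: a writable directory lets another account replace the file by rename even when the file itself is read-only.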
  • ctx_read gains repo param parity in multi-repo mode (GH #696,
    PR #698 — thanks @getappz).
    ctx_search/ctx_glob/ctx_tree could
    already target a registered root via repo=<alias>, but ctx_read could
    not — you could find a file in another root yet not read it. Read-only by
    design (ctx_edit/ctx_patch stay session-rooted until undo history is
    multi-repo-aware); unknown aliases error with the list of known ones, and
    jail + secret screening apply against the resolved repo root.
  • A corrupt stats.json is quarantined, never silently reset (GH #706,
    thanks @getappz).
    A crash mid-write (or disk-full) could leave truncated
    JSON; the loader's unwrap_or_default() then wiped months of savings
    history without a trace on the next write. Unparseable stats now move to
    stats.json.corrupt (one warning log; the file is evidence and stays
    recoverable by hand), and doctor reports the quarantine with recovery
    guidance instead of everyone silently starting from zero.
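The quarantine-instead-of-reset pattern can be sketched as follows. This is a simplified Python analogue, not the Rust loader; returning an empty dict for a fresh start is an assumption.

```python
import json
import logging
import os


def load_stats(path):
    """Load stats JSON; quarantine an unparseable file instead of resetting it."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}  # first run: nothing to quarantine
    except ValueError:
        # json.JSONDecodeError and UnicodeDecodeError both derive from ValueError.
        quarantine = path + ".corrupt"
        os.replace(path, quarantine)  # keep the evidence, recoverable by hand
        logging.warning("stats file unparseable; moved to %s", quarantine)
        return {}
```

Note that `os.replace` overwrites an earlier quarantine file of the same name; a production version might add a timestamp suffix to keep every copy.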
  • Relative paths follow a mid-session worktree switch (GH #707 — thanks
    @getappz).
    project_root is captured once at MCP initialize; when the
    client later enters a git worktree (Claude Code EnterWorktree nests a
    full checkout under .claude/worktrees/<n>/), every relative path kept
    resolving into the stale root — silently, because the same layout exists
    in both trees. Resolution now walks both shell_cwd and project_root up
    to their nearest .git entry (dir or worktree file); when the boundaries
    differ, the live shell_cwd wins. A plain cd rust/ inside the same
    checkout shares the boundary and is untouched, and a shell_cwd with no
    git upward gives no signal — so the monorepo behavior stays exactly as
    before.
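A sketch of the boundary comparison described above, using hypothetical function names. Returning the raw `shell_cwd` when the boundaries differ is one literal reading of "the live shell_cwd wins"; the real resolver may anchor on the worktree root instead.

```python
from pathlib import Path


def git_boundary(start):
    """Return the nearest directory at or above start that has a .git entry."""
    current = Path(start).resolve()
    for candidate in (current, *current.parents):
        # .git is a directory in a normal checkout and a file in a worktree.
        if (candidate / ".git").exists():
            return candidate
    return None


def resolution_base(project_root, shell_cwd):
    """Pick the base directory for resolving relative paths."""
    root_boundary = git_boundary(project_root)
    cwd_boundary = git_boundary(shell_cwd)
    if cwd_boundary is None:
        return Path(project_root).resolve()  # no git upward: no signal
    if cwd_boundary != root_boundary:
        return Path(shell_cwd).resolve()     # diverged: the live cwd wins
    return Path(project_root).resolve()      # same checkout: unchanged
```

A `cd rust/` inside the same checkout yields equal boundaries, so the captured project root keeps winning, matching the monorepo case in the entry.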
  • ctx_read raw mode no longer swallows markdown table delimiters
    (GH #709 — thanks @getappz).
    The output sanitizer's symbol-flood guard
    (meant for degenerate model output like @@@@@@…) also matched legitimate
    document structure — |----|----| delimiter rows, ====/---- setext
    underlines and HR lines vanished from raw reads, breaking the mode's
    byte-fidelity contract. Structural characters no longer count toward the
    flood check, and a removed flood line no longer eats the file's trailing
    newline.
  • ctx_shell's explicit cwd param now updates the live shell cwd
    (GH #707 follow-up).
    The worktree-divergence detection reads
    session.shell_cwd, but that field only tracked cd commands inside
    command text — clients that switch checkouts pass the new directory as the
    cwd argument of every call, so the switch was invisible to path
    resolution. A jail-accepted explicit cwd is now persisted, verified
    end-to-end over a real MCP session (read resolves into the worktree copy
    after ctx_shell cwd=<worktree>).
  • lean-ctx stop/dev-install no longer SIGTERM their own process tree
    (GH #714).
    Run under the lean-ctx shell wrapper (lean-ctx -c … → sh → lean-ctx dev-install), the process sweep matched the wrapper parent and
    killed the pipeline mid-install (exit 143) — after the binary swap but
    before autostart was re-enabled. The sweep now excludes the full
    ps ppid ancestor chain and every member of its own foreground process
    group — agent harnesses (Cursor's shell) reparent intermediaries to PID 1
    mid-run, which broke the ancestor walk alone; the group covers the wrapper
    regardless of reparenting. Verified: dev-install under the Cursor agent
    shell now completes end-to-end, including autostart re-enable.
  • Unknown MCP tool names now suggest the nearest registered tool
    (GH #712 — thanks @getappz).
    ctx_serach returned a bare "Unknown tool"
    while the CLI has long offered "did you mean" for typos; the
    Levenshtein suggester is now shared (core::levenshtein) and the MCP
    dispatch error appends "— did you mean 'ctx_search'?" within a
    length-scaled edit budget, so agents self-correct in one turn instead of
    falling back to native tools.
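To make the "length-scaled edit budget" idea concrete, here is a small sketch. The budget formula (one edit per four characters, minimum one) is a hypothetical choice; the entry does not state the scaling that core::levenshtein uses.

```python
def levenshtein(a, b):
    """Classic two-row dynamic-programming edit distance."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def suggest(name, registered):
    """Return the nearest registered tool within the edit budget, else None."""
    # Hypothetical budget: one edit per four characters, at least one.
    budget = max(1, len(name) // 4)
    best = min(registered, key=lambda tool: levenshtein(name, tool), default=None)
    if best is not None and levenshtein(name, best) <= budget:
        return best
    return None
```

Scaling the budget with the name length keeps short names from matching unrelated tools while still tolerating a transposition in longer ones.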

Added

  • Portable hook binary for synced agent configs (GH #708,
    hook_binary / LEAN_CTX_HOOK_BINARY).
    Generated hook commands bake
    the machine-absolute binary path (#367: agent hosts run hooks without your
    PATH). If you sync ~/.claude/settings.json between machines with
    different usernames, that absolute path is wrong on every other machine —
    and re-running init/doctor --fix there rewrites the file, ping-ponging
    your sync forever. Setting hook_binary = "$HOME/.local/bin/lean-ctx"
    (config) or LEAN_CTX_HOOK_BINARY (env) emits that expression verbatim
    into every shell-executed hook command — the hook host's shell expands it
    at run time — and doctor accepts it as current, ending the rewrite
    cycle. MCP server registrations and launchd/systemd autostart units keep
    the real absolute path: nothing expands variables there.
  • The AI Gateway (team mode). The engine can now run as a shared
    org gateway — one deployment your whole team points its IDEs at, with
    per-person attribution, governance and audited savings. Compiled into the
    default binary (gateway-server feature), local-free invariant intact:
    nothing changes for solo use until you run it.
    • lean-ctx gateway serve — multi-provider reverse proxy
      (Anthropic / OpenAI / Gemini / Ollama / custom registry) with per-person
      bearer keys, usage metering to Postgres (usage_events), wire-shape
      translation (an Anthropic-speaking IDE can call an OpenAI-hosted model and
      vice versa) and a token-protected admin console on a separate port.
    • lean-ctx gateway init — plug-and-play scaffold: docker compose,
      .env, key file and a step-by-step README in one command;
      gateway doctor preflights config, secrets, DB and ports.
    • lean-ctx gateway keys add|list|rotate|revoke — key lifecycle
      without storing plaintext (SHA-256 hashes only, shown once).
      rotate (GL enterprise#67) replaces every key of a person in one atomic
      file swap — no window where the person has zero valid keys — and keeps
      team/project attribution.
    • GET /v1/models (GL enterprise#63) — the curated org model catalog
      from [proxy.routing.aliases], content-negotiated: OpenAI-shape and
      Anthropic-shape clients each get their native list format. IDEs discover
      org names like zuehlke/fast; the gateway resolves the alias, injects
      upstream credentials and stamps routed_from into the ledger.
    • /me personal usage view (GL enterprise#64/#65) — each person signs
      in with their own gateway key and sees exactly their spend, savings,
      trend, models and projects — never anyone else's. Dark/light, 24h–90d
      windows, savings-share KPI.
    • Signed org-policy gates (GL enterprise#25/#66) — under a signed,
      pinned, enforced = true org policy the forward path refuses:
      models outside the [routing].allowed_models ceiling (403), spend above
      [budgets] caps per person/UTC-day or project/UTC-month (429), and — new —
      requests beyond [budgets].max_requests_per_minute_per_person (429 with
      an honest Retry-After of the seconds until the minute rolls). Errors
      arrive in the caller's wire shape; refusals are counted on
      leanctx_policy_blocked_total{reason="model_ceiling"|"budget"|"rate_limit"}.
      Without an enforced org policy every gate is a no-op.
    • Evidence & GDPR (GL enterprise#36/#39) — usage retention windows,
      Ed25519-signed evidence exports (gateway evidence / evidence verify),
      person-scoped gateway gdpr export|delete, and Blake3 pseudonymization
      for person identifiers at rest.
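The per-person rate-limit gate in the org-policy item above, with its Retry-After of "the seconds until the minute rolls", suggests a fixed one-minute window. The sketch below assumes exactly that; the class name and the in-memory counter are hypothetical, and old windows are never pruned here.

```python
def retry_after_seconds(now_epoch):
    """Seconds until the current fixed one-minute window rolls over (1..60)."""
    return 60 - int(now_epoch) % 60


class MinuteRateLimiter:
    """Fixed-window counter keyed by person and minute index."""

    def __init__(self, max_requests_per_minute):
        self.limit = max_requests_per_minute
        self.counts = {}  # (person, minute index) -> requests seen

    def check(self, person, now_epoch):
        """Return (http_status, retry_after); retry_after is None when allowed."""
        key = (person, int(now_epoch) // 60)
        used = self.counts.get(key, 0)
        if used >= self.limit:
            return 429, retry_after_seconds(now_epoch)
        self.counts[key] = used + 1
        return 200, None
```

With a fixed window the Retry-After value is exact: a caller that waits that long lands in a fresh window with a full budget.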
  • Multi-window visibility (GH #694). lean-ctx doctor no longer claims
    "no active session" when sessions exist for other workspaces: run from a
    directory that isn't an open project root it now reports
    none for this directory — recent: frontend (4m ago), backend (1h ago),
    naming every workspace with a live session. The dashboard overview gains a
    "Connected workspaces" panel (new /api/workspaces endpoint) listing each
    project with status (active <10 min, idle <24 h, stale), last activity,
    tokens saved and current task — shown as soon as two or more workspaces
    have sessions.

Added

  • Grammar addons: long-tail tree-sitter grammars as signed runtime dylibs
    (GH #690 Phase 1, PR #695 — thanks @getappz).
    Structural understanding no
    longer has to be compiled in: an extension not covered by the 27 built-in
    grammars can now resolve through a SHA-256-pinned, per-platform grammar
    dylib that is dlopen'd at runtime — manifest + curated registry
    (data/grammar_registry.json, user-overridable under the same signed-
    override policy as the addon registry), a loader that verifies the hash pin
    on every load plus the tree-sitter ABI version before handing the
    grammar to the parser, a five-platform CI build matrix, and a zero-config
    fetch on first use. Fully offline-safe: no addon installed (or no network,
    or addons.policy = locked, or the new addons.grammar_auto_fetch = false
    for strict-egress orgs) degrades to the regex-signature fallback exactly as
    before. Installed dylibs land read-only and ad-hoc-signed on macOS; every
    fetch is logged with its source URL. The registry ships empty — which of
    the 27 static grammars (if any) move to the addon tier is a separate,
    telemetry-gated Phase 2 decision.

Changed

  • The heredoc-to-interpreter refusal now hands the agent the recovery path
    (GL #1161).
    Policy review outcome: the block stays — inline code embedded
    in the command string never exists as an inspectable artifact, while a
    script file passes the write path's own guards and leaves an audit trail.
    But the old message ("Use a script file instead") left agents rediscovering
    the workaround by trial and error; the refusal now spells it out: write the
    code to a file (Write/ctx_edit), then python3 /tmp/snippet.

Fixed

  • A transient roots/list failure no longer disables project-root detection
    for the whole MCP session (GH #694).
    The first tool call resolves client
    roots exactly once; when that single attempt failed (e.g. the IDE window was
    still starting up — the VS Code second-window pattern), the server never
    asked again and fell back to cwd guessing for the session's lifetime. Failed
    attempts now re-arm resolution for up to 3 tries; a -32601 Method not found
    (client declares the capability but doesn't implement it — Cursor) still
    gives up immediately, and roots/list_changed restores the retry budget.
  • dev-install on Windows no longer hard-fails with ACCESS_DENIED while an
    IDE holds the old binary open (GH #691).
    The final swap did a bare
    replace-rename, which Windows refuses for as long as any process runs the old
    image — and dev-install deliberately never kills the IDE-owned MCP server
    (#1036), so no retry budget could ever succeed (measured: identical failure
    after 60 s). The install now uses the rustup-style sidecar swap: the running
    binary is renamed aside to lean-ctx.old.exe (allowed for mapped images),
    the fresh binary lands at the real path, and the sidecar is reclaimed on the
    next install once its holder exited. If even the rename-aside is blocked
    (AV/EDR-style zero-sharing lock), the error now explains the cause and the
    fix instead of a bare OS error code. Thanks @getappz for the measurement
    work in #691/#692.
  • ctx_share handovers with org agent ids (team:alice) are now pullable on
    Windows.
    The share filename embedded the agent id verbatim; NTFS interprets
    : as an Alternate Data Stream, so the write "succeeded" but the file never
    appeared in the store — the receiving agent saw "No shared contexts for you".
    Filenames now use a filesystem-safe slug ([A-Za-z0-9._-], everything else
    -); the true agent id still lives inside the JSON payload.
  • Background knowledge writers can no longer clobber facts a parallel
    remember just committed (lost-update, #326 class).
    The consolidation
    pipeline (apply_artifacts_to_stores) and the gateway memory adapter
    (addon_memory ingest) both did load → modify → blind save() from a
    background thread; a fact committed between their load and save was silently
    dropped — surfacing as flaky "no current fact exists" errors on
    ctx_knowledge relate right after a successful remember. Both writers now
    go through ProjectKnowledge::mutate_locked like every other writer.
  • CI: three timing/environment flakes hardened. The
    session_lock_timeout prompt-timeout bounds (400 ms) fired falsely on loaded
    Windows runners — the assertion only distinguishes "timed out" from "hung",
    so the bound is now 5 s; the lock-ordering check now skips #[cfg(test)]-gated
    statics (test-only locks need no production lock-ordering documentation); the
    two production gateway locks from enterprise#25 (SNAPSHOT, LEDGER) are
    documented in LOCK_ORDERING.md (L58/L59).
  • max_ram_percent is now actually enforced under Cursor/MCP load — no more
    75 GB OOM-kill-respawn cycles (GH #685).
    Two compounding gaps, both closed:
    Uncontrolled build growth: the parallel BM25/graph index builds fanned the
    whole corpus across the rayon pool in one shot — on a 1M+-file multi-root
    setup the transient build state outran the 3 s memory guardian straight into
    the kernel OOM killer. Builds now run in 2000-file batches with a guardian
    check between batches (order-preserving, so indexes stay byte-identical —
    equivalence-tested), a new admission gate (index_admission) degrades
    corpora whose estimated peak exceeds the RSS headroom to the sequential
    build up front, and extra workspace roots are indexed one at a time on a
    single supervisor thread instead of up to 8 concurrent graph+BM25 pairs.
    Eviction blind spots: the eviction orchestrator reasoned over session-cache
    token utilization, which cannot see the HNSW/ANN graph, the resident trigram
    search indexes or the materialized graph indexes — under Hard/Critical RSS
    pressure it could conclude "nothing to do" while those structures dominated
    RSS. RSS pressure now enforces a floor action (Hard ⇒ unload indices,
    Critical ⇒ emergency drop), and UnloadIndices/EmergencyDrop additionally
    clear the ANN cache (new ann_cache::clear() + memory_usage_bytes()), the
    resident search indexes (search_index::clear_resident()) and the graph
    cache. All evicted structures rebuild transparently on next use.
  • sed/awk file dumps are verbatim output — no more dictionary-mangled
    source (GH #688).
    A range-print like sed -n '10,50p' file.ps1 fell into
    the generic terse pipeline, whose dictionary layer word-substitutes code
    identifiers with no code-awareness (function → fn, return → ret, bare
    else lines dropped) — corrupting code read via sed/awk instead of cat.
    sed/awk/gawk/mawk/nawk now classify as file viewers like
    cat/head/tail. In-place edits are excluded via a token-based flag check
    (-i, -i.bak, -ni clusters, --in-place[=suffix], gawk -i inplace) —
    deliberately NOT a substring match, so filenames like my-input.txt or
    data-import.csv can't silently re-enter the terse pipeline. Byte-exact
    regression test with the original PowerShell repro. Thanks @getappz for the
    report and the PR the fix is based on (#689).
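A token-based flag check in the spirit of this fix could look like the sketch below. It is an approximation: the set of sed options that consume an argument is an assumption, and the function name is hypothetical.

```python
# Assumption: these short sed options take an argument, so any letters that
# follow them in the same token belong to that argument, not to the cluster.
SED_OPTS_WITH_ARG = "efl"


def is_in_place_edit(argv):
    """Token-based check for sed/awk in-place editing flags."""
    program = argv[0].rsplit("/", 1)[-1]
    args = argv[1:]
    for index, token in enumerate(args):
        if token == "--":
            break  # everything after "--" is an operand, never a flag
        if program == "sed":
            if token == "--in-place" or token.startswith("--in-place="):
                return True
            if token.startswith("-") and not token.startswith("--"):
                for letter in token[1:]:
                    if letter == "i":
                        return True  # covers -i, -i.bak and clusters like -ni
                    if letter in SED_OPTS_WITH_ARG:
                        break
        elif program in ("gawk", "awk"):
            if token == "-i" and args[index + 1:index + 2] == ["inplace"]:
                return True
    return False
```

Because only tokens that start with a hyphen are inspected, operands such as my-input.txt never reach the flag logic, which is the property the substring approach lacked.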
  • setup no longer panics when a client's MCP-instructions cap lands inside
    a multi-byte character (GH #680).
    The Claude Code / CodeBuddy 2048-char
    truncation used a raw byte slice; when the cut fell inside an em-dash the
    whole setup crashed ("end byte index 2048 is not a char boundary",
    live-reported at setup level 3, step 3/13). The cut now backs up to the
    previous char boundary (truncate_instructions, unit-tested with the exact
    crash shape).
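The boundary back-off can be illustrated on raw UTF-8 bytes. This is a Python analogue of the idea, not the Rust truncate_instructions function itself.

```python
def truncate_utf8(text, max_bytes):
    """Truncate to at most max_bytes of UTF-8 without splitting a character."""
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return text
    cut = max_bytes
    # Continuation bytes have the form 0b10xxxxxx. While the byte at the cut
    # is one of them, the cut sits inside a character, so back up.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut].decode("utf-8")
```

An em-dash occupies three bytes in UTF-8, so a cap that lands on its second or third byte backs up to just before the dash.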
  • doctor no longer false-flags a working OpenCode install (GH #686).
    Two gaps: has_lean_ctx_mcp_entry only walked mcp.servers.lean-ctx, but
    OpenCode's schema (opencode.ai/config.json) nests servers DIRECTLY under
    mcp — the direct-child form is now recognized too; and OpenCode was absent
    from the SKILL.md candidate list (checked:
    ~/.config/opencode/skills/lean-ctx/SKILL.md) — it is now both checked by
    doctor AND installed by
    install_all_skills when OpenCode is detected, so check and installer can't
    drift apart.
  • Anchored line-1 edits of UTF-8-BOM files no longer conflict forever
    (GH #683 follow-up).
    With ctx_read stripping the BOM (output honesty #683),
    the anchor hash the model holds for line 1 is over the BOM-less text — but
    ctx_patch validated anchors against the raw preimage, so the hashes could
    never match and every retry conflicted again. The edit side now validates
    against the same BOM-less view and re-prepends the BOM on write (the BOM is
    an encoding artifact of the file, not of the edit).
  • Shell allowlist no longer splits commands at backslash-escaped operators
    (GL #1160).
    In restricted (allowlisted) mode, rg -n split\.label\|foo src/
    was split at the escaped pipe, so the pattern fragment after it was validated
    — and blocked — as an unknown command (field report: rg dying with
    "not in the allowlist" on regex tokens, exit 126). The operator scanner,
    the subshell-paren walker and the substitution detector now honour bash
    backslash semantics outside single quotes: \|, \;, \&, \(, \) and
    \$( are data, never operators. Real (unescaped) pipes still split and
    every segment is still validated — over-blocking removed, deny-by-default
    unchanged. Also drops a dead pipe-index scanner from
    check_pipe_to_bare_interpreter.
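A reduced sketch of escape-aware splitting, limited to the single pipe operator (the real scanner also handles ;, &, parentheses and substitutions, and this sketch does not treat || specially). The function name is hypothetical.

```python
def split_pipeline(command):
    """Split on unescaped '|' outside quotes; escaped pipes stay data."""
    segments, current = [], []
    in_single = in_double = False
    i = 0
    while i < len(command):
        ch = command[i]
        if ch == "\\" and not in_single and i + 1 < len(command):
            current.append(command[i:i + 2])  # keep the escape pair verbatim
            i += 2
            continue
        if ch == "'" and not in_double:
            in_single = not in_single
        elif ch == '"' and not in_single:
            in_double = not in_double
        elif ch == "|" and not in_single and not in_double:
            segments.append("".join(current).strip())
            current = []
            i += 1
            continue
        current.append(ch)
        i += 1
    segments.append("".join(current).strip())
    return segments
```

Each returned segment would still go through the allowlist, so removing the false split does not loosen the deny-by-default policy.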
  • Marked-block surgery no longer eats user content when a marker is quoted
    in prose (GL #1158).
    marked_block (and the Claude/CodeBuddy
    remove_block twin) located <!-- lean-ctx --> markers via substring
    search, so a documentation sentence like (see the `<!-- lean-ctx -->` block below) anchored the block replacement at the prose mention and
    silently deleted everything down to the real end marker — live-reproduced
    on this repo's own AGENTS.md, where a session-start heal wiped ~75 lines
    (Development Workflow, Session Continuity, Provider Pipeline, Quality Bar).
    Markers now match only as whole (trimmed) lines — the exact shape every
    writer emits — and the end marker is searched strictly after the start
    line, so stray end markers above the block can't create bogus spans.
    All upsert/replace/remove trigger checks (hooks/mod.rs,
    hooks/support.rs, rules_dedup) use the same line-based predicate;
    prose mentions are now invisible to the block machinery. Regression tests
    cover the exact live-repro shape.
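The whole-line matching rule can be sketched as below. The start marker is the one quoted in this entry; the end-marker text is a hypothetical placeholder because the entry does not spell it out.

```python
START = "<!-- lean-ctx -->"
END = "<!-- /lean-ctx -->"  # hypothetical end marker, for illustration only


def find_block(lines):
    """Return (start, end) indices of the marked block, or None."""
    start = next(
        (i for i, line in enumerate(lines) if line.strip() == START), None)
    if start is None:
        return None
    # Search strictly after the start line so stray end markers above it
    # cannot create a bogus span.
    end = next(
        (i for i in range(start + 1, len(lines)) if lines[i].strip() == END),
        None)
    if end is None:
        return None
    return start, end


def replace_block(text, new_body):
    """Swap the block body, leaving everything outside the markers intact."""
    lines = text.split("\n")
    span = find_block(lines)
    if span is None:
        return text
    start, end = span
    return "\n".join(lines[:start + 1] + new_body.split("\n") + lines[end:])
```

A sentence that merely quotes the marker never equals the marker after trimming, so prose mentions are skipped.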

Added

  • Anchored editing end-to-end — ctx_patch becomes the first-class edit path
    (#1008, "Edit Loop v1").
    The anchored editor now closes the loop the rules
    already routed: read with ctx_read(mode="anchored") (or tag hits via
    ctx_search(anchored=true)), then patch by line + hash anchor — the agent
    never reproduces old text byte-for-byte, saving output tokens (~5x input cost)
    on every edit.
    • Advertised where it earns its tokens: ctx_patch joins the lazy core
      and the standard profile (now 16 tools). Client-aware quirks keep the
      default surface lean — clients with a reliable native editor (Cursor, Zed,
      Windsurf, Antigravity, OpenCode) skip it and pay zero extra schema tokens;
      Claude Code, CodeBuddy, pi/SDK and headless clients get it. Pinned profiles
      are client-agnostic and always include it.
    • Schema diet: the advertised ctx_patch schema shrank ~625 → ~263
      tokens; rarely-used params (expected_md5, backup, validate_syntax,
      evidence) stay supported but are no longer advertised.
    • op=create: ctx_patch can create new files (strictly new — existing
      files are refused; not mixable with anchored ops in one batch), so MCP-only
      harnesses get the complete edit story from one tool.
    • Guidance coherence: Claude/CodeBuddy pointer blocks (v5/v3, keeping the
      MCP-aware guard semantics of v4/v2), agent templates, skills and per-editor
      guides now teach anchored-editing-first; ctx_edit (str_replace) is
      documented as the legacy power-profile fallback. New troubleshooting FAQ:
      "Where did ctx_edit go?".
    • Edit-efficiency metering (honest, #361-style): a separate metric
      channel measures the anchored-editing claim per applied op —
      tokens(replaced span) − tokens(anchor args), i.e. output the model did
      not re-emit — plus stale-anchor CONFLICT retries, against the
      str_replace baseline (old_string tokens paid, old_string misses).
      Never estimated, never folded into the read-gain ledger, never printed in
      tool bodies (#498). Surfaced in ctx_metrics, /api/stats → edit_efficiency and a dashboard ROI "Edit Efficiency" card
      (~/.lean-ctx/edit_metering.json). Contract:
      docs/contracts/edit-metering-v1.md.
    • A/B benchmark, reliability + cost: the hermetic edit_reliability
      suite fixes identical mechanical bugs across 5 languages with both tools —
      anchored 10/10 vs minimal str_replace 5/10 (recovering to 10/10 only by
      paying extra recalled context), and ~41% fewer argument output tokens on
      identical successful fixes (tiny-span exceptions reported honestly).
  • Hook-aware Cursor guidance — the honest profile (GL #1153–#1157). On
    hosts whose installed lean-ctx hooks already compress the native tools
    (Cursor: PreToolUse rewrite covers Shell, redirect covers Read/Grep),
    the injected ~/.cursor/rules/lean-ctx.mdc now carries a new
    HookCovered profile instead of the full mapping: it states that native
    Shell/Read/Grep are compressed transparently (using them is fine) and
    advertises only the capabilities with no native equivalent (ctx_compose,
    ctx_symbol/ctx_callgraph, ctx_semantic_search,
    ctx_knowledge/ctx_session, ctx_expand). Rationale: Cursor's harness
    makes native tools first-class, so a "NEVER use native" rule there is
    unenforceable and only produces instruction dissonance — the model follows
    neither rulebook consistently. The MCP initialize anchor for covered
    Cursor sessions is reworded the same way. Detection is conservative
    (both PreToolUse entries must be present; invalid/missing hooks.json
    falls back to the full Dedicated mapping), the byte-exact drift check
    re-syncs the profile when hooks are installed or removed later, and the
    Cursor hook installer now honours shadow_mode/compression_level
    instead of hardcoding them (GL #1156). ~55% smaller Cursor rules payload
    on hook-covered installs, billed every session.
  • Guard-safe re-read dedup for Claude Code / CodeBuddy (GL #1140, follow-up
    to #637).
    read_redirect = auto keeps the read-before-write guard intact
    by letting native Read run on the real path — which also forfeited the Read
    dedup savings on those hosts. A new PostToolUse hook (lean-ctx hook read-dedup, matcher Read only) wins them back without touching the guard:
    the result of a re-read of an unchanged, already-read file is replaced with
    a compact [unchanged] stub via the documented updatedToolOutput channel.
    First reads stay byte-identical (edit safety: old_string always comes from
    real content), the incoming response shape is mirrored with only the content
    field swapped (unknown shapes pass through), every failure path fails open,
    replacement happens only when strictly smaller, a host compaction
    (PreCompact) purges the session's records so post-compaction re-reads
    deliver full content again, and Cursor's double-fired hooks are recognised by
    tool_use_id so a duplicate first read is never mistaken for a re-read.
    Config read_dedup = auto | on | off (env LEAN_CTX_READ_DEDUP); auto
    (default) activates only on guard hosts, where the PreToolUse redirect is
    off. Verified end-to-end against headless claude -p 2.1.139: first read
    byte-identical, second read served as the ~40-token stub, native Edit of the
    same file still passes the read-before-write gate.
  • Hybrid multi-repo search (Context Hub, GL#1133). ctx_multi_repo action=search now runs the full hybrid stack per root — BM25 + dense
    embeddings + SPLADE boost + graph ranks, the same pipeline as single-root
    semantic search — and fuses the per-root rankings with RRF (identical key and
    score semantics as before, so fusion behavior is unchanged; only the per-root
    signal got stronger). A root with a cold dense index degrades to its BM25
    ranking with a warning instead of failing or inline-embedding under the query
    (#512 semantics). mode="bm25" forces the legacy lexical-only path,
    byte-identical to the previous output.
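For readers unfamiliar with RRF (reciprocal rank fusion), here is a generic sketch of the fusion step. The constant k = 60 is the conventional default from the literature and an assumption here; the entry does not state the value or the key format lean-ctx uses.

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked lists: each key scores the sum of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, key in enumerate(ranking, start=1):
            scores[key] = scores.get(key, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken by key for a deterministic order.
    return sorted(scores, key=lambda key: (-scores[key], key))
```

Because RRF only looks at ranks, a root that falls back to BM25 still contributes on the same scale as a root with the full hybrid stack.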

Changed

  • Benchmark numbers refreshed & self-footprint made a headline metric (#659).
    BENCHMARKS.md regenerated with v3.8.18 (map 98.1% / signatures 96.7% on the
    50-file corpus; cold start 2.69s → 0.67s). The README benchmarks section now
    also states lean-ctx's own fixed per-session cost (~2.1K tok, CI-gated via
    doctor overhead --gate) and links the deterministic dual-arm self-verify
    (digest f5ed145e61ce3689) with its methodology. The CGB self-assessment
    (C2 — Managed) is surfaced from the README security section and Journey 13.

Fixed

  • dev-install honours redirected cargo target dirs (GH #671). Both
    rust/dev-install.sh and the lean-ctx dev-install command located the
    built binary at a hardcoded target/release/…; with CARGO_TARGET_DIR or a
    ~/.cargo/config.toml [build] target-dir override (one shared build cache
    across worktrees) they silently symlinked/installed a stale or missing
    binary. The target dir is now resolved via cargo metadata (env, config
    files and workspace settings all honoured) with a ./target fallback, the
    shell script fails loudly when the binary is absent instead of planting a
    dead symlink on PATH, the Rust path gained the same resolution plus the
    Windows .exe suffix, and tests/pre_release_check.sh follows suit.
    Follow-up: install.sh's source-build path (served at
    leanctx.com/install.sh) had the same hardcode and could link a stale
    binary from an earlier default-layout build — it now resolves via
    cargo metadata identically and names the override in its error hint.
    Thanks @getappz for the report and the initial
    fix (#672)!
  • pi-lean-ctx ships with zero runtime npm dependencies (GH #670). pi
    installs every package into one shared npm prefix and re-reifies the whole
    tree on each pi install/pi remove; an interrupted rewrite (Windows
    AV/file locks) stranded zod/v3/locales/en.js and the extension failed to
    load — unrepairable by reinstalling, because npm never re-extracts a package
    whose version matches. The MCP SDK (incl. zod) is now vendored as one
    self-contained bundle (extensions/vendor/mcp-sdk.cjs, built at prepack),
    so no corruptible dependency tree exists in the first place. Verified by an
    isolation smoke: bundle in an empty dir, real initialize + tools/list
    roundtrip, plus a jiti-loaded co-install with pi-markdown-preview.
  • MCP server answers initialize before doing housekeeping (GH #669).
    Orphan-process sweep (one ps per running lean-ctx), proxy autostart (TCP
    probe + detached spawn) and the throttled savings-recap publish ran in front
    of the stdio transport bind — on a cold WSL2 / VS Code Server start this
    widened the window in which VS Code's start-on-demand first tool call races
    server readiness and dies with Cannot read properties of undefined (reading 'invoke') (upstream: microsoft/vscode#321150). That work is now
    deferred onto the blocking pool, concurrent with the handshake; a
    time_to_initialize_ms log line makes the span measurable, lean-ctx doctor surfaces the upstream race on WSL2 + VS Code setups, and a
    regression test drives the exact race pattern (tools/call immediately after
    the initialized notification) against the real binary.
  • Zero-config golden path: onboard --yes now leaves doctor fully green.
    Three healers that silently disagreed are aligned: the session-start heal
    installs the agent SKILL.md files alongside rules (previously doctor
    warned "run: lean-ctx setup" forever), doctor's shell-hook probe honours a
    relocated LEAN_CTX_CONFIG_DIR (no more false "pipe guard missing"), and
    setup/onboard detect Claude Code / CodeBuddy by their state dir
    (~/.claude/, ~/.codebuddy/) exactly like doctor and the rules injector
    do — killing the dead loop where doctor pointed at setup but setup
    skipped the client. A new integration gate (onboard_doctor_clean) runs the
    full journey in an isolated HOME and asserts doctor exits green.
  • ctx_knowledge remember never stalls on the embedding model again. The
    first remember on a fresh install used to block up to the 120s tool
    watchdog while the ~30MB embedding model downloaded. It now uses non-blocking
    engine access: the fact commits immediately, the engine warms up in the
    background.
  • Semantic recall self-heals missing vectors. Facts written by the
    consolidation/ETL writers (and by remember while the engine is still
    warming up) never got an embedding, and only a manual embeddings_reindex
    repaired that — on a live machine most projects sat at 0 vectors, invisible
    to mode=semantic recall. remember now backfills up to 32 missing vectors
    per call (one batched inference, most-valuable-first, under the per-project
    lock), so active projects converge to full coverage without any manual step.
  • minimal_overhead=true (the default) is now documented honestly: session
    continuity is delivered via the AUTO CONTEXT block on the first tool call
    (prompt-cache-friendly) instead of an ACTIVE SESSION block at initialize.
  • CLAUDE.md block v4: MCP-aware guidance (GL #1138, second half of #637).
    The injected CLAUDE.md/CODEBUDDY.md block recommended ctx_read-first and a
    ctx_edit fallback unconditionally — in sessions without a connected
    lean-ctx MCP server those tools do not exist, stranding agents on shell
    heredocs. The block (v4 / CodeBuddy v2, session-heal updates existing
    installs) now scopes every ctx_* recommendation to "when the ctx_* MCP tools
    are listed in this session", documents native Read/Edit as the primary
    editing path under the read-before-write gate, and says explicitly to use
    native tools throughout when no ctx_* tools are available. doctor gains an
    Instructions/MCP consistency check (GL #1139) that flags the hazardous
    combination — instructions advertising ctx_* while no lean-ctx entry is
    registered in the Claude MCP config — with a lean-ctx setup repair hint.
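
The dev-install fix above (GH #671) resolves the build output directory through `cargo metadata` rather than assuming `target/release/…`. A minimal sketch of that idea follows, assuming only that `cargo metadata --format-version 1` reports a `target_directory` field. The helper names `resolve_target_dir` and `release_binary`, and the injectable `run` parameter, are hypothetical and exist only for this example.

```python
import json
import subprocess
import sys
from pathlib import Path


def resolve_target_dir(project_dir, run=subprocess.run):
    """Return cargo's effective target dir, falling back to <project>/target."""
    try:
        out = run(
            ["cargo", "metadata", "--format-version", "1", "--no-deps"],
            cwd=project_dir, capture_output=True, text=True, check=True,
        )
        return Path(json.loads(out.stdout)["target_directory"])
    except (OSError, subprocess.CalledProcessError, ValueError, KeyError):
        # cargo missing, command failed, or output unparsable: use the default layout.
        return Path(project_dir) / "target"


def release_binary(project_dir, name="lean-ctx", run=subprocess.run):
    """Locate the release binary and fail loudly when it is absent."""
    suffix = ".exe" if sys.platform == "win32" else ""
    binary = resolve_target_dir(project_dir, run) / "release" / (name + suffix)
    if not binary.is_file():
        raise FileNotFoundError(f"built binary not found: {binary}")
    return binary
```

Raising when the file is missing mirrors the "fail loudly instead of planting a dead symlink" behaviour described in the entry; the caller only links or installs a path that actually exists.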

Security

  • ctx_call can no longer bypass egress DLP or permission inheritance. The
    guarded dispatch path unwraps ctx_call(name=…, args=…) and runs both checks
    against the inner tool and its arguments (nested ctx_call is already
    refused by the handler). Egress payload extraction is centralized in one
    helper shared by the MCP server and lean-ctx policy enforce, and now also
    covers ctx_patch write bodies (new_text, new_body, ops[].new_text).
    prefer_native_editor (#454) now hides/refuses ctx_patch alongside
    ctx_edit.
  • Bundled addons now spawn with a scrubbed environment (addon env isolation).
    Every runnable registry addon (Headroom, Sophon, Repomix, Serena, …) now
    declares a [capabilities] block. Its mere presence flips the single gateway
    spawn point from the legacy "inherit the full host environment" path to the
    scrubbed path (env_clear + base allowlist), so host API keys/tokens no longer
    reach an untrusted addon child process. Network/filesystem grants are declared
    honestly to match each tool's real needs (registry fetch, cache/index/vault
    writes) — the empty env allowlist is the isolation win. A regression test
    asserts every runnable bundled addon carries a capability block.
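
To make the scrubbed-environment idea above concrete, here is a small sketch of spawning a child process from an allowlist instead of inheriting the host environment. It is an illustration only: `BASE_ALLOWLIST`, `scrubbed_env` and `spawn_addon` are hypothetical names, and the variables in the base allowlist are assumed, since the notes do not list the real ones.

```python
import os
import subprocess

# Hypothetical base allowlist; the real list is not given in the release notes.
BASE_ALLOWLIST = ("PATH", "HOME", "LANG", "TMPDIR")


def scrubbed_env(host_env, extra_allowed=()):
    """Build a child environment from an allowlist instead of inheriting everything."""
    allowed = set(BASE_ALLOWLIST) | set(extra_allowed)
    return {k: v for k, v in host_env.items() if k in allowed}


def spawn_addon(argv, extra_allowed=()):
    """Run a child that sees only allow-listed variables (env_clear + allowlist idea)."""
    return subprocess.run(
        argv,
        env=scrubbed_env(os.environ, extra_allowed),
        capture_output=True, text=True, check=False,
    )
```

Under this model an addon that genuinely needs one secret, such as the `LLM_API_KEY` passthrough mentioned for Cognee, would receive it via `extra_allowed=("LLM_API_KEY",)` while every other host variable stays out of the child.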

Added

  • Doc corpora as first-class retrieval sources (Context Hub, GL#1132). The
    artifact index now ingests PDF (panic-safe local text extraction; a
    scanned or malformed PDF becomes a warning, not a failed build), and the
    artifact registry (.lean-ctx-artifacts.json) accepts absolute/~ paths
    so external doc folders — an Obsidian vault, ~/notes — become searchable
    corpora. PathJail stays the gate: external entries resolve only when
    allow-listed (read_only_roots / extra_roots / LEAN_CTX_ALLOW_PATH); a
    leading slash that matches an existing project path keeps its legacy
    project-relative meaning. New CLI flag semantic-search --artifacts searches
    the doc corpus; new guide docs/guides/docs-sources.md. Determinism guard:
    re-indexing an unchanged corpus is byte-identical (#498).
  • pgvector dense backend (Context Hub, GL#1136). Teams that already operate
    PostgreSQL can point the dense half of hybrid retrieval at it:
    LEANCTX_PGVECTOR_URL=postgres://… (or LEANCTX_DENSE_BACKEND=pgvector)
    stores embeddings in per-project, per-dimension vector(N) tables — same
    namespacing, point-id scheme and delete-by-file incremental sync as the
    qdrant backend, so switching backends never mixes identities. Implemented
    through the psql CLI (zero new crate dependencies, mirrors the postgres
    provider); rows return as per-line JSON for robust parsing; identifiers and
    literals are strictly validated/escaped. The qdrant + pgvector features
    join the default feature set, so release binaries support all three backends
    out of the box; a live end-to-end test (pgvector_e2e_round_trip, --ignored)
    verifies table creation, cosine search, incremental replace and quote-escaping
    against a real pgvector container. Guide: docs/guides/dense-backends.md.
  • Addon registry: qmd + memgraph-ingester (Context Hub, GL#1134). Two
    community tools from the Discord retrieval thread are now 1-command installs:
    qmd (on-device Markdown/notes search — BM25 + vectors + reranking, via
    npx -y @tobilu/qmd@2.5.3 mcp) and memgraph-ingester (structure-aware RAG
    on a Memgraph code graph, via uvx memgraph-ingester-mcp==0.6.6; needs a
    running Memgraph). Both ship scrubbed-env capability blocks; the memgraph
    Bolt-URI/read-only toggles joined the reviewed env passthrough allowlist.
  • Docs: the context-infrastructure map (GL#1135). New
    docs/guides/context-infrastructure.md (sources → one pipeline → hybrid
    retrieval → OKF/ctxpkg portability → addons) and
    docs/guides/dense-backends.md documenting the previously undocumented
    Qdrant dense backend (LEANCTX_DENSE_BACKEND, LEANCTX_QDRANT_URL/_API_KEY
    /_TIMEOUT_SECS/_COLLECTION_PREFIX) next to the default in-process store.
  • Portable OKF knowledge export/import (knowledge export --format okf /
    knowledge import <dir>, ctx_knowledge). Renders facts, patterns and typed
    relations from one shared KnowledgeSnapshot to the vendor-neutral Open
    Knowledge Format (git-diffable Markdown + YAML, relations as Markdown links) or
    the signed .ctxpkg bundle. Round-trips byte-identically, accepts foreign OKF
    bundles, and never leaves dangling relations. Fully local and free.
  • Addon registry version-staleness check (scripts/check-addon-versions.py).
    Resolves every pinned upstream (PyPI / npm / NuGet / crates.io) against its
    registry and reports drift as GitHub annotations. Wired into a dedicated,
    non-blocking Addon Registry Freshness workflow (weekly + whenever the registry
    changes) so a curated pin is never silently stale — and an upstream release
    never breaks our own build.
  • Cognee is now 1-click installable (addon add cognee). It ships a published
    MCP package (cognee-mcp) and runs fully local by default (SQLite + LanceDB +
    Kuzu), so it fits the standard uv tool install bootstrap; the only runtime
    requirement is an LLM_API_KEY, which is passed through via a reviewed
    single-entry capability allowlist (all other host env stays scrubbed). The
    remaining memory/graph listings (mem0, graphiti, zep, letta, claude-context)
    stay directory-only because they need external infrastructure (a vector/graph
    DB, or a managed account) that a one-command install cannot provision.
  • session new aliases session reset (#653). lean-ctx session new now clears
    the active session just like session reset, matching the "start a new session"
    mental model; covered by a CLI characterization test.
  • Deterministic markdown compaction + progress-log folding in aggressive
    compression (#655). .md/.markdown/.mdown reads (and .txt files that
    carry a real ATX heading) are structurally compacted: every heading survives,
    fenced code blocks are atomic (kept verbatim or dropped whole, never split by
    an omission marker), and body lines are ranked by an IDF-style scorer over
    ordered token sets so the output is byte-stable (#498). Shell compression now
    folds repetitive cargo/pytest/package-manager progress runs into stable
    markers while still honoring the verbatim token cap — diagnostics stay
    verbatim, oversized logs keep safety-needle preservation. Thanks @ousatov-ua!
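
The markdown compaction entry above describes three properties: headings always survive, fenced code blocks are never split, and body lines are ranked deterministically. The sketch below shows one way those properties could fit together. It is a simplified assumption-laden illustration, not the shipped algorithm: the scoring formula, the `max_body_lines` budget and the name `compact_markdown` are invented here, and this version always keeps code blocks rather than optionally dropping them whole.

```python
import math
import re

FENCE = "`" * 3  # built this way so the sketch can itself sit inside a fenced block


def compact_markdown(text, max_body_lines=2):
    """Keep every heading and code block; keep only the top-scoring body lines."""
    lines = text.splitlines()
    kinds = []  # per line: "heading", "code" or "body"
    in_fence = False
    for line in lines:
        if line.lstrip().startswith(FENCE):
            kinds.append("code")      # fence delimiters belong to the block
            in_fence = not in_fence
        elif in_fence:
            kinds.append("code")      # block contents are never scored or split
        elif re.match(r"#{1,6}\s", line):
            kinds.append("heading")   # ATX heading
        else:
            kinds.append("body")

    # IDF-style score over each non-blank body line's sorted token set.
    body = [i for i, k in enumerate(kinds) if k == "body" and lines[i].strip()]
    tokens = {i: sorted(set(re.findall(r"\w+", lines[i].lower()))) for i in body}
    df = {}
    for i in body:
        for t in tokens[i]:
            df[t] = df.get(t, 0) + 1
    n = len(body)
    score = {i: sum(math.log(1 + n / df[t]) for t in tokens[i]) for i in body}

    # Highest score first; the line index breaks ties so output is reproducible.
    keep = set(sorted(body, key=lambda i: (-score[i], i))[:max_body_lines])
    return "\n".join(l for i, l in enumerate(lines) if kinds[i] != "body" or i in keep)
```

Iterating over sorted token sets and breaking ties by line index keeps the result independent of hash ordering, which is the kind of property a byte-stable output requires.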

Changed

  • RMCP SDK upgraded 1.7 → 2.0 (MCP 2025-11-25 alignment, #656). The MCP
    server/client stack now builds on rmcp 2.0: Content is the spec-unified
    ContentBlock, prompt roles use the shared Role, resources are plain
    Resource structs, and progress notifications use the new constructor API.
    Pulls in rmcp 2.0's security fixes (OAuth resource-spoofing/metadata-SSRF
    hardening, streamable-HTTP session-leak fix) and unlocks 2025-11-25 protocol
    features (tool icons, URL-mode elicitation, tasks) for future releases.
    Protocol negotiation with older clients (2025-06-18 and earlier) is
    unchanged — verified end-to-end over stdio against the 1.7 baseline (identical
    tool surface, identical negotiated protocol). Client-facing roots-based
    project-root auto-detection stays in place (SEP-2577 deprecation
    acknowledged upstream, still fully functional).
  • Refreshed bundled addon pins to current upstream: Headroom 0.27.0 → 0.28.0,
    Repomix 1.15.0 → 1.16.0.

Fixed

  • Zero-config first-session frictions closed after a fresh-install E2E audit
    (#658). A scripted fresh journey (isolated $HOME, real MCP handshake like
    an editor) surfaced eight frictions; all are fixed with regression tests:
    auto-findings now parse the pre-decoration tool output, so the injected
    --- AUTO CONTEXT --- header can no longer become a junk Read --- finding
    polluting session memory and every wakeup briefing (F1); setup/onboard
    --help prints help instead of executing setup side effects (F2);
    ctx_call with misspelled keys (tool/args/params) fails with the
    exact fix instead of silently dispatching without arguments (F3);
    ctx_knowledge remember derives a deterministic key slug when key is
    omitted and accepts content= as value alias — matching what our own
    injected instructions document (F4); Rust call edges inside macro bodies
    (println!, assert_eq!, …) are extracted at the token level, so a fresh
    Rust project no longer reports 0 edges (F5); the project-overview header
    surfaces persisted call-graph edges instead of contradicting ctx_callgraph
    with 0 edges (F6); bare ctx_knowledge recall lists recent facts instead
    of erroring (F7); and ctx_session show is accepted as a synonym of
    status (F8).
  • MCP PathJail auto-corrects a stale markerless root instead of rejecting the
    workspace (#649). An MCP server launched by VS Code/WSL could adopt a
    markerless client cwd (e.g. /mnt/c/Users/<user>) as its jail root; the first
    absolute path into the real workspace on another mount was then rejected with
    path escapes project root, breaking ctx_compose / ctx_read / ctx_patch.
    resolve_path now reroots, with no opt-in required, from such a markerless root to the
    marker-bearing project derived from the requested path — the same rationale as
    the agent-config-dir case (#580) — while a markerless target with no derivable
    project stays blocked, so PathJail enforcement is unchanged.
  • Local daemon IPC no longer 401s on tool calls (#651, #652). The daemon writes
    an auto-generated auth token, but the IPC client (Unix domain socket / Windows
    named pipe) sends no Authorization header, so /v1/tools/call failed with 401
    while /health passed. Router construction is now split: TCP HTTP keeps Bearer
    auth, while local IPC serving disables the HTTP Bearer — the socket/pipe is already
    a user-local OS boundary (Unix 0o600, user-specific pipe name). TCP auth is
    unchanged, a regression test guards the IPC path, and a security review found no
    weakening of network auth.
  • Codex stops reconstructing compressed shell output in chunks (#625, #654). The
    SessionStart hint now states plainly that compressed output is not exact evidence
    and hard-requires re-running lean-ctx raw "<exact command>" for exact content,
    forbidding chunked reconstruction (cat/sed/head/tail) and quoting
    compressed output as exact — so Codex uses the reversible raw escape instead of
    re-reading the compressed view piecemeal.
  • Enterprise/OS TLS roots are honored by every HTTP client (#643). All ureq
    clients are now built through core::http_client, which injects
    RootCerts::PlatformVerifier so requests trust the system/enterprise trust store
    instead of only the bundled WebPKI roots — fixing UnknownIssuer failures behind
    TLS-intercepting corporate proxies (updates, version check, embeddings download,
    Qdrant, Datadog/FinOps export, LLM enhance, SSO/billing, web fetch, webhooks).
  • Shell hook is quiet by default (#646). The activation notice (lean-ctx: ON …)
    no longer prints on every new interactive terminal; mode-change notices now route
    through a _lean_ctx_notice helper that speaks only when LEAN_CTX_DEBUG=1 (and
    stdout is a TTY). lean-ctx-status still reports the current state on demand.
  • doctor recognizes its own running dashboard on port 3333 (#644). The
    dashboard port check reported a conflict whenever port 3333 was busy — even when
    the occupant was lean-ctx's own dashboard. It now probes /api/version on bind
    failure and treats the port as healthy only when the response is the dashboard's
    own version JSON; unrelated services still surface the conflict. Implemented by
    strengthening and reusing the dashboard's existing dashboard_responding probe,
    so the browser-open guard and doctor share one source of truth.
  • Native Read no longer breaks Claude Code's read-before-write guard (#637).
    The PreToolUse redirect hook rewrote a native Read to a temp .lctx copy, so
    Claude Code's Write/Edit read-before-write guard tracked the temp path and a
    follow-up native Write/Edit to the real file failed with "File has not been read
    yet" — worst in headless claude -p, with no supported off-switch (the hook
    self-healed back into settings.json). A new read_redirect = auto | on | off
    key (env LEAN_CTX_READ_REDIRECT) now governs the Read redirect and is evaluated
    per hook fire, so it also covers headless runs and never fights the self-heal. The
    default auto disables only the Read path-swap on hosts carrying that guard —
    Claude Code / CodeBuddy, detected inside the hook via the CLAUDE_PROJECT_DIR
    marker Claude Code exports to every hook subprocess (CLAUDECODE / CODEBUDDY are
    honored too) — so native Read → Write/Edit works out of the box; the ctx_read MCP
    tool and the Grep/Glob redirects keep compressing. on restores always-redirect;
    off disables the Read redirect everywhere.
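
The read_redirect entry above (#637) boils down to a per-invocation decision. The following sketch models that decision as a pure function over the hook's environment. The variable names come from the entry itself; the function name `read_redirect_enabled`, the env-over-config precedence and the treatment of unknown values as `auto` are assumptions for illustration.

```python
def read_redirect_enabled(env, config_value="auto"):
    """Decide, per hook invocation, whether a native Read should be redirected."""
    # Assumed precedence: the environment variable overrides the config key.
    mode = (env.get("LEAN_CTX_READ_REDIRECT") or config_value).strip().lower()
    if mode == "on":
        return True
    if mode == "off":
        return False
    # "auto" (and, by assumption, any unknown value): skip the path swap on
    # hosts that enforce a read-before-write guard.
    guarded_host = any(
        env.get(name) for name in ("CLAUDE_PROJECT_DIR", "CLAUDECODE", "CODEBUDDY")
    )
    return not guarded_host
```

Evaluating this on every hook fire, rather than once at install time, is what lets the setting apply to headless runs without touching the hook registration that self-heals in settings.json.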

Upgrade

lean-ctx update                 # recommended (auto-downloads + refreshes shell hooks)
cargo install lean-ctx          # or
npm update -g lean-ctx-bin      # or
brew upgrade lean-ctx

Note: After upgrading via cargo/npm/brew, run lean-ctx setup to refresh shell aliases. lean-ctx update does this automatically.

Full Changelog: v3.9.0...v3.9.0
