Changed
- The shell hook is now transparent in plain human terminals: default
activation is `agents-only` (GH #699). With the old `always` default the
hook aliased git/docker/kubectl in every interactive shell, so a human in
a plain terminal (no agent anywhere) saw lean-ctx allowlist diagnostics for
their own commands. lean-ctx exists to save agent tokens; the aliases now
auto-activate only when an agent session is detected (`LEAN_CTX_AGENT`,
`CURSOR_AGENT` (newly recognized across every guard), `CLAUDECODE`,
`CODEBUDDY`, `CODEX_CLI_SESSION`, `GEMINI_SESSION`). Set
`shell_activation = "always"` (or `LEAN_CTX_SHELL_ACTIVATION=always`) to
keep the old behavior, e.g. to feed your own shell usage into
`lean-ctx wrapped`; `lean-ctx-on` still opts a single session in manually.
The "[CLI] Command would be blocked in MCP mode" allowlist diagnostic is
also downgraded to debug level for interactive TTY callers — it's agent
telemetry, not human feedback. Thanks @DerPate for the precise report.
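
The activation rule in the first entry above can be sketched as follows. This is a minimal illustration, not lean-ctx's implementation: the function names are hypothetical, and both "any non-empty marker counts" and "the environment override wins over the config value" are assumptions. Only the variable names and the `agents-only` / `always` values come from the entry.

```python
import os
from typing import Mapping, Optional

# Agent markers named in the changelog entry.
AGENT_ENV_VARS = (
    "LEAN_CTX_AGENT",
    "CURSOR_AGENT",
    "CLAUDECODE",
    "CODEBUDDY",
    "CODEX_CLI_SESSION",
    "GEMINI_SESSION",
)


def agent_session_detected(env: Mapping[str, str]) -> bool:
    """Assumption: any non-empty marker variable counts as an agent session."""
    return any(env.get(name) for name in AGENT_ENV_VARS)


def should_alias(env: Mapping[str, str], configured: Optional[str] = None) -> bool:
    """Decide whether the hook installs its aliases (hypothetical helper).

    Assumption: LEAN_CTX_SHELL_ACTIVATION overrides the config value.
    """
    mode = env.get("LEAN_CTX_SHELL_ACTIVATION") or configured or "agents-only"
    if mode == "always":
        return True
    return agent_session_detected(env)


if __name__ == "__main__":
    print(should_alias(os.environ))
```

Passing the environment in as a mapping keeps the decision a pure function, which makes the "plain human terminal" case easy to test.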
Added
- `/v1/compress` is wire-compatible with LiteLLM's prompt-compression
guardrail (GH #700). LiteLLM ≥ v1.92 can call a compression sidecar during
`pre_call` (`guardrail: headroom`); the response now carries the
`tokens_before`/`tokens_after`/`compression_ratio` telemetry fields that
guardrail logs, alongside the existing richer `stats` block. Point the
guardrail's `api_base` at the lean-ctx daemon and every request through a
LiteLLM gateway is compressed deterministically (prompt-cache-safe, #498),
with no client change, including Claude Code via `ANTHROPIC_BASE_URL`. Cookbook:
`docs/guides/compress-sdk.md`.
- Provider-verified savings receipts (GH #701, opt-in
`proxy.counterfactual_metering`). Wire savings were estimated (bytes/4 or
local tokenizer). With metering on, every request the proxy rewrites also
fires Anthropic's free `count_tokens` endpoint with the original,
uncompressed body (concurrently with the real forward, spawned detached so
it can never delay, mutate or fail the request) and pairs the
provider-counted "would have billed N" with the same response's actually
billed usage. Same request, same moment: no traffic-mix confound
(methodology adopted from pxpipe's counterfactual metering). `/status` gains
a `verified_savings` block and `lean-ctx proxy status` a `Verified:` line
beside the estimate; per-model pairs persist across restarts in
`proxy_usage.json` (pre-#701 files load unchanged). Net-negative results are
reported signed, never clamped. Anthropic only (no free counting endpoint
elsewhere); probe failures silently degrade the row to the estimate.
- CCR round-trips through LiteLLM's agentic loop (GH #702). A lossy
`/v1/compress` rewrite now advertises its retrieval hash in the guardrail's
regex-locked `hash=<24-hex>` form, and the new `GET /v1/retrieve/{hash}`
endpoint resolves it from the content-addressed tee store
({"original_content": …}). LiteLLM (BerriAI/litellm#31681) injects its
retrieve tool on seeing the marker, validates the hash per call id, and
replays the model with the verbatim original: compression behind a LiteLLM
gateway is reversible end-to-end, with zero lean-ctx-specific client code.
The marker shape is pinned by a contract test so drift fails CI; the hash is
a pure function of the content, so stubs stay byte-stable (#498). The
existing local handles (`<lc_expand:…>`, tee paths, `/v1/references/{id}`)
are unchanged.
- Persistent per-extension grammar telemetry (GH #690 Phase 2 groundwork).
The tiering cut needs to know which of the ~27 static tree-sitter grammars
actually earn their binary bytes, but the only signal was a pair of
process-lifetime counters with no language dimension (flagged by @getappz).
`core/grammar_usage` now records tree-sitter vs regex-fallback hits per file
extension, persisted across sessions in `grammar_usage.json` (aggregate
counters only, with no paths or project data). `ctx_metrics` shows the all-time
top extensions in its SIGNATURE BACKEND section.
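
The CCR round-trip entry above relies on the retrieval hash being a pure function of the content. A minimal sketch of that idea follows. The digest choice (SHA-256 truncated to 24 hex characters), the `TeeStore` class, `compress_stub` and the bracketed placement of the marker are all assumptions for illustration; only the `hash=<24-hex>` shape and the `{"original_content": …}` response shape come from the entry.

```python
import hashlib
import re
from typing import Dict, Optional

# Mirrors the hash=<24-hex> marker shape named in the entry.
MARKER_RE = re.compile(r"hash=([0-9a-f]{24})")


def content_hash(original: str) -> str:
    """Assumption: SHA-256 truncated to 24 hex chars; the real digest is not stated."""
    return hashlib.sha256(original.encode("utf-8")).hexdigest()[:24]


class TeeStore:
    """Toy in-memory content-addressed store (hypothetical)."""

    def __init__(self) -> None:
        self._blobs: Dict[str, str] = {}

    def put(self, original: str) -> str:
        key = content_hash(original)
        self._blobs[key] = original
        return key

    def retrieve(self, key: str) -> Optional[dict]:
        original = self._blobs.get(key)
        if original is None:
            return None
        # Same response shape the entry quotes for the retrieve endpoint.
        return {"original_content": original}


def compress_stub(store: TeeStore, original: str, summary: str) -> str:
    """Return a lossy stub that advertises where the original can be fetched."""
    return f"{summary} [hash={store.put(original)}]"
```

Because the key depends only on the content, compressing the same body twice yields the same stub, which is what keeps a prompt cache valid across requests.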
Fixed
- Multi-window MCP starts can no longer trip the crash-loop backoff
(GH #694 follow-up — thanks @ITFinesse). The crash-loop guard counts
server starts in a 60s window, but a healthy burst — N editor windows each
spawning a server, plus the client's own retries while a slow host
initializes — could cross the threshold with zero crashes. The resulting
pre-handshake backoff sleep (up to 30s) then caused the very
"Waiting for server to respond to `initialize` request" timeouts it exists
to prevent, wedging the second window. A completed MCP handshake now clears
the start history (a handshake proves binary + config are healthy; true
crash loops die before it), so only genuinely crashing servers back off.
- VS Code Insiders is now a first-class MCP target (GH #694 follow-up,
thanks @ITFinesse). Insiders keeps a fully separate profile dir
(`Code - Insiders/User`), so registering lean-ctx in stable's
`Code/User/mcp.json` left Insiders with an empty
`MCP: Open User Configuration`: exactly the "server missing in one window"
confusion from the multi-window report. `setup`/`init` now detect and write
the dedicated Insiders config on all platforms (agent key `vscode-insiders`),
`doctor` lists it as its own MCP location, and uninstall cleans it up.
- Grammar-addon dylibs refuse to load from world-writable dirs/files
(GH #690 review point 3, PR #697, thanks @getappz). A group/other-writable
grammar dir would let any local account swap the dylib between
hash check and `dlopen`; the loader now rejects that layout outright.
- `ctx_read` gains `repo` param parity in multi-repo mode (GH #696,
PR #698, thanks @getappz). `ctx_search`/`ctx_glob`/`ctx_tree` could
already target a registered root via `repo=<alias>`, but `ctx_read` could
not: you could find a file in another root yet not read it. Read-only by
design (`ctx_edit`/`ctx_patch` stay session-rooted until undo history is
multi-repo-aware); unknown aliases error with the list of known ones, and
jail + secret screening apply against the resolved repo root.
- A corrupt
`stats.json` is quarantined, never silently reset (GH #706,
thanks @getappz). A crash mid-write (or disk-full) could leave truncated
JSON; the loader's `unwrap_or_default()` then wiped months of savings
history without a trace on the next write. Unparseable stats now move to
`stats.json.corrupt` (one warning log; the file is evidence and stays
recoverable by hand), and `doctor` reports the quarantine with recovery
guidance instead of everyone silently starting from zero.
- Relative paths follow a mid-session worktree switch (GH #707, thanks
@getappz). `project_root` is captured once at MCP `initialize`; when the
client later enters a git worktree (Claude Code `EnterWorktree` nests a
full checkout under `.claude/worktrees/<n>/`), every relative path kept
resolving into the stale root, silently, because the same layout exists
in both trees. Resolution now walks both `shell_cwd` and `project_root` up
to their nearest `.git` entry (dir or worktree file); when the boundaries
differ, the live `shell_cwd` wins. A plain `cd rust/` inside the same
checkout shares the boundary and is untouched, and a `shell_cwd` with no
git upward gives no signal, so the monorepo behavior stays exactly as
before.
- `ctx_read` raw mode no longer swallows markdown table delimiters
(GH #709, thanks @getappz). The output sanitizer's symbol-flood guard
(meant for degenerate model output like `@@@@@@…`) also matched legitimate
document structure: `|----|----|` delimiter rows, `====`/`----` setext
underlines and HR lines vanished from raw reads, breaking the mode's
byte-fidelity contract. Structural characters no longer count toward the
flood check, and a removed flood line no longer eats the file's trailing
newline.
- `ctx_shell`'s explicit `cwd` param now updates the live shell cwd
(GH #707 follow-up). The worktree-divergence detection reads
`session.shell_cwd`, but that field only tracked `cd` commands inside
command text; clients that switch checkouts pass the new directory as the
`cwd` argument of every call, so the switch was invisible to path
resolution. A jail-accepted explicit `cwd` is now persisted, verified
end-to-end over a real MCP session (read resolves into the worktree copy
after `ctx_shell cwd=<worktree>`).
- `lean-ctx stop`/`dev-install` no longer SIGTERM their own process tree
(GH #714). Run under the lean-ctx shell wrapper
(`lean-ctx -c … → sh → lean-ctx dev-install`), the process sweep matched the
wrapper parent and killed the pipeline mid-install (exit 143), after the
binary swap but before autostart was re-enabled. The sweep now excludes the
full `ps ppid` ancestor chain and every member of its own foreground process
group: agent harnesses (Cursor's shell) reparent intermediaries to PID 1
mid-run, which broke the ancestor walk alone; the group covers the wrapper
regardless of reparenting. Verified: `dev-install` under the Cursor agent
shell now completes end-to-end, including autostart re-enable.
- Unknown MCP tool names now suggest the nearest registered tool
(GH #712, thanks @getappz). `ctx_serach` returned a bare "Unknown tool"
while the CLI has long offered "did you mean" for typos; the
Levenshtein suggester is now shared (core::levenshtein) and the MCP
dispatch error appends "— did you mean 'ctx_search'?" within a
length-scaled edit budget, so agents self-correct in one turn instead of
falling back to native tools.
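
The "did you mean" behavior in the last entry can be sketched as below. This is an illustration under stated assumptions, not the shared `core::levenshtein` code: the budget formula `max(1, len(name) // 3)`, the function names and the exact message wording are hypothetical. Only the idea (nearest registered tool within a length-scaled edit budget) comes from the entry.

```python
from typing import Iterable, Optional


def levenshtein(a: str, b: str) -> int:
    """Classic two-row dynamic-programming edit distance."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def suggest(name: str, registered: Iterable[str]) -> Optional[str]:
    """Return the nearest registered tool within a length-scaled budget.

    Assumption: budget = max(1, len(name) // 3); the real scaling is not stated.
    """
    budget = max(1, len(name) // 3)
    best = min(registered, key=lambda tool: levenshtein(name, tool), default=None)
    if best is not None and levenshtein(name, best) <= budget:
        return best
    return None


def unknown_tool_error(name: str, registered: Iterable[str]) -> str:
    """Build the dispatch error, appending a hint only when one is close enough."""
    hint = suggest(name, list(registered))
    message = f"Unknown tool '{name}'"
    return f"{message} (did you mean '{hint}'?)" if hint else message
```

Scaling the budget with the name length keeps short names from matching unrelated tools while still tolerating a swapped pair of letters in longer ones.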
Added
- Portable hook binary for synced agent configs (GH #708,
`hook_binary`/`LEAN_CTX_HOOK_BINARY`). Generated hook commands bake
the machine-absolute binary path (#367: agent hosts run hooks without your
PATH). If you sync `~/.claude/settings.json` between machines with
different usernames, that absolute path is wrong on every other machine,
and re-running `init`/`doctor --fix` there rewrites the file, ping-ponging
your sync forever. Setting `hook_binary = "$HOME/.local/bin/lean-ctx"`
(config) or `LEAN_CTX_HOOK_BINARY` (env) emits that expression verbatim
into every shell-executed hook command (the hook host's shell expands it
at run time), and `doctor` accepts it as current, ending the rewrite
cycle. MCP server registrations and launchd/systemd autostart units keep
the real absolute path: nothing expands variables there.
- The AI Gateway (team mode). The engine can now run as a shared
org gateway — one deployment your whole team points its IDEs at, with
per-person attribution, governance and audited savings. Compiled into the
default binary (`gateway-server` feature), local-free invariant intact:
nothing changes for solo use until you run it.
  - `lean-ctx gateway serve`: multi-provider reverse proxy
    (Anthropic / OpenAI / Gemini / Ollama / custom registry) with per-person
    bearer keys, usage metering to Postgres (`usage_events`), wire-shape
    translation (an Anthropic-speaking IDE can call an OpenAI-hosted model and
    vice versa) and a token-protected admin console on a separate port.
  - `lean-ctx gateway init`: plug-and-play scaffold (docker compose,
    `.env`, key file and a step-by-step README in one command);
    `gateway doctor` preflights config, secrets, DB and ports.
  - `lean-ctx gateway keys add|list|rotate|revoke`: key lifecycle
    without storing plaintext (SHA-256 hashes only, shown once).
    `rotate` (GL enterprise#67) replaces every key of a person in one atomic
    file swap, with no window where the person has zero valid keys, and keeps
    team/project attribution.
  - `GET /v1/models` (GL enterprise#63) serves the curated org model catalog
    from `[proxy.routing.aliases]`, content-negotiated: OpenAI-shape and
    Anthropic-shape clients each get their native list format. IDEs discover
    org names like `zuehlke/fast`; the gateway resolves the alias, injects
    upstream credentials and stamps `routed_from` into the ledger.
  - `/me` personal usage view (GL enterprise#64/#65): each person signs
    in with their own gateway key and sees exactly their spend, savings,
    trend, models and projects, never anyone else's. Dark/light, 24h–90d
    windows, savings-share KPI.
  - Signed org-policy gates (GL enterprise#25/#66). Under a signed,
    pinned, `enforced = true` org policy the forward path refuses:
    models outside the `[routing].allowed_models` ceiling (403), spend above
    `[budgets]` caps per person/UTC-day or project/UTC-month (429), and (new)
    requests beyond `[budgets].max_requests_per_minute_per_person` (429 with
    an honest `Retry-After` of the seconds until the minute rolls). Errors
    arrive in the caller's wire shape; refusals are counted on
    `leanctx_policy_blocked_total{reason="model_ceiling"|"budget"|"rate_limit"}`.
    Without an enforced org policy every gate is a no-op.
  - Evidence & GDPR (GL enterprise#36/#39): usage retention windows,
    Ed25519-signed evidence exports (`gateway evidence`/`evidence verify`),
    person-scoped `gateway gdpr export|delete`, and Blake3 pseudonymization
    for person identifiers at rest.
- Multi-window visibility (GH #694). `lean-ctx doctor` no longer claims
"no active session" when sessions exist for other workspaces: run from a
directory that isn't an open project root it now reports
none for this directory — recent: frontend (4m ago), backend (1h ago),
naming every workspace with a live session. The dashboard overview gains a
"Connected workspaces" panel (new `/api/workspaces` endpoint) listing each
project with status (active <10 min, idle <24 h, stale), last activity,
tokens saved and current task, shown as soon as two or more workspaces
have sessions.
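
The workspace status thresholds in the multi-window entry (active under 10 minutes, idle under 24 hours, stale otherwise) map onto a small classifier. The sketch below is illustrative only: the function names and the strict `<` boundaries are assumptions, and it is not the code behind `/api/workspaces`.

```python
from datetime import datetime, timedelta, timezone
from typing import Mapping

ACTIVE_WINDOW = timedelta(minutes=10)
IDLE_WINDOW = timedelta(hours=24)


def workspace_status(last_activity: datetime, now: datetime) -> str:
    """Classify a workspace by the age of its last activity."""
    age = now - last_activity
    if age < ACTIVE_WINDOW:
        return "active"
    if age < IDLE_WINDOW:
        return "idle"
    return "stale"


def show_panel(session_counts: Mapping[str, int]) -> bool:
    """The panel appears once two or more workspaces have sessions."""
    return sum(1 for count in session_counts.values() if count > 0) >= 2


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(workspace_status(now - timedelta(minutes=4), now))
```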
Added
- Grammar addons: long-tail tree-sitter grammars as signed runtime dylibs
(GH #690 Phase 1, PR #695 — thanks @getappz). Structural understanding no
longer has to be compiled in: an extension not covered by the 27 built-in
grammars can now resolve through a SHA-256-pinned, per-platform grammar
dylib that is `dlopen`'d at runtime: manifest + curated registry
(`data/grammar_registry.json`, user-overridable under the same
signed-override policy as the addon registry), a loader that verifies the
hash pin on every load plus the tree-sitter ABI version before handing the
grammar to the parser, a five-platform CI build matrix, and a zero-config
fetch on first use. Fully offline-safe: no addon installed (or no network,
or `addons.policy = locked`, or the new `addons.grammar_auto_fetch = false`
for strict-egress orgs) degrades to the regex-signature fallback exactly as
before. Installed dylibs land read-only and ad-hoc-signed on macOS; every
fetch is logged with its source URL. The registry ships empty — which of
the 27 static grammars (if any) move to the addon tier is a separate,
telemetry-gated Phase 2 decision.
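
The "verify the hash pin on every load" step can be sketched as follows. This is a hedged illustration rather than the loader itself: `verify_before_load` and `GrammarRejected` are hypothetical names, the ABI version check is omitted, and the permission check simply mirrors the group/other-writable rule described in the newer hardening entry.

```python
import hashlib
import stat
from pathlib import Path


class GrammarRejected(Exception):
    """Raised when a grammar library must not be loaded."""


def sha256_file(path: Path) -> str:
    """Stream the file so large libraries are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_before_load(path: Path, pinned_sha256: str) -> Path:
    """Check permissions and the pin; a caller would load only on success."""
    mode = path.stat().st_mode
    # Assumption: any group- or other-writable bit is treated as unsafe.
    if mode & (stat.S_IWGRP | stat.S_IWOTH):
        raise GrammarRejected(f"{path} is group/other-writable")
    if sha256_file(path) != pinned_sha256.lower():
        raise GrammarRejected(f"{path} does not match its SHA-256 pin")
    return path
```

A rejection here would translate into the regex-signature fallback the entry describes, so a failed check degrades the parse quality but never blocks the read.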
Changed
- The heredoc-to-interpreter refusal now hands the agent the recovery path
(GL #1161). Policy review outcome: the block stays — inline code embedded
in the command string never exists as an inspectable artifact, while a
script file passes the write path's own guards and leaves an audit trail.
But the old message ("Use a script file instead") left agents rediscovering
the workaround by trial and error; the refusal now spells it out: write the
code to a file (Write/`ctx_edit`), then `python3 /tmp/snippet`.
Fixed
- A transient `roots/list` failure no longer disables project-root detection
for the whole MCP session (GH #694). The first tool call resolves client
roots exactly once; when that single attempt failed (e.g. the IDE window was
still starting up, the VS Code second-window pattern), the server never
asked again and fell back to cwd guessing for the session's lifetime. Failed
attempts now re-arm resolution for up to 3 tries; a `-32601 Method not found`
(client declares the capability but doesn't implement it, as Cursor does) still
gives up immediately, and `roots/list_changed` restores the retry budget.
- `dev-install` on Windows no longer hard-fails with `ACCESS_DENIED` while an
IDE holds the old binary open (GH #691). The final swap did a bare
replace-rename, which Windows refuses for as long as any process runs the old
image — and dev-install deliberately never kills the IDE-owned MCP server
(#1036), so no retry budget could ever succeed (measured: identical failure
after 60 s). The install now uses the rustup-style sidecar swap: the running
binary is renamed aside to `lean-ctx.old.exe` (allowed for mapped images),
the fresh binary lands at the real path, and the sidecar is reclaimed on the
next install once its holder exited. If even the rename-aside is blocked
(AV/EDR-style zero-sharing lock), the error now explains the cause and the
fix instead of a bare OS error code. Thanks @getappz for the measurement
work in #691/#692.
- `ctx_share` handovers with org agent ids (`team:alice`) are now pullable on
Windows. The share filename embedded the agent id verbatim; NTFS interprets
`:` as an Alternate Data Stream separator, so the write "succeeded" but the
file never appeared in the store, and the receiving agent saw
"No shared contexts for you". Filenames now use a filesystem-safe slug
(`[A-Za-z0-9._-]`, everything else becomes `-`); the true agent id still
lives inside the JSON payload.
- Background knowledge writers can no longer clobber facts a parallel
`remember` just committed (lost-update, #326 class). The consolidation
pipeline (`apply_artifacts_to_stores`) and the gateway memory adapter
(`addon_memory` ingest) both did load → modify → blind `save()` from a
background thread; a fact committed between their load and save was silently
dropped, surfacing as flaky "no current fact exists" errors on
`ctx_knowledge relate` right after a successful `remember`. Both writers now
go through `ProjectKnowledge::mutate_locked` like every other writer.
- CI: three timing/environment flakes hardened. The
`session_lock_timeout` prompt-timeout bounds (400 ms) fired falsely on loaded
Windows runners (the assertion only distinguishes "timed out" from "hung",
so the bound is now 5 s); the lock-ordering check now skips `#[cfg(test)]`-gated
statics (test-only locks need no production lock-ordering documentation); the
two production gateway locks from enterprise#25 (`SNAPSHOT`, `LEDGER`) are
documented in `LOCK_ORDERING.md` (L58/L59).
- `max_ram_percent` is now actually enforced under Cursor/MCP load: no more
75 GB OOM-kill-respawn cycles (GH #685). Two compounding gaps, both closed:
  - Uncontrolled build growth: the parallel BM25/graph index builds fanned the
    whole corpus across the rayon pool in one shot; on a 1M+-file multi-root
    setup the transient build state outran the 3 s memory guardian straight into
    the kernel OOM killer. Builds now run in 2000-file batches with a guardian
    check between batches (order-preserving, so indexes stay byte-identical;
    equivalence-tested), a new admission gate (`index_admission`) degrades
    corpora whose estimated peak exceeds the RSS headroom to the sequential
    build up front, and extra workspace roots are indexed one at a time on a
    single supervisor thread instead of up to 8 concurrent graph+BM25 pairs.
  - Eviction blind spots: the eviction orchestrator reasoned over session-cache
    token utilization, which cannot see the HNSW/ANN graph, the resident trigram
    search indexes or the materialized graph indexes; under Hard/Critical RSS
    pressure it could conclude "nothing to do" while those structures dominated
    RSS. RSS pressure now enforces a floor action (Hard ⇒ unload indices,
    Critical ⇒ emergency drop), and `UnloadIndices`/`EmergencyDrop` additionally
    clear the ANN cache (new `ann_cache::clear()` + `memory_usage_bytes()`), the
    resident search indexes (`search_index::clear_resident()`) and the graph
    cache. All evicted structures rebuild transparently on next use.
- `sed`/`awk` file dumps are verbatim output: no more dictionary-mangled
source (GH #688). A range-print like `sed -n '10,50p' file.ps1` fell into
the generic terse pipeline, whose dictionary layer word-substitutes code
identifiers with no code-awareness (`function`→`fn`, `return`→`ret`, bare
`else` lines dropped), corrupting code read via sed/awk instead of `cat`.
`sed`/`awk`/`gawk`/`mawk`/`nawk` now classify as file viewers like
`cat`/`head`/`tail`. In-place edits are excluded via a token-based flag check
(`-i`, `-i.bak`, `-ni` clusters, `--in-place[=suffix]`, gawk `-i inplace`),
deliberately NOT a substring match, so filenames like `my-input.txt` or
`data-import.csv` can't silently re-enter the terse pipeline. Byte-exact
regression test with the original PowerShell repro. Thanks @getappz for the
report and the PR the fix is based on (#689).
- `setup` no longer panics when a client's MCP-instructions cap lands inside
a multi-byte character (GH #680). The Claude Code / CodeBuddy 2048-char
truncation used a raw byte slice; when the cut fell inside an em-dash the
whole setup crashed ("end byte index 2048 is not a char boundary",
live-reported at setup level 3, step 3/13). The cut now backs up to the
previous char boundary (`truncate_instructions`, unit-tested with the exact
crash shape).
- `doctor` no longer false-flags a working OpenCode install (GH #686).
Two gaps: `has_lean_ctx_mcp_entry` only walked `mcp.servers.lean-ctx`, but
OpenCode's schema (opencode.ai/config.json) nests servers DIRECTLY under
`mcp`; the direct-child form is now recognized too. OpenCode was also absent
from the SKILL.md candidate list (checked:
`~/.config/opencode/skills/lean-ctx/SKILL.md`); it is now both checked by
doctor AND installed by `install_all_skills` when OpenCode is detected, so
check and installer can't drift apart.
- Anchored line-1 edits of UTF-8-BOM files no longer conflict forever
(GH #683 follow-up). With ctx_read stripping the BOM (output honesty #683),
the anchor hash the model holds for line 1 is over the BOM-less text, but
`ctx_patch` validated anchors against the raw preimage, so the hashes could
never match and every retry conflicted again. The edit side now validates
against the same BOM-less view and re-prepends the BOM on write (the BOM is
an encoding artifact of the file, not of the edit).
- Shell allowlist no longer splits commands at backslash-escaped operators
(GL #1160). In restricted (allowlisted) mode, `rg -n split\.label\|foo src/`
was split at the escaped pipe, so the pattern fragment after it was validated
(and blocked) as an unknown command (field report: `rg` dying with
"not in the allowlist" on regex tokens, exit 126). The operator scanner,
the subshell-paren walker and the substitution detector now honour bash
backslash semantics outside single quotes: `\|`, `\;`, `\&`, `\(`, `\)` and
`\$(` are data, never operators. Real (unescaped) pipes still split and
every segment is still validated: over-blocking removed, deny-by-default
unchanged. Also drops a dead pipe-index scanner from
`check_pipe_to_bare_interpreter`.
- Marked-block surgery no longer eats user content when a marker is quoted
in prose (GL #1158). `marked_block` (and the Claude/CodeBuddy
`remove_block` twin) located `<!-- lean-ctx -->` markers via substring
search, so a documentation sentence like
``(see the `<!-- lean-ctx -->` block below)`` anchored the block replacement
at the prose mention and silently deleted everything down to the real end
marker. This was live-reproduced on this repo's own AGENTS.md, where a
session-start heal wiped ~75 lines
(Development Workflow, Session Continuity, Provider Pipeline, Quality Bar).
Markers now match only as whole (trimmed) lines, the exact shape every
writer emits, and the end marker is searched strictly after the start
line, so stray end markers above the block can't create bogus spans.
All upsert/replace/remove trigger checks (`hooks/mod.rs`,
`hooks/support.rs`, `rules_dedup`) use the same line-based predicate;
prose mentions are now invisible to the block machinery. Regression tests
cover the exact live-repro shape.
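
The line-based marker predicate from the marked-block entry can be sketched like this. It is an illustration under assumptions, not the `marked_block` code: the end-marker text `<!-- /lean-ctx -->` is a guess (the entry names only the start marker), and the helper names are hypothetical.

```python
from typing import List, Optional, Tuple

START = "<!-- lean-ctx -->"
END = "<!-- /lean-ctx -->"  # assumption: the entry does not give the end marker's text


def find_block(lines: List[str]) -> Optional[Tuple[int, int]]:
    """Return (start, end) line indexes of the managed block, or None.

    A marker counts only when it is the whole trimmed line, and the end
    marker is searched strictly after the start line.
    """
    start = next((i for i, line in enumerate(lines) if line.strip() == START), None)
    if start is None:
        return None
    end = next((i for i in range(start + 1, len(lines)) if lines[i].strip() == END), None)
    if end is None:
        return None
    return start, end


def replace_block(text: str, new_body: str) -> str:
    """Swap the content between the markers; leave the text alone if no block exists."""
    lines = text.split("\n")
    span = find_block(lines)
    if span is None:
        return text
    start, end = span
    return "\n".join(lines[: start + 1] + new_body.split("\n") + lines[end:])
```

With this predicate, a sentence that merely quotes the marker inside prose never equals the marker after trimming, so it cannot anchor a replacement.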
Added
- Anchored editing end-to-end: `ctx_patch` becomes the first-class edit path
(#1008, "Edit Loop v1"). The anchored editor now closes the loop the rules
already routed: read with `ctx_read(mode="anchored")` (or tag hits via
`ctx_search(anchored=true)`), then patch by `line + hash` anchor. The agent
never reproduces old text byte-for-byte, saving output tokens (~5x input cost)
on every edit.
  - Advertised where it earns its tokens: `ctx_patch` joins the lazy core
    and the `standard` profile (now 16 tools). Client-aware quirks keep the
    default surface lean: clients with a reliable native editor (Cursor, Zed,
    Windsurf, Antigravity, OpenCode) skip it and pay zero extra schema tokens;
    Claude Code, CodeBuddy, pi/SDK and headless clients get it. Pinned profiles
    are client-agnostic and always include it.
  - Schema diet: the advertised `ctx_patch` schema shrank ~625 → ~263
    tokens; rarely-used params (`expected_md5`, `backup`, `validate_syntax`,
    `evidence`) stay supported but are no longer advertised.
  - `op=create`: `ctx_patch` can create new files (strictly new: existing
    files are refused; not mixable with anchored ops in one batch), so MCP-only
    harnesses get the complete edit story from one tool.
  - Guidance coherence: Claude/CodeBuddy pointer blocks (v5/v3, keeping the
    MCP-aware guard semantics of v4/v2), agent templates, skills and per-editor
    guides now teach anchored-editing-first; `ctx_edit` (str_replace) is
    documented as the legacy power-profile fallback. New troubleshooting FAQ:
    "Where did `ctx_edit` go?".
  - Edit-efficiency metering (honest, #361-style): a separate metric
    channel measures the anchored-editing claim per applied op,
    `tokens(replaced span) − tokens(anchor args)`, i.e. output the model did
    not re-emit, plus stale-anchor `CONFLICT` retries, against the
    str_replace baseline (`old_string` tokens paid, `old_string` misses).
    Never estimated, never folded into the read-gain ledger, never printed in
    tool bodies (#498). Surfaced in `ctx_metrics`,
    `/api/stats → edit_efficiency` and a dashboard ROI "Edit Efficiency" card
    (`~/.lean-ctx/edit_metering.json`). Contract:
    `docs/contracts/edit-metering-v1.md`.
  - A/B benchmark, reliability + cost: the hermetic `edit_reliability`
    suite fixes identical mechanical bugs across 5 languages with both tools:
    anchored 10/10 vs minimal str_replace 5/10 (recovering to 10/10 only by
    paying extra recalled context), and ~41% fewer argument output tokens on
    identical successful fixes (tiny-span exceptions reported honestly).
- Hook-aware Cursor guidance — the honest profile (GL #1153–#1157). On
hosts whose installed lean-ctx hooks already compress the native tools
(Cursor: PreToolUse `rewrite` covers Shell, `redirect` covers Read/Grep),
the injected `~/.cursor/rules/lean-ctx.mdc` now carries a new
`HookCovered` profile instead of the full mapping: it states that native
Shell/Read/Grep are compressed transparently (using them is fine) and
advertises only the capabilities with no native equivalent (`ctx_compose`,
`ctx_symbol`/`ctx_callgraph`, `ctx_semantic_search`,
`ctx_knowledge`/`ctx_session`, `ctx_expand`). Rationale: Cursor's harness
makes native tools first-class, so a "NEVER use native" rule there is
unenforceable and only produces instruction dissonance: the model follows
neither rulebook consistently. The MCP `initialize` anchor for covered
Cursor sessions is reworded the same way. Detection is conservative
(both PreToolUse entries must be present; invalid/missing `hooks.json`
falls back to the full `Dedicated` mapping), the byte-exact drift check
re-syncs the profile when hooks are installed or removed later, and the
Cursor hook installer now honours `shadow_mode`/`compression_level`
instead of hardcoding them (GL #1156). ~55% smaller Cursor rules payload
on hook-covered installs, billed every session.
- Guard-safe re-read dedup for Claude Code / CodeBuddy (GL #1140, follow-up
to #637). `read_redirect = auto` keeps the read-before-write guard intact
by letting native Read run on the real path, which also forfeited the Read
dedup savings on those hosts. A new `PostToolUse` hook
(`lean-ctx hook read-dedup`, matcher `Read` only) wins them back without
touching the guard: the result of a re-read of an unchanged, already-read
file is replaced with a compact `[unchanged]` stub via the documented
`updatedToolOutput` channel.
First reads stay byte-identical (edit safety: `old_string` always comes from
real content), the incoming response shape is mirrored with only the content
field swapped (unknown shapes pass through), every failure path fails open,
replacement happens only when strictly smaller, a host compaction
(PreCompact) purges the session's records so post-compaction re-reads
deliver full content again, and Cursor's double-fired hooks are recognised by
`tool_use_id` so a duplicate first read is never mistaken for a re-read.
Config `read_dedup = auto | on | off` (env `LEAN_CTX_READ_DEDUP`); `auto`
(default) activates only on guard hosts, where the PreToolUse redirect is
off. Verified end-to-end against headless `claude -p` 2.1.139: first read
byte-identical, second read served as the ~40-token stub, native Edit of the
same file still passes the read-before-write gate.
- Hybrid multi-repo search (Context Hub, GL#1133).
`ctx_multi_repo action=search` now runs the full hybrid stack per root (BM25 +
dense embeddings + SPLADE boost + graph ranks, the same pipeline as single-root
semantic search) and fuses the per-root rankings with RRF (identical key and
score semantics as before, so fusion behavior is unchanged; only the per-root
signal got stronger). A root with a cold dense index degrades to its BM25
ranking with a warning instead of failing or inline-embedding under the query
(#512 semantics). `mode="bm25"` forces the legacy lexical-only path,
byte-identical to the previous output.
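
For readers unfamiliar with RRF (reciprocal rank fusion), the fusion step named in the multi-repo search entry can be sketched generically. This is the textbook formulation, not lean-ctx's code: the constant `k = 60` is the conventional default and an assumption here, as are the function name and the use of plain string ids as fusion keys.

```python
from collections import defaultdict
from typing import Dict, List

RRF_K = 60  # assumption: conventional constant; the value lean-ctx uses is not stated


def rrf_fuse(rankings: Dict[str, List[str]], k: int = RRF_K) -> List[str]:
    """Fuse per-root ranked lists; each hit contributes 1 / (k + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranked in rankings.values():
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending, then by id so ties resolve deterministically.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

RRF uses only ranks, never raw scores, which is why a root that degrades from the hybrid stack to plain BM25 can still be fused with the others without rescaling.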
Changed
- Benchmark numbers refreshed & self-footprint made a headline metric (#659).
`BENCHMARKS.md` regenerated with v3.8.18 (map 98.1% / signatures 96.7% on the
50-file corpus; cold start 2.69s → 0.67s). The README benchmarks section now
also states lean-ctx's own fixed per-session cost (~2.1K tok, CI-gated via
`doctor overhead --gate`) and links the deterministic dual-arm self-verify
(digest `f5ed145e61ce3689`) with its methodology. The CGB self-assessment
(C2 — Managed) is surfaced from the README security section and Journey 13.
Fixed
- `dev-install` honours redirected cargo target dirs (GH #671). Both
`rust/dev-install.sh` and the `lean-ctx dev-install` command located the
built binary at a hardcoded `target/release/…`; with `CARGO_TARGET_DIR` or a
`~/.cargo/config.toml` `[build] target-dir` override (one shared build cache
across worktrees) they silently symlinked/installed a stale or missing
binary. The target dir is now resolved via `cargo metadata` (env, config
files and workspace settings all honoured) with a `./target` fallback, the
shell script fails loudly when the binary is absent instead of planting a
dead symlink on PATH, the Rust path gained the same resolution plus the
Windows `.exe` suffix, and `tests/pre_release_check.sh` follows suit.
Follow-up: `install.sh`'s source-build path (served at
leanctx.com/install.sh) had the same hardcode and could link a stale
binary from an earlier default-layout build; it now resolves via
`cargo metadata` identically and names the override in its error hint.
Thanks @getappz for the report and the initial
fix (#672)!
- pi-lean-ctx ships with zero runtime npm dependencies (GH #670). pi
installs every package into one shared npm prefix and re-reifies the whole
tree on each `pi install`/`pi remove`; an interrupted rewrite (Windows
AV/file locks) stranded `zod/v3/locales/en.js` and the extension failed to
load, unrepairable by reinstalling, because npm never re-extracts a package
whose version matches. The MCP SDK (incl. zod) is now vendored as one
self-contained bundle (`extensions/vendor/mcp-sdk.cjs`, built at `prepack`),
so no corruptible dependency tree exists in the first place. Verified by an
isolation smoke: bundle in an empty dir, real initialize + tools/list
roundtrip, plus a jiti-loaded co-install with `pi-markdown-preview`.
- MCP server answers
`initialize` before doing housekeeping (GH #669).
Orphan-process sweep (one `ps` per running lean-ctx), proxy autostart (TCP
probe + detached spawn) and the throttled savings-recap publish ran in front
of the stdio transport bind; on a cold WSL2 / VS Code Server start this
widened the window in which VS Code's start-on-demand first tool call races
server readiness and dies with
`Cannot read properties of undefined (reading 'invoke')`
(upstream: microsoft/vscode#321150). That work is now
deferred onto the blocking pool, concurrent with the handshake; a
`time_to_initialize_ms` log line makes the span measurable,
`lean-ctx doctor` surfaces the upstream race on WSL2 + VS Code setups, and a
regression test drives the exact race pattern (tools/call immediately after
the initialized notification) against the real binary.
- Zero-config golden path:
onboard --yesnow leavesdoctorfully green.
Three healers that silently disagreed are aligned: the session-start heal
installs the agent SKILL.md files alongside rules (previously doctor
warned "run: lean-ctx setup" forever), doctor's shell-hook probe honours a
relocated LEAN_CTX_CONFIG_DIR (no more false "pipe guard missing"), and
setup/onboard detect Claude Code / CodeBuddy by their state dir
(~/.claude/, ~/.codebuddy/) exactly like doctor and the rules injector
do, killing the dead loop where doctor pointed at setup but setup
skipped the client. A new integration gate (onboard_doctor_clean) runs the
full journey in an isolated HOME and asserts doctor exits green.
- ctx_knowledge remember never stalls on the embedding model again. The
first remember on a fresh install used to block up to the 120s tool
watchdog while the ~30MB embedding model downloaded. It now uses non-blocking
engine access: the fact commits immediately, the engine warms up in the
background.
- Semantic recall self-heals missing vectors. Facts written by the
consolidation/ETL writers (and by remember while the engine is still
warming up) never got an embedding, and only a manual embeddings_reindex
repaired that; on a live machine most projects sat at 0 vectors, invisible
to mode=semantic recall. remember now backfills up to 32 missing vectors
per call (one batched inference, most-valuable-first, under the per-project
lock), so active projects converge to full coverage without any manual step.
- minimal_overhead=true (the default) is now documented honestly: session
continuity is delivered via the AUTO CONTEXT block on the first tool call
(prompt-cache-friendly) instead of an ACTIVE SESSION block at initialize.
- CLAUDE.md block v4: MCP-aware guidance (GL #1138, second half of #637).
The injected CLAUDE.md/CODEBUDDY.md block recommended ctx_read-first and a
ctx_edit fallback unconditionally; in sessions without a connected
lean-ctx MCP server those tools do not exist, stranding agents on shell
heredocs. The block (v4 / CodeBuddy v2, session-heal updates existing
installs) now scopes every ctx_* recommendation to "when the ctx_* MCP tools
are listed in this session", documents native Read→Edit as the primary
editing path under the read-before-write gate, and says explicitly to use
native tools throughout when no ctx_* tools are available. doctor gains an
Instructions/MCP consistency check (GL #1139) that flags the hazardous
combination (instructions advertising ctx_* while no lean-ctx entry is
registered in the Claude MCP config) with a lean-ctx setup repair hint.
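The vector backfill described in the semantic-recall entry above (up to 32 missing vectors per call, most-valuable-first, one batched inference) can be sketched roughly as follows. This is a minimal Python illustration, not lean-ctx's actual implementation: the Fact fields, the value score, the tie-break by key and the embed_batch callback are hypothetical names chosen for the example, and the per-project lock is left out.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

BACKFILL_LIMIT = 32  # per-call cap stated in the changelog entry


@dataclass
class Fact:
    key: str
    text: str
    value: float                       # hypothetical "how valuable" score
    vector: Optional[List[float]] = None


def backfill_missing_vectors(
    facts: Sequence[Fact],
    embed_batch: Callable[[List[str]], List[List[float]]],
    limit: int = BACKFILL_LIMIT,
) -> int:
    """Embed up to `limit` facts that lack a vector, most valuable first.

    Returns the number of facts repaired. Ties are broken by key so the
    selection is deterministic (an assumption of this sketch).
    """
    missing = [f for f in facts if f.vector is None]
    missing.sort(key=lambda f: (-f.value, f.key))
    batch = missing[:limit]
    if not batch:
        return 0
    vectors = embed_batch([f.text for f in batch])  # one batched inference
    for fact, vec in zip(batch, vectors):
        fact.vector = vec
    return len(batch)
```

Calling the function repeatedly drains the backlog in chunks of at most 32, which is how an active project would converge to full coverage without a manual reindex.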
Security
- ctx_call can no longer bypass egress DLP or permission inheritance. The
guarded dispatch path unwraps ctx_call(name=…, args=…) and runs both checks
against the inner tool and its arguments (nested ctx_call is already
refused by the handler). Egress payload extraction is centralized in one
helper shared by the MCP server and lean-ctx policy enforce, and now also
covers ctx_patch write bodies (new_text, new_body, ops[].new_text).
prefer_native_editor (#454) now hides/refuses ctx_patch alongside
ctx_edit.
- Bundled addons now spawn with a scrubbed environment (addon env isolation).
Every runnable registry addon (Headroom, Sophon, Repomix, Serena, …) now
declares a [capabilities] block. Its mere presence flips the single gateway
spawn point from the legacy "inherit the full host environment" path to the
scrubbed path (env_clear + base allowlist), so host API keys/tokens no longer
reach an untrusted addon child process. Network/filesystem grants are declared
honestly to match each tool's real needs (registry fetch, cache/index/vault
writes) — the empty env allowlist is the isolation win. A regression test
asserts every runnable bundled addon carries a capability block.
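The scrubbed-environment idea from the addon entry above (clear everything, then copy back only allow-listed names) can be illustrated with a short sketch. It is written in Python for brevity and is not the gateway's real code: BASE_ALLOWLIST, scrubbed_env and child_env are hypothetical names, and the actual base allowlist is not given in these notes.

```python
from typing import Dict, Iterable, Mapping

# Hypothetical base allowlist; the real list is not stated in the changelog.
BASE_ALLOWLIST = ("PATH", "HOME", "LANG")


def scrubbed_env(
    host_env: Mapping[str, str],
    extra_allowed: Iterable[str] = (),
    base_allowlist: Iterable[str] = BASE_ALLOWLIST,
) -> Dict[str, str]:
    """Start from an empty environment and copy only allow-listed names."""
    allowed = set(base_allowlist) | set(extra_allowed)
    return {name: value for name, value in host_env.items() if name in allowed}


def child_env(
    host_env: Mapping[str, str],
    has_capabilities: bool,
    extra_allowed: Iterable[str] = (),
) -> Dict[str, str]:
    """Pick the legacy (inherit everything) or the scrubbed spawn path."""
    if not has_capabilities:
        return dict(host_env)  # legacy path: full host environment
    return scrubbed_env(host_env, extra_allowed)
```

The result would be passed as the env argument of subprocess.Popen, which replaces the child's environment instead of extending it. An addon that needs one secret would list only that name in extra_allowed.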
Added
- Doc corpora as first-class retrieval sources (Context Hub, GL#1132). The
artifact index now ingests PDF (panic-safe local text extraction; a
scanned or malformed PDF becomes a warning, not a failed build), and the
artifact registry (.lean-ctx-artifacts.json) accepts absolute/~ paths
so external doc folders (an Obsidian vault, ~/notes) become searchable
corpora. PathJail stays the gate: external entries resolve only when
allow-listed (read_only_roots/extra_roots/LEAN_CTX_ALLOW_PATH); a
leading slash that matches an existing project path keeps its legacy
project-relative meaning. New CLI flag semantic-search --artifacts searches
the doc corpus; new guide docs/guides/docs-sources.md. Determinism guard:
re-indexing an unchanged corpus is byte-identical (#498).
- pgvector dense backend (Context Hub, GL#1136). Teams that already operate
PostgreSQL can point the dense half of hybrid retrieval at it:
LEANCTX_PGVECTOR_URL=postgres://… (or LEANCTX_DENSE_BACKEND=pgvector)
stores embeddings in per-project, per-dimension vector(N) tables, with the same
namespacing, point-id scheme and delete-by-file incremental sync as the
qdrant backend, so switching backends never mixes identities. Implemented
through the psql CLI (zero new crate dependencies, mirrors the postgres
provider); rows return as per-line JSON for robust parsing; identifiers and
literals are strictly validated/escaped. The qdrant + pgvector features
join the default feature set, so release binaries support all three backends
out of the box; a live end-to-end test (pgvector_e2e_round_trip, --ignored)
verifies table creation, cosine search, incremental replace and quote-escaping
against a real pgvector container. Guide: docs/guides/dense-backends.md.
- Addon registry: qmd + memgraph-ingester (Context Hub, GL#1134). Two
community tools from the Discord retrieval thread are now 1-command installs:
qmd (on-device Markdown/notes search with BM25 + vectors + reranking, via
npx -y @tobilu/qmd@2.5.3 mcp) and memgraph-ingester (structure-aware RAG
on a Memgraph code graph, via uvx memgraph-ingester-mcp==0.6.6; needs a
running Memgraph). Both ship scrubbed-env capability blocks; the memgraph
Bolt-URI/read-only toggles joined the reviewed env passthrough allowlist.
- Docs: the context-infrastructure map (GL#1135). New
docs/guides/context-infrastructure.md (sources → one pipeline → hybrid
retrieval → OKF/ctxpkg portability → addons) and
docs/guides/dense-backends.md documenting the previously undocumented
Qdrant dense backend (LEANCTX_DENSE_BACKEND, LEANCTX_QDRANT_URL/_API_KEY
/_TIMEOUT_SECS/_COLLECTION_PREFIX) next to the default in-process store.
- Portable OKF knowledge export/import (knowledge export --format okf /
knowledge import <dir>, ctx_knowledge). Renders facts, patterns and typed
relations from one shared KnowledgeSnapshot to the vendor-neutral Open
Knowledge Format (git-diffable Markdown + YAML, relations as Markdown links) or
the signed .ctxpkg bundle. Round-trips byte-identically, accepts foreign OKF
bundles, and never leaves dangling relations. Fully local and free.
- Addon registry version-staleness check (scripts/check-addon-versions.py).
Resolves every pinned upstream (PyPI / npm / NuGet / crates.io) against its
registry and reports drift as GitHub annotations. Wired into a dedicated,
non-blocking Addon Registry Freshness workflow (weekly + whenever the registry
changes) so a curated pin is never silently stale, and an upstream release
never breaks our own build.
- Cognee is now 1-click installable (addon add cognee). It ships a published
MCP package (cognee-mcp) and runs fully local by default (SQLite + LanceDB +
Kuzu), so it fits the standard uv tool install bootstrap; the only runtime
requirement is an LLM_API_KEY, which is passed through via a reviewed
single-entry capability allowlist (all other host env stays scrubbed). The
remaining memory/graph listings (mem0, graphiti, zep, letta, claude-context)
stay directory-only because they need external infrastructure (a vector/graph
DB, or a managed account) that a one-command install cannot provision.
- session new aliases session reset (#653). lean-ctx session new now clears
the active session just like session reset, matching the "start a new session"
mental model; covered by a CLI characterization test.
- Deterministic markdown compaction + progress-log folding in aggressive
compression (#655). .md/.markdown/.mdown reads (and .txt files that
carry a real ATX heading) are structurally compacted: every heading survives,
fenced code blocks are atomic (kept verbatim or dropped whole, never split by
an omission marker), and body lines are ranked by an IDF-style scorer over
ordered token sets so the output is byte-stable (#498). Shell compression now
folds repetitive cargo/pytest/package-manager progress runs into stable
markers while still honoring the verbatim token cap — diagnostics stay
verbatim, oversized logs keep safety-needle preservation. Thanks @ousatov-ua!
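The progress-log folding mentioned in the last entry can be pictured with a small sketch: runs of repetitive progress lines collapse into one stable marker, while diagnostics pass through verbatim. This is an illustration under stated assumptions, not the shipped compressor. The PROGRESS pattern, the min_run threshold and the marker text are all hypothetical, and the verbatim token cap and safety-needle handling are not modelled.

```python
import re
from typing import List

# Hypothetical progress patterns; the real matcher is not described here.
PROGRESS = re.compile(r"^\s*(Compiling|Downloading|Downloaded|Installing)\b")


def fold_progress(lines: List[str], min_run: int = 3) -> List[str]:
    """Collapse runs of >= min_run progress lines into one marker line.

    Non-progress lines (diagnostics) are always kept verbatim, and the
    marker depends only on the input, so the output is reproducible.
    """
    out: List[str] = []
    run: List[str] = []

    def flush() -> None:
        if len(run) >= min_run:
            verb = PROGRESS.match(run[0]).group(1)
            out.append(f"[{len(run)} progress lines folded, first: {verb}]")
        else:
            out.extend(run)  # short runs are not worth folding
        run.clear()

    for line in lines:
        if PROGRESS.match(line):
            run.append(line)
        else:
            flush()
            out.append(line)
    flush()
    return out
```

Because the marker carries only a count and the first verb, folding the same log twice yields identical bytes, which is the property the determinism guard (#498) asks for.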
Changed
- RMCP SDK upgraded 1.7 → 2.0 (MCP 2025-11-25 alignment, #656). The MCP
server/client stack now builds on rmcp 2.0: Content is the spec-unified
ContentBlock, prompt roles use the shared Role, resources are plain
Resource structs, and progress notifications use the new constructor API.
Pulls in rmcp 2.0's security fixes (OAuth resource-spoofing/metadata-SSRF
hardening, streamable-HTTP session-leak fix) and unlocks 2025-11-25 protocol
features (tool icons, URL-mode elicitation, tasks) for future releases.
Protocol negotiation with older clients (2025-06-18 and earlier) is
unchanged — verified end-to-end over stdio against the 1.7 baseline (identical
tool surface, identical negotiated protocol). Client-facing roots-based
project-root auto-detection stays in place (SEP-2577 deprecation
acknowledged upstream, still fully functional).
- Refreshed bundled addon pins to current upstream: Headroom 0.27.0 → 0.28.0,
Repomix 1.15.0 → 1.16.0.
Fixed
- Zero-config first-session frictions closed after a fresh-install E2E audit
(#658). A scripted fresh journey (isolated$HOME, real MCP handshake like
an editor) surfaced eight frictions; all are fixed with regression tests:
auto-findings now parse the pre-decoration tool output, so the injected
--- AUTO CONTEXT --- header can no longer become a junk Read --- finding
polluting session memory and every wakeup briefing (F1); setup/onboard
--help prints help instead of executing setup side effects (F2);
ctx_call with misspelled keys (tool/args/params) fails with the
exact fix instead of silently dispatching without arguments (F3);
ctx_knowledge remember derives a deterministic key slug when key is
omitted and accepts content= as a value alias, matching what our own
injected instructions document (F4); Rust call edges inside macro bodies
(println!, assert_eq!, …) are extracted at the token level, so a fresh
Rust project no longer reports 0 edges (F5); the project-overview header
surfaces persisted call-graph edges instead of contradicting ctx_callgraph
with 0 edges (F6); bare ctx_knowledge recall lists recent facts instead
of erroring (F7); and ctx_session show is accepted as a synonym of
status (F8).
- MCP PathJail auto-corrects a stale markerless root instead of rejecting the
workspace (#649). An MCP server launched by VS Code/WSL could adopt a
markerless client cwd (e.g. /mnt/c/Users/<user>) as its jail root; the first
absolute path into the real workspace on another mount was then rejected with
"path escapes project root", breaking ctx_compose/ctx_read/ctx_patch.
resolve_path now reroots, with no opt-in required, from such a markerless root to the
marker-bearing project derived from the requested path — the same rationale as
the agent-config-dir case (#580) — while a markerless target with no derivable
project stays blocked, so PathJail enforcement is unchanged.
- Local daemon IPC no longer 401s on tool calls (#651, #652). The daemon writes
an auto-generated auth token, but the IPC client (Unix domain socket / Windows
named pipe) sends no Authorization header, so /v1/tools/call failed with 401
while /health passed. Router construction is now split: TCP HTTP keeps Bearer
auth, while local IPC serving disables the HTTP Bearer — the socket/pipe is already
a user-local OS boundary (Unix 0o600, user-specific pipe name). TCP auth is
unchanged, a regression test guards the IPC path, and a security review found no
weakening of network auth.
- Codex stops reconstructing compressed shell output in chunks (#625, #654). The
SessionStart hint now states plainly that compressed output is not exact evidence
and hard-requires re-running lean-ctx raw "<exact command>" for exact content,
forbidding chunked reconstruction (cat/sed/head/tail) and quoting
compressed output as exact — so Codex uses the reversible raw escape instead of
re-reading the compressed view piecemeal.
- Enterprise/OS TLS roots are honored by every HTTP client (#643). All ureq
clients are now built through core::http_client, which injects
RootCerts::PlatformVerifier so requests trust the system/enterprise trust store
instead of only the bundled WebPKI roots, fixing UnknownIssuer failures behind
TLS-intercepting corporate proxies (updates, version check, embeddings download,
Qdrant, Datadog/FinOps export, LLM enhance, SSO/billing, web fetch, webhooks).
- Shell hook is quiet by default (#646). The activation notice (lean-ctx: ON …)
no longer prints on every new interactive terminal; mode-change notices now route
through a _lean_ctx_notice helper that speaks only when LEAN_CTX_DEBUG=1 (and
stdout is a TTY). lean-ctx-status still reports the current state on demand.
- doctor recognizes its own running dashboard on port 3333 (#644). The
dashboard port check reported a conflict whenever port 3333 was busy — even when
the occupant was lean-ctx's own dashboard. It now probes /api/version on bind
failure and reads the port as healthy only when the response is the dashboard's
own version JSON; unrelated services still surface the conflict. Implemented by
strengthening and reusing the dashboard's existing dashboard_responding probe,
so the browser-open guard and doctor share one source of truth.
- Native Read no longer breaks Claude Code's read-before-write guard (#637).
The PreToolUse redirect hook rewrote a native Read to a temp .lctx copy, so
Claude Code's Write/Edit read-before-write guard tracked the temp path and a
follow-up native Write/Edit to the real file failed with "File has not been read
yet", worst in headless claude -p, with no supported off-switch (the hook
self-healed back into settings.json). A new read_redirect = auto | on | off
key (env LEAN_CTX_READ_REDIRECT) now governs the Read redirect and is evaluated
per hook fire, so it also covers headless runs and never fights the self-heal. The
default auto disables only the Read path-swap on hosts carrying that guard
(Claude Code / CodeBuddy, detected inside the hook via the CLAUDE_PROJECT_DIR
marker Claude Code exports to every hook subprocess; CLAUDECODE/CODEBUDDY are
honored too), so native Read → Write/Edit works out of the box; the ctx_read MCP
tool and the Grep/Glob redirects keep compressing. on restores always-redirect;
off disables the Read redirect everywhere.
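The read_redirect decision from the last entry (auto | on | off, evaluated per hook fire) can be summarised as a small decision function. The sketch below is Python, not the hook's real code. The environment variable names come from the entry itself; the function name, the precedence of LEAN_CTX_READ_REDIRECT over the config key and the fallback of unknown values to auto are assumptions made for illustration.

```python
from typing import Mapping

VALID_MODES = ("auto", "on", "off")
# Host markers named in the changelog entry.
GUARDED_HOST_MARKERS = ("CLAUDE_PROJECT_DIR", "CLAUDECODE", "CODEBUDDY")


def should_redirect_read(config_value: str, env: Mapping[str, str]) -> bool:
    """Decide, for one hook invocation, whether a native Read is redirected.

    Assumptions of this sketch: the env var overrides the config key, and
    unrecognised values behave like "auto".
    """
    mode = env.get("LEAN_CTX_READ_REDIRECT", config_value).strip().lower()
    if mode not in VALID_MODES:
        mode = "auto"
    if mode == "on":
        return True
    if mode == "off":
        return False
    # auto: skip the path-swap on hosts that carry a read-before-write guard
    guarded = any(env.get(name) for name in GUARDED_HOST_MARKERS)
    return not guarded
```

Evaluating the function on every hook fire, instead of once at install time, is what lets the setting cover headless runs without fighting the settings.json self-heal.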
Upgrade
lean-ctx update # recommended (auto-downloads + refreshes shell hooks)
cargo install lean-ctx # or
npm update -g lean-ctx-bin # or
brew upgrade lean-ctx
Note: After upgrading via cargo/npm/brew, run lean-ctx setup to refresh shell
aliases. lean-ctx update does this automatically.
Full Changelog: v3.9.0...v3.9.0