Added
- Hermes context-engine plugin +
ctx_transcript_compactcore tool — lean-ctx
can now be Hermes Agent's active context engine, not just an MCP server it
might call. The newintegrations/hermes-lean-ctxplugin is a thin Python
ContextEnginethat replaces Hermes' built-inContextCompressor: it keeps the
system preamble + a fresh tail verbatim, replaces older turns with a recoverable
summary, and injects lean-ctx's recall tools (ctx_search,ctx_semantic_search,
ctx_read,ctx_expand,ctx_knowledge,ctx_summary) natively into the agent.
Compaction itself lives in a new daemon tool,ctx_transcript_compact(the 77th
MCP tool): deterministic, prompt-cache-friendly compaction of OpenAI-format
message arrays that never splits atool_call/tool_resultpair and offloads the
raw turns into session memory. The plugin prefers this core tool over/v1and
falls back to local Python compaction when the daemon is unreachable, so the agent
loop never breaks. Includes session-lifecycle persistence (resumeon start,
ctx_summary+ctx_handoffon end), model-window presets, a runnable
head-to-head benchmark harness (vs. import-guardedContextCompressor/hermes-lcm),
and a dedicated CI job (pytest + offline benchmark smoke).lean-ctx init --agent hermesnow also points to the engine plugin.
Fixed
- High idle CPU when no session is running (#453) — on v3.8.8 (macOS, Claude
Code & OpenCode) a connected-but-idle agent pegged a whole CPU core in the
lean-ctxprocess. Asampleof the live process showed theleanctx-index
thread burning ~100% while every other thread (tokio workers,memory-guard,
main) sat parked incond_wait/nanosleep— a CPU-bound worker, not a busy
timer loop (the screenshot's "2 idle wake-ups" at 97.5% CPU confirmed it). Root
cause:LeanCtxServer::new()ran an eager full index build (graph + BM25 +
line-search) on every server start whenever a project root was detected. A
warm cache still burned ~1 core for 6–9 s per start; multiplied across two
agents and stdio respawns it never settled. Fixed comprehensively:- No eager startup build (primary fix) — the startup scan is removed; the
server falls back to the demand-driven lazy warming it already documents
(#152). A session that sits idle or only usesctx_read/ctx_shell/
ctx_treenow pays zero indexing cost (measured: idle CPU stays at 0.0%);
graph/search tools still warm their index on first use. The eager call was an
unrelated regression slipped in via #294. - stdio transport no longer respawns on a single bad frame — the codec
mapped any decode error to the sameNoneas a true EOF, so one malformed
JSON-RPC message tore down the server (rmcpQuitReason::Closed), the agent
respawned it, and the fresh process paid another index build — a CPU churn
loop. Malformed frames are now skipped (the bad frame is already consumed) and
the stream resyncs onto the next message; only a real stream end closes the
transport. - No duplicate daemons — concurrent MCP servers launching at once could all
pass theis_daemon_running()check in a TOCTOU window and each spawn a
daemon.start_daemon()now serializes that critical section with an
exclusive, bounded-wait file lock. - Leaner proxy reload — the #449 upstream-reload loop's default interval is
relaxed from 2 s to 5 s;Config::load()'s internal content-hash cache
already skips re-parsing an unchangedconfig.toml, so each idle tick is just
a small file read. - memory-guard idle backoff — RSS sampling stretches from every 3 s to
every 15 s once memory has been stably calm, and snaps back instantly under
any pressure (OOM reaction time during real work is unchanged).
- No eager startup build (primary fix) — the startup scan is removed; the
- Quick settings that "keep resetting" are now diagnosable and stable (#450) —
a value saved in the dashboard could be silently shadowed so it appeared to
revert to defaults (lite/off), andlean-ctx config validateonly said
"no config" without telling you where it looked. There are four mechanisms
and none of them was visible: an env var (LEAN_CTX_*), a project-local
.lean-ctx.tomloverride (compression_level/terse_agent/tool_profile), a
divergent resolved config dir (dashboard writes path X, runtime reads path Y),
or an unparseableconfig.tomlfalling back to defaults. Fixed by making the
provenance explicit and the path stable:config validateshows the source — it now always prints the resolved
config.tomlpath (even when missing), the layout-pin state, any parse
error, and the active env / project-local overrides, with a one-line
explanation of why a value can appear to "reset".- Dashboard surfaces provenance —
/api/settingsreturnsconfig_path,
config_exists,parse_errorand a per-settinglocal_override; the Quick
Settings panel shows whichconfig.tomlis read and warns (and disables the
toggle) when an env var or a project-local.lean-ctx.tomlis winning. - Dashboard pins the layout —
lean-ctx dashboardnow runs the same
layout_pin::heal()as the daemon/server start paths, so it can no longer
writeconfig.tomlinto a divergent dir the runtime never reads.
- Dashboard no longer times out on load; heavy index/graph routes never block (#452) —
opening the dashboard mounted ~22<cockpit-*>components that each fired
loadData()fromconnectedCallback()at once — a thundering herd of
/api/graph,/api/call-graph,/api/symbols,/api/search-indexand
/api/treerequests that ran synchronous, file-count-scaling index/graph
builds and starved the trivial/api/settingshandler until the client
aborted after 8 s ("Settings timeout"). Fixed on two layers:- Frontend lazy-load (primary fix) — components no longer load in
connectedCallback(); the router's view-loader fetches only the active
view, so#context/settingsissues a single/api/settingsrequest instead
of triggering every panel's data load at once. - Backend single-flight + non-blocking (hardening) —
graph_indexand
bm25_indexgained aget_or_start_buildcoordinator (one background build
per root, concurrent callers deduplicated) modeled oncall_graph. Heavy
routes (/api/tree,/api/symbols,/api/call-graph,/api/search-index,
/api/search) now return202 {status:"building"}with progress instead of
blocking on a full scan; the affected panels poll and show an
"index building…" state until the build completes.
- Frontend lazy-load (primary fix) — components no longer load in
ctx_shellis clearly labelled and runs profile-free (#451) —- Pi renderer — the Pi extension rendered shell calls with a bare
$
prefix (inherited from Pi's bash renderer), makingctx_shelllook like a
native interactive bash shell. It now renders an explicitctx_shelllabel. - Profile-free shell —
ctx_shell(MCPexecute_command_with_env) and the
CLIlean-ctx -cpaths now neutralize inheritedBASH_ENV/ENVso a
non-interactivesh -c/bash -ccan no longer be hijacked into sourcing a
profile/rc file (e.g. anexec nusnippet silently replacing the shell).
Shell behavior is now deterministic and independent of user shell config. - Sharper description — the tool description (MCP and Pi) states it runs
the system shell ($SHELL) profile-free, so agents stop treating it as a
config-loaded interactive bash.
- Pi renderer — the Pi extension rendered shell calls with a bare
- Proxy upstream is now live from
config.toml— no more stale upstream on a long-lived proxy (#449) —
the proxy froze its provider upstreams inProxyStateat startup and never
re-read them, so a laterlean-ctx config set proxy.openai_upstream …(or any
config.tomledit) had no effect until a manual restart — and a shell
export LEAN_CTX_OPENAI_UPSTREAM=…could never reach an already-running,
service-managed proxy at all (the env simply does not propagate into a running
process). Now:- Live reload — a background task re-resolves the upstreams from
config.tomlevery ~5s (LEAN_CTX_PROXY_RELOAD_SECSto tune) and publishes
any change through atokio::sync::watchchannel that every provider handler
reads per request, soconfig settakes effect on the running proxy within
seconds, without a restart. An invalid value keeps the last good upstream
instead of silently dropping to the provider default. config.tomlis the source of truth for long-lived proxies; a
LEAN_CTX_*_UPSTREAMenv var remains a start-time override only (it cannot
reach a process that is already running). MCP hosts make this acute: Codex
(and others) launch the lean-ctx MCP server with a stripped, allowlisted
environment that omitsLEAN_CTX_*_UPSTREAM, so the proxy it spawns never
sees it —config.tomlis the only mechanism that reaches every proxy.- Root cause for service/MCP-managed proxies — directory pinning — a
launchd-spawned proxy inherits only launchd's minimal environment (noHOME,
no XDG vars) and so resolved a different config/data dir than the CLI: it
never read the user'sconfig.toml(live reload had nothing to read) and
derived a mismatched session token (its/status401'd). The proxy/daemon
LaunchAgent plists now bake in the exactHOME+LEAN_CTX_{CONFIG,DATA,STATE,CACHE}_DIR
the installing CLI resolves, so a managed process always agrees with the CLI. - Observability —
/statusandlean-ctx proxy statusnow report the
active upstreams;proxy statusderives liveness from the public/health
endpoint (so a running proxy is never misreported as down) and warns in two
cases: aLEAN_CTX_*_UPSTREAMset in the shell that never reached the proxy
(with the exactconfig setcommand to persist it), and a proxy started with
an env override now masking a laterconfig.tomledit.doctorcarries the
same drift check. lean-ctx proxy restart— new subcommand that cleanly restarts the
managed service (re-readsconfig.toml, drops any start-time env override).
- Live reload — a background task re-resolves the upstreams from
ctx_impactresolves C# extension-method hosts and disambiguates types by namespace (GH #398 follow-ups, #640–#643) —
the two deferred #398 follow-ups are now closed:- Extension methods (#642) — a call
value.WordCount()to a C# extension
method (static int WordCount(this string s)) names neither the defining
static class nor any of its types, so it produced no edge and left the host
a false-negative leaf. A newdeep_queries::ext_methodsextractor collects
this-parameter methods, andctx_impactlinks eachvalue.Foo()call to
the defining file (file + symbolTypeRefedge), self-filtered and capped. - Namespace-aware resolution (#641) —
TypeDefnow carries its C#
namespace (block-, file-scoped and nested), andtype_ref_targetsresolves
hybridly: a definer in the consumer's visible namespace (own namespace +
enclosing namespaces +usings) always links — even past the cap — and its
homonyms in other namespaces are dropped, so same-named types are no longer
conflated. With no namespace match the global fallback still links, with the
too-generic cap raised 3 → 5. Java (no namespaces) keeps the fallback path.
Both capabilities are wired into the embeddings and minimal builder paths;
all new regressions are gated ontree-sitterso they exercise both. Outputs
stay deterministic (sorted/deduped, bounded indexes; #498).
- Extension methods (#642) — a call
ctx_impactnow sees C# types used only in expression position (GH #398 follow-up) —
the v3.8.3 fix linked same-namespace C# consumers to definers for types in
declaration positions (fields, parameters, return types,base_list,
generics, casts,typeof), but a type referenced only in expression
position still produced noTypeRefedge, soctx_impactreported the
defining file as a false-negative leaf. Now covered in
deep_queries::type_uses: static calls/fields and enum values via a
member-access receiver (Engine.Create(),Engine.Default,Status.Active)
and attributes ([ApiController], which additionally resolves to the
…Attributeclass name). Only PascalCase receivers are collected and the
existing def-index resolution discards any name that is not a real project
type, so precision is unchanged. The new end-to-end regression is gated on
tree-sitterrather thanembeddings, so it also exercises the
index_graph_file_minimalbuilder path that the earlier #398 e2e tests never
reached. (Extension-method hosts and namespace-aware resolution were the
remaining follow-ups, now closed above.)lean-ctx update/config init --fullno longer reset or leak config values (#443) —
persisting a single setting could silently rewrite other customized keys in the
globalconfig.toml(e.g.compression_level→lite,max_ram_percent→ 5).
Three root causes, now closed by construction:- (A) default-seed clobber —
config init --fullhistorically wrote
Config::default(), andsave()overwrites every key present in both the
incoming document and the file (config_io::merge_table), resetting customized
values. (Already mitigated viaconfig_for_full_init; now superseded.) - (B) project-local leak —
Config::load()folds project-local
.lean-ctx.tomloverrides into the in-memory struct, so the common
load() → mutate → save()pattern (18 call sites across 10 files) wrote those
per-project values back into the global file. - (C) corrupt-file clobber —
write_toml_preserving_minimalwrote a fresh
document when the existing file failed to parse, discarding a hand-broken config.
The fix introduces a leak-free persistence API —Config::load_global()(reads
the global file only, never merging project-local overrides) and
Config::update_global()(read global-only → mutate → minimal save, and refuses
to touch an unparseable file) — and migrates every persist site to it. The runtime
read path (Config::load(), with project-local merge) is unchanged. In addition,
write_toml_preserving_minimalnow refuses to overwrite an unparseable config
instead of clobbering it, andconfig init --fullemits a fully annotated
reference document seeded with the user's current values (lossless round-trip,
independent of schema completeness).
- (A) default-seed clobber —
- XDG layout no longer flips back to
~/.lean-ctx(GL #623) — once an install
resolved to the XDG four-dir layout, a single stray marker appearing in
~/.lean-ctx(a legacy residue, a restored backup, a concurrent older binary,
even an emptysessions/) silently re-collapsed config/data/state/cache onto
that one directory viasingle_dir_override, after whichconfig.tomlwas no
longer found and the dashboard graph disappeared (data had moved to
$XDG_DATA_HOME/lean-ctx/graphs). A new layout pin
($XDG_CONFIG_HOME/lean-ctx/layout.toml,mode = "xdg") records the
commitment: the resolver reads it before the legacy/mixed heuristic and never
re-adopts~/.lean-ctxfor a pinned install. The pin is written (and a
residual~/.lean-ctxauto-drained) by every independent long-running writer
and repair path —setup, the MCP server start, the daemon
(init_foreground_daemon, incl. the launchd/systemd autostart), and
doctor --fix(after it migrates + reclaims). Marker detection was hardened so an empty
sessions//graphs/directory (or a zero-bytestats.json) no longer counts
as data, and the Docker self-heal shell hook no longer touches~/.lean-ctx
(heal timestamp →$XDG_STATE_HOME, lock count →$XDG_DATA_HOME).doctor
now reports the active layout mode (xdg-pinned/single-dir / legacy). - Re-reads stop blowing up to full content (cache hit-rate regression) — with
modeomitted (the recommended usage), a file first read in a compressed mode
(map/signatures) was resolved tofullon its second read by the
cache_hitshortcut, even though full content had never been delivered
(full_content_delivered=false). The 2nd read therefore re-delivered the
entire file — more tokens than the first read — a compression bounce that
also meant stub hits only began at the 3rd read, which agents rarely reach.
Measured lifetime cache hit-rate had collapsed to ~5% (down from ~90%). The
resolver now only short-circuits tofullonce full content was actually
delivered; otherwise it falls through to the predictor, which reproduces the
cached compressed mode and serves it from the compressed-output cache as a
cheap, consistent hit. Explicitmode="full"reads (for editing) are
unchanged. - Cache-aware pruning no longer churns the cached prompt prefix (#448) — on
cache-metered rails (Anthropic), the defaultcache-awarehistory pruner
rewrote already-cached history every time the prune boundary advanced a
STRIDE(~every 16 messages), invalidating the provider prompt-cache prefix
from the first changed message and re-billing cheap reads (0.1x) as writes
(1.25x). Pruning now skips the client'scache_control-marked prefix and only
ever rewrites not-yet-cached content, so a growing conversation keeps hitting
the cache. Per-message tool-result compression is unchanged (it is
content-deterministic and prefix-stable), and requests withoutcache_control
(e.g. OpenAI) are byte-for-byte unaffected. ctx_retrieve/ctx_shareno longer serve stale cached content — both
paths returned the cached full content for a file (get_full_content) with
no staleness check, so an agent that retrieved a file — or received one via
a cross-agentctx_sharehandover — could be handed a version that no longer
matched disk if the file had been edited since it was first read. This is the
classic handover failure: agent A edits a handover file, agent B reads the
pre-edit cached copy and "does not see the changes".ctx_readwas already
safe (it revalidates by mtime and content hash and re-reads on any
mismatch); the two retrieve/share accessors bypassed that guard. Both now go
through a new staleness-safe accessor (SessionCache::current_full_content)
that validates the cached entry against disk (mtime + hash) and transparently
re-reads the current bytes when the cache is behind the file, so a retrieve or
handover always reflects the latest content.
Upgrade
lean-ctx update # recommended (auto-downloads + refreshes shell hooks)
cargo install lean-ctx # or
npm update -g lean-ctx-bin # or
brew upgrade lean-ctxNote: After upgrading via cargo/npm/brew, run
lean-ctx setupto refresh shell aliases.lean-ctx updatedoes this automatically.
Full Changelog: v3.8.9...v3.8.9