github yvgude/lean-ctx v3.7.4

3 hours ago

The Superintelligence Context release. All six cross-disciplinary North-Star bets are now
wired into live code: active-context prefetch that learns which providers actually help,
task-conditioned compression (an Information-Bottleneck proxy), self-managing memory that
consolidates itself in the background, a context immune system (signed audit + prompt-injection
detection), stigmergic swarm credit (per-agent heatmap + Shapley attribution), and a
physically-grounded energy and carbon ledger. Alongside the science: a heavy performance
pass — int8-quantized embeddings (turbovec), SIMD dense search, a shared file-content cache that
kills the search double-read, lazy demand-driven startup, and lossless JSON/JSONL compaction —
plus IDE permission inheritance, opt-out instruction-file injection, three new --json CLI
commands, and a batch of proxy/runtime/dashboard fixes. Everything new is free OSS; nothing is
feature-gated.

Added

  • Active-context prefetch that learns — persistent provider bandit (North-Star bet 01): ctx_preload used to instantiate its ProviderBandit fresh on every call, so it never learned which data sources were actually useful for a given kind of task. The bandit (Thompson sampling over a Beta posterior) is now persisted per project (provider_bandit.json) and closes the Active-Inference loop: task-type → prediction → execution → observation → bandit update → better future predictions. A preload that returns useful chunks is a positive signal; an empty/failed one is negative. Over time lean-ctx prefetches the providers that have repeatedly paid off for this project and stops wasting calls on the ones that don't.

  • Task-conditioned compression — an Information-Bottleneck proxy in entropy mode (North-Star bet 02): the entropy read-mode compressed purely by Shannon self-information, so a rare-but-irrelevant line was kept while a common-but-task-critical line could be dropped. When an active session intent exists, entropy_compress_task_conditioned now rescues low-entropy lines that mention task keywords — keeping what is either surprising (high H) or task-relevant (mentions the goal's concepts), and compressing away only what is both uninformative and off-task. Falls back to pure adaptive entropy when no intent is active, so non-task reads are byte-identical.
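The keep rule above (surprising OR task-relevant) can be illustrated with a small sketch. The tokenizer, the per-line score (mean self-information in bits per token), and the min_bits threshold are assumptions for illustration; the actual entropy_compress_task_conditioned may score lines differently.

```python
import math
import re
from collections import Counter


def line_information(lines):
    """Mean Shannon self-information (bits per token) of each line."""
    tokenized = [re.findall(r"\w+", line.lower()) for line in lines]
    counts = Counter(tok for toks in tokenized for tok in toks)
    total = sum(counts.values())
    scores = []
    for toks in tokenized:
        if not toks:
            scores.append(0.0)
            continue
        bits = [-math.log2(counts[t] / total) for t in toks]
        scores.append(sum(bits) / len(bits))
    return scores


def compress_task_conditioned(lines, task_keywords, min_bits):
    """Keep a line if it is surprising OR mentions a task keyword."""
    keywords = {k.lower() for k in task_keywords}
    kept = []
    for line, bits in zip(lines, line_information(lines)):
        on_task = bool(keywords & set(re.findall(r"\w+", line.lower())))
        if bits >= min_bits or on_task:
            kept.append(line)
    return kept
```

With an empty keyword list the second condition never fires, so the function degrades to the pure entropy filter, which mirrors the fallback behaviour the notes describe.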

  • Context immune system — signed audit trail + prompt-injection detection (North-Star bet 04): two provenance/safety steps. (1) Audit entries are now Ed25519-signed (a signature over the chained entry_hash, keyed by the local lean-ctx identity), so a record carries cryptographic proof of which installation produced it — not just a hash chain a local writer could rebuild. (2) A conservative detect_injection heuristic scans tool output for known prompt-injection patterns (role-override like "ignore all previous instructions", role-hijack, ChatML/[INST] token smuggling, role-boundary markers). On a hit it logs a warning and emits a SecurityViolation audit event. It targets high-specificity phrases that almost never appear in legitimate source/docs, so false positives are rare (verified against real code and comments in tests).
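A high-specificity pattern scan of the kind described in step (2) might look like the sketch below. The pattern list is hypothetical and far shorter than a real detector would need; the notes only name the categories, not the expressions. The signing step (1) is not shown because Ed25519 is not in the Python standard library.

```python
import re

# Hypothetical, illustrative patterns; the real detector's list is not given.
INJECTION_PATTERNS = [
    ("role_override", re.compile(r"ignore\s+all\s+previous\s+instructions", re.I)),
    ("chatml_token", re.compile(r"<\|im_(start|end)\|>")),
    ("inst_token", re.compile(r"\[/?INST\]")),
]


def detect_injection(text):
    """Return the names of all patterns found in the text (empty list if clean)."""
    return [name for name, pattern in INJECTION_PATTERNS if pattern.search(text)]
```

Matching whole phrases and exact control tokens, rather than single words such as "ignore", is what keeps ordinary code and comments from triggering the heuristic.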

  • Stigmergic swarm substrate — per-agent heatmap traces + Shapley context credit (North-Star bet 05): the access heatmap was agent-agnostic — every read pooled anonymously. HeatEntry now carries a per-agent access map (a stigmergic pheromone field), populated in the live read path via a canonical current_agent_id() resolver (LEAN_CTX_AGENT_ID / LCTX_AGENT_ID / local, shared with the savings ledger). A new context_credit() computes Shapley-inspired attribution: when several agents touch the same file, each contributor earns credit proportional to how many other agents also benefited — the raw signal for routing one agent toward what another already found useful, and for crediting the context that actually helped the swarm.
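One simple reading of the credit rule above is that, for every file, each agent that touched it earns one unit per other agent that also touched it. The sketch below implements only that reading; the input shape and the exact weighting are assumptions, and the real context_credit() may differ.

```python
from collections import defaultdict


def context_credit(heatmap):
    """heatmap: {file_path: {agent_id: access_count}} -> {agent_id: credit}."""
    credit = defaultdict(float)
    for agents in heatmap.values():
        contributors = [a for a, n in agents.items() if n > 0]
        others = len(contributors) - 1
        for agent in contributors:
            # A file touched by a single agent earns no shared credit.
            credit[agent] += others
    return dict(credit)
```

Files that only one agent ever read contribute nothing, so the score highlights context that was useful across the swarm rather than raw read volume.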

  • rules_injection config — opt out of touching shared instruction files (#343): a new top-level option (shared default | dedicated, env LEAN_CTX_RULES_INJECTION) controls how lean-ctx delivers its tool-mapping rules to the shared-instruction-file agents (Claude Code, Codex, OpenCode, Gemini CLI). The default shared keeps today's behavior — a marker-delimited block written into CLAUDE.md / AGENTS.md / GEMINI.md for zero-config discoverability. The new dedicated mode never edits those user-authored files; instead it uses each agent's own config-driven, fully-removable auto-load path and a lean-ctx-owned rules file:

    • Claude Code & Codex — the rules summary is injected at session start via the existing SessionStart hook's additionalContext (model-visible, nothing persisted to CLAUDE.md/AGENTS.md; any prior lean-ctx block is stripped on switch).
    • OpenCode — the dedicated ~/.config/opencode/rules/lean-ctx.md is registered (by absolute path, idempotently) in opencode.json instructions[], and the old AGENTS.md block is removed.
    • Gemini CLI — the dedicated ~/.gemini/LEANCTX.md is registered in settings.json context.fileName (seeding the default GEMINI.md so the user's own context file keeps loading), and the old GEMINI.md block is removed.
    Switching back to shared, or running lean-ctx uninstall, cleanly reverses every registration (instructions[] / context.fileName collapse back to their pristine default) and deletes the dedicated files, leaving no orphaned entries. Toggling is driven entirely by the flag: the same rules sync writes a block in shared mode, and in dedicated mode leaves the user file untouched and writes a separate rules file.

  • permission_inheritance config — lean-ctx tools honor your IDE's permission rules (community request): when lean-ctx is mounted as an MCP server, its tools (notably ctx_shell) execute inside the lean-ctx process and bypass the host IDE's own permission engine. An OpenCode user who set bash/rm * to ask/deny would therefore have that guard silently skipped whenever the agent reached for ctx_shell instead of the native tool. A new top-level option (off default | on, env LEAN_CTX_PERMISSION_INHERITANCE) makes lean-ctx mirror the active IDE's permission config onto its own tools. When on, before dispatch lean-ctx reads the IDE permission rules (v1: OpenCode opencode.json / opencode.jsonc, global + project merged) and applies the equivalent decision to the matching tool: ctx_shell/ctx_execute → bash (incl. granular git * / rm *, and top-level command patterns), ctx_read/ctx_multi_read/ctx_smart_read → read, ctx_edit → edit, ctx_search → grep. deny blocks the call, ask holds it back with an actionable message (MCP can't show an interactive prompt for these tools), and allow (or no matching rule) proceeds. The most specific rule wins (longest pattern; named tool beats global *), with ties broken toward the more restrictive action. lean-ctx never writes the IDE's permission block: inheritance is read-only and runtime-only, the policy is cached briefly, and the default (off) adds zero hot-path cost. lean-ctx doctor reports the status and, when on, how many OpenCode rules are being mirrored.
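The precedence rule for inherited permissions can be sketched as a small resolver. The rule tuple shape, the glob matching via fnmatch, and the exact ordering of the tie-breakers (named tool first, then pattern length, then restrictiveness) are assumptions; the notes do not spell out the order, and `tool` here is the IDE permission key (for example bash) after the ctx_* mapping.

```python
from fnmatch import fnmatchcase

RESTRICTIVENESS = {"allow": 0, "ask": 1, "deny": 2}


def resolve(rules, tool, command):
    """rules: iterable of (tool, pattern, action). Returns the winning action."""
    matches = [
        (rule_tool != "*", len(pattern), RESTRICTIVENESS[action], action)
        for rule_tool, pattern, action in rules
        if rule_tool in (tool, "*") and fnmatchcase(command, pattern)
    ]
    if not matches:
        return "allow"  # no matching rule: the call proceeds
    # Tuples compare left to right: named tool, longest pattern, most restrictive.
    return max(matches)[3]
```

For example, with rules `("bash", "*", "ask")` and `("bash", "rm *", "deny")`, the command `rm -rf build` resolves to deny because the longer pattern is more specific, while `ls` resolves to ask.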

  • Self-managing memory — the cognition loop now actually runs, and feedback steers retention (North-Star bet 03): the eight-step background cognition loop (seed-promote → structural repair → lateral synthesis → contradiction resolution → hebbian strengthen → decay → compact) existed and was enabled by default (autonomy.cognition_loop_enabled, cognition_loop_interval_secs = 3600) but nothing ever triggered it outside an explicit ctx_knowledge action=cognition_loop call — so knowledge never self-organized on its own. A new core::cognition_scheduler fires it opportunistically from the MCP dispatch path: time-gated to the configured interval, single-flight (an in-flight loop is never double-spawned), panic-safe (RAII guard frees the slot), and cheap on the hot path (one config read + two atomic loads when not due). Because the server is request-driven this beats a wall-clock thread — maintenance happens exactly when there is activity to consolidate and never when the agent is idle. Additionally, the confidence-decay schedule now closes the reward loop: a fact's explicit thumbs-up/down (feedback_up/feedback_down) scales its decay — net-positive feedback keeps it longer, net-negative forgets it faster (logarithmic, capped, and floored so a single downvote never collapses a healthy fact and nothing is ever hard-deleted).
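The scheduling properties listed above (time-gated, single-flight, slot freed even on failure) can be illustrated with the sketch below. It runs the loop synchronously for simplicity, whereas the notes describe a background spawn; the class and method names are hypothetical.

```python
import threading
import time


class CognitionScheduler:
    def __init__(self, interval_secs, clock=time.monotonic):
        self.interval_secs = interval_secs
        self.clock = clock
        self.last_run = None
        self.in_flight = threading.Lock()

    def maybe_run(self, loop):
        """Run `loop` if it is due and not already running; return True if it ran."""
        if not self.in_flight.acquire(blocking=False):
            return False  # single-flight: another caller holds the slot
        try:
            now = self.clock()
            if self.last_run is not None and now - self.last_run < self.interval_secs:
                return False  # time gate: not due yet
            self.last_run = now
            loop()
            return True
        finally:
            # Released even if `loop` raises, much like an RAII guard would.
            self.in_flight.release()
```

Calling maybe_run() from every request costs one non-blocking lock attempt and one clock read when nothing is due, which is the same idea as the cheap hot-path check described in the notes.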

  • Thermodynamic accounting — energy and carbon avoided, surfaced in ctx_gain (North-Star bet 06): lean-ctx already estimated grid energy avoided (0.4 J/saved-token, reconciled with the website /metrics methodology) but only for display strings. The footprint is now a first-class, physically-grounded figure: core::energy adds a transparent carbon model (G_CO2_PER_KWH = 475 g/kWh — the global-average grid intensity, override-able per machine via LEAN_CTX_GRID_CO2_G_PER_KWH so cleaner grids report honestly), and GainSummary carries energy_wh + co2_grams derived from tokens_saved. ctx_gain now shows an Impact: line (… grid energy avoided | … CO₂e) and emits both fields in its JSON, so the savings ledger's environmental dividend is auditable, not just cosmetic. All figures are surfaced as estimates; nothing is persisted into the hash-chained ledger (energy is a pure function of the already-recorded saved tokens, so the tamper-evident chain is untouched).
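The arithmetic behind the two figures is a pair of unit conversions. The sketch below uses the constants quoted above (0.4 J per saved token, 475 g/kWh) and the documented override variable; the function name and return shape are assumptions.

```python
import os

JOULES_PER_SAVED_TOKEN = 0.4   # from the release notes
G_CO2_PER_KWH = 475.0          # global-average grid intensity, from the release notes
JOULES_PER_WH = 3600.0         # 1 Wh = 3600 J


def footprint(tokens_saved, env=os.environ):
    """Return (energy_wh, co2_grams) avoided for a number of saved tokens."""
    intensity = float(env.get("LEAN_CTX_GRID_CO2_G_PER_KWH", G_CO2_PER_KWH))
    energy_wh = tokens_saved * JOULES_PER_SAVED_TOKEN / JOULES_PER_WH
    co2_grams = energy_wh / 1000.0 * intensity  # Wh -> kWh, then g/kWh
    return energy_wh, co2_grams
```

Under these constants, one million saved tokens correspond to about 111.1 Wh and about 52.8 g CO₂e. Because both values are pure functions of tokens_saved, they can be recomputed at display time without storing anything new.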

  • Three new --json CLI commands for editor/programmatic use: lean-ctx semantic-search (fixes the editor search path), lean-ctx repomap, and lean-ctx knowledge recall all gain structured --json output so editor integrations and scripts consume results without scraping human-formatted text.

  • gain auto-publishes public metrics in the background: when gain.auto_publish is enabled, the MCP server now performs a throttled background publish of the (opt-in) public savings metrics on startup, so the leaderboard/hero stats stay current without a manual lean-ctx gain --publish. Throttled so it never publishes more than once per interval and never blocks startup.

  • dashboard --base-path for reverse-proxy subpath mounting (#355): the web dashboard can be served under a subpath (e.g. https://host/leanctx/) behind a reverse proxy; all asset and API URLs are rewritten to honor the base path.

Performance

  • Shared file-content cache removes the search double-read (#148): building the trigram search index and then answering a ctx_search query used to read the entire candidate corpus from disk twice — once to index, once to scan — and the BM25 index read it yet again. A new resident, bounded core::content_cache (LRU, invalidated by (mtime, size)) now lets the index build, ctx_search, and BM25 share a single in-memory copy per file: read once, reuse many times. Entries self-invalidate the instant a file changes on disk, the cache refuses inserts under memory pressure, and it is dropped first by the eviction orchestrator (UnloadIndices / EmergencyDrop) so it never competes with the heavier indices for headroom.
  • Lazy, demand-driven index warming on server startup (#152): the MCP server no longer kicks off a full repo graph + BM25 scan (and extra-root scans) eagerly in initialize. A session that only ever calls ctx_read / ctx_shell / ctx_tree now pays zero startup indexing cost. Each tool is classified by what it actually needs (None / Search / Heavy); the first call to a search-backed or graph-backed tool triggers a one-shot, once-per-root background warm (extra roots warmed on that same first heavy pre-warm), so the prebuilt index is ready exactly when — and only if — something uses it.
  • int8-quantized embeddings + SIMD-friendly scoring (turbovec-inspired): dense embedding vectors are stored int8-quantized, cutting the resident index memory roughly 4× and making similarity scoring SIMD-friendly. Recall is preserved within tolerance; the smaller footprint also reduces eviction pressure on the shared caches.
  • SIMD cosine + threshold-gated HNSW cache for dense search: dense/semantic search uses a SIMD cosine kernel and only builds/keeps the HNSW graph when the corpus is large enough to pay for it (threshold-gated), so small projects stay lightweight while large ones get sublinear search.
  • Read-mostly session cache + off-hot-path telemetry (#147, #149): the per-request session state is served from a read-mostly cache and telemetry/event work is moved off the hot path, removing redundant locking and disk churn from the common ctx_read flow.
  • Lossless JSON/JSONL compaction: large JSON/JSONL tool output is compacted losslessly (insignificant whitespace removed, structure preserved) before counting, so structured payloads cost fewer tokens without changing a single value.
  • Bounded cold BM25 build in the ctx_semantic_search MCP handler (#150): a first semantic search on a cold index now builds the BM25 index under a bounded budget instead of an unbounded scan, so the initial query returns promptly on large repos.
  • Proxy parses each request body once: the compressing proxy parses the request body a single time and reuses the parsed form across compression + introspection, and additionally protects multi-file read tool results from lossy command-output compression.

Changed

  • server::call_tool_guarded post-processing split into composable stages (#144): the ~1000-line guarded dispatch path is now a thin orchestrator. The self-contained, synchronous pipeline stages (budget exhaustion/warning gates, Context-IR source-kind mapping, terse-compression gating, final token-count + savings correction) live in a unit-tested server::post_process module, and the large &self-coupled side-effect blocks (tool-receipt + intent + session-save + cost attribution; shared Context-OS persist + bus events) move into named server::post_dispatch methods. Behaviour, ordering, and await points are identical — purely a maintainability/readability change with new direct unit tests for the extracted stages.
  • Tool registry is the single schema source (#141): the granular per-tool schema definitions are generated from one registry instead of being maintained in parallel, retiring a recurring source of drift between the advertised tool surface and the actual handlers (guarded by an up-to-date regression test).
  • Unified path resolution across the core (#145): project/path resolution is consolidated into one code path with a project-marker test, removing subtle inconsistencies between callers that resolved roots differently.
  • Tool descriptions steer agents to the ctx_* tools (#168): MCP tool descriptions now nudge agents toward the lean-ctx tools over native equivalents, with a regression test that fails the build if the steering language regresses.

Fixed

  • MCP advertises the full profile surface to dynamic-tools clients (#358): clients that consume the dynamic tool categories now see the complete profile-authoritative tools/list, and the always-on ctx_call gateway is exposed so no tool is unreachable for those clients (#204).
  • Proxy accepts bare provider endpoints for the OpenCode Responses API (#353): a provider base URL without the full path suffix is normalized correctly, so OpenCode's Responses-API requests are routed and compressed instead of failing.
  • macOS install/update no longer touches ~/Documents (#356): installation and update paths stop writing into ~/Documents, avoiding spurious permission prompts and stray files on macOS.
  • Dashboard info-tip tooltips never clip (#357): info-tip tooltips in the web dashboard are portaled to <body>, so they render above surrounding cards instead of being clipped by overflow.
  • Runtime robustness: bounded WAL, dead-owner lock reclaim, fact eviction & doctor thresholds (#357-adjacent runtime hardening): the write-ahead log is now bounded, locks held by dead owners are reclaimed instead of stalling, knowledge-fact eviction is corrected, and lean-ctx doctor thresholds are tuned so its health checks reflect real conditions.
  • Pi: explicit LEAN_CTX_PI_ENABLE_MCP=1 now always starts the embedded MCP bridge (#361): a lean-ctx entry in ~/.pi/agent/mcp.json (written by lean-ctx init --agent pi) no longer silently disables the embedded bridge. Pi has no native MCP support, so that entry alone never served the tools — meaning an explicit opt-in could leave both the bridge and the adapter inactive, and the session cache (with its ~13-token re-reads) never engaged. The explicit flag now wins; /lean-ctx only notes the possible-duplicate case when pi-mcp-adapter is genuinely also running.
  • Deterministic HNSW index construction: the approximate-nearest-neighbor index now seeds each node's level from its insertion index (splitmix64) instead of OS entropy, so the same corpus always builds the same graph and returns the same results. This removes run-to-run recall variance (and the flaky recall test it caused) and makes semantic-search results reproducible.
  • Dashboard graph/code-map shows a clear language message instead of an endless loading/“run index build” state (#360): for projects built from languages the code-map does not index (e.g. Lua/Luau), the Dependencies, Symbols, and Roads views now explain that the graph only supports specific languages and that BM25 search/compression still work — instead of suggesting an index rebuild that can never populate the graph. /api/graph reports the graph-supported languages plus any unsupported source languages detected in the project.
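The deterministic HNSW construction fix earlier in this list replaces a random draw with a hash of the insertion index. A sketch of that idea, using the standard splitmix64 mixer and the usual HNSW level formula floor(-ln(u) / ln(M)): the value M = 16 and the way the hash is mapped to a float are assumptions, not details from the notes.

```python
import math

MASK64 = (1 << 64) - 1


def splitmix64(x):
    """One step of the splitmix64 mixer, using 64-bit wrapping arithmetic."""
    z = (x + 0x9E3779B97F4A7C15) & MASK64
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return z ^ (z >> 31)


def node_level(insertion_index, m=16):
    """Deterministic HNSW level for the node inserted at `insertion_index`."""
    # Map the top 53 bits to a float in (0, 1] so the logarithm is defined.
    u = ((splitmix64(insertion_index) >> 11) + 1) / (1 << 53)
    return int(-math.log(u) / math.log(m))
```

Since the level depends only on the insertion index, rebuilding the index from the same corpus in the same order yields the same layer assignment every time.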

Upgrade

lean-ctx update                 # recommended (auto-downloads + refreshes shell hooks)
cargo install lean-ctx          # or
npm update -g lean-ctx-bin      # or
brew upgrade lean-ctx

Note: After upgrading via cargo/npm/brew, run lean-ctx setup to refresh shell aliases. lean-ctx update does this automatically.

Full Changelog: v3.7.4...v3.7.4
