github yvgude/lean-ctx v3.8.0

latest release: v3.8.1

The Governance & Proof release. Agents become accountable identities,
context gets enforceable policy, and savings become auditable evidence:
Ed25519-bound agent registry, deterministic evidence bundles with an
offline verifier, EU AI Act / ISO 42001 / SOC 2 coverage reports, context
policy packs, org SSO (OIDC) + org audit log, and a FinOps surface that
exports the signed ledger to Datadog, CloudZero, Vantage and FOCUS.
The Context OS opens up — WASM extensions, personas, plugin tools,
Python/TS/Rust SDKs with a lockstep conformance matrix — while the
dashboard reorganizes around the four jobs (decides · remembers · guards ·
proves). Underneath: a P0 security hardening series, attribute-safe
dashboard escaping, MCP failures that finally set isError (#389),
a cache-aware proxy that stops defeating provider prompt caching (#534),
and a long tail of field-reported crash and correctness fixes.

Added

  • First-class agent identities (GL #433, H3 Epic D):
    core/agent_registry.rs + lean-ctx agent register/list/show/heartbeat/suspend/resume/decommission/offboard-owner/check.
    Agents become registered identities with a mandatory human owner
    (accountability principle), Ed25519 key binding, lifecycle states
    (decommission is final and audit-closed), best-effort attestation
    (binary + role-config hash, drift surfaces on heartbeat with exit 3)
    and SPIFFE-compatible workload ids
    (spiffe://<domain>/agent/<role>/<id>). Owner offboarding suspends all
    of an owner's active agents in one locked transaction (SCIM hook for
    ENT-2); every transition writes tamper-evident audit entries via four
    new additive OCP Part 4 event types. Registry is cross-process safe
    (advisory file lock). Docs: docs/enterprise/agent-identity.md with an
    honest attestation threat model.
  • Evidence Bundle v1 + standalone offline verifier (GL #425, H3
    Epic A): lean-ctx audit evidence --from --to [--framework] exports a
    deterministic ZIP (evidence-bundle-v1 contract) — audit-chain segment,
    resolved policy pack, CGB + framework coverage reports, Ed25519-signed
    manifest; identical inputs produce byte-identical bundles. New
    independent verifier packages/leanctx-verify (no engine code, no
    network, 4 deps) replays the hash chain and validates signatures in
    five auditor-readable PASS/FAIL steps; mutation tests prove 1-byte
    flips, truncation and wrong keys are detected. Auditor guide:
    docs/enterprise/reading-evidence.md.
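The hash-chain replay in the Evidence Bundle entry above can be illustrated with a generic chain. This is a minimal sketch, not the evidence-bundle-v1 format: the entry layout, the all-zero genesis value and the function names are assumptions, and SHA-256 is chosen only for illustration. The trusted head hash stands in for a value that would come from the signed manifest.

```python
import hashlib

GENESIS = "0" * 64  # assumed starting value; the real contract may differ


def entry_hash(prev_hash: str, payload: bytes) -> str:
    """Hash of one entry: SHA-256 over the previous hash plus the payload."""
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def build_chain(payloads: list[bytes]) -> list[dict]:
    """Link payloads so each entry commits to everything before it."""
    chain, prev = [], GENESIS
    for payload in payloads:
        prev = entry_hash(prev, payload)
        chain.append({"payload": payload, "hash": prev})
    return chain


def verify_chain(chain: list[dict], expected_head: str) -> bool:
    """Replay every link, then compare the final hash to the trusted head."""
    prev = GENESIS
    for entry in chain:
        if entry_hash(prev, entry["payload"]) != entry["hash"]:
            return False  # a flipped byte breaks the link at this entry
        prev = entry["hash"]
    # Truncation leaves every link valid, so the head comparison catches it.
    return prev == expected_head
```

Replaying the links catches a modified entry, while the final comparison against a separately trusted head is what catches a chain that was cut short.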
  • Framework compliance reports — EU AI Act, ISO 42001, SOC 2
    (GL #424, H3 Epic A): machine-readable mapping matrices under
    compliance/mappings/*.toml (framework-edition pinned, semi-annual
    review cycle, explicit residual gaps) and three new builtin policy packs
    implementing the enforceable slice of each framework
    (eu-ai-act-deployer, iso42001-aligned, soc2-context). New
    lean-ctx policy coverage --framework <id> [pack] renders the
    audit-conversation artifact: every control as
    ENFORCED (live-verified against the resolved pack) / ENGINE (CI-proven
    guarantee) / GAP (documented organisational duty) — for the EU AI Act
    reference setup that is 11 of 14 controls technically enforced. Honesty
    is mechanized: every full claim must name a CI test
    (tests/compliance_frameworks.rs proves enforcement AND that violations
    are detectable — tampered logs fail verification, weak packs downgrade
    to NOT-ENFORCED), and a drift test fails the build when claims and tests
    diverge. Not legal advice; aligned ≠ certified.
  • Business plan — $149/mo flat, self-serve governance (GL #533,
    contract billing-plane-v3): new tier between Team and Enterprise with
    50 flat seats, 20 GB hosted index, 10 managed connectors, private
    registry, org SSO via OIDC (new sso_oidc entitlement key, additive
    on every plan) and 365-day audit retention. Self-serve via
    lean-ctx cloud upgrade --plan business; existing subscribers are
    switched in place (prorated) instead of double-billed. SAML/SCIM
    (sso_scim) stays Enterprise. billing-plane-v1 remains frozen — v3 is
    a purely additive catalog delta.
  • Datadog/Prometheus FinOps export — metrics contract + scrape token
    (GL #401): /metrics now exposes verified ledger savings
    (lean_ctx_ledger_tokens_saved_total, lean_ctx_cost_saved_usd_total,
    30 s cache over the hash-chained ledger) and a lean_ctx_info series
    carrying project/profile/agent_role/model/version tags
    (kube-state-metrics _info idiom — one series per process, no
    cardinality explosion). New LEAN_CTX_SCRAPE_TOKEN env: a read-only
    Bearer token valid only for GET /metrics, so monitoring agents
    never hold the dashboard credential. The exposition surface is frozen in
    docs/reference/metrics-contract.json, enforced by
    rust/tests/metrics_contract.rs (update via
    LEANCTX_UPDATE_METRICS_CONTRACT=1). Ready-to-import Datadog assets:
    integrations/datadog/ (OpenMetrics conf.yaml, Token-Economy
    dashboard, savings-drop + SLO-violation monitors), guide:
    docs/integrations/datadog.md.
  • lean-ctx finops export — CloudZero, Vantage & FOCUS cost export
    (GL #402): turns the hash-chained savings ledger into daily showback rows
    (day × project × agent × model × tool) with the model price pinned per
    event — no pricing table to maintain, reproducible forever. Targets:
    --target=focus (FOCUS 1.2 CSV, all 21 Mandatory columns + 1.0 compat
    set, passes the official FinOps Foundation focus-validator),
    --target=cbf (CloudZero AnyCost; --upload posts per-month Stream
    drops with replace_drop = idempotent re-runs), --target=vantage
    (custom-provider CSV; --upload posts multipart, additive semantics
    documented). Savings are emitted as Credit/Discount rows with
    negative cost — Usage spend stays clean for budgets. Guide:
    docs/integrations/finops.md.
  • Agentless Datadog push (GL #401): opt-in direct submit to the Datadog
    Metrics API v2 — LEAN_CTX_DATADOG_PUSH=1 and DD_API_KEY required
    (a stray API key alone never enables egress), DD_SITE +
    LEAN_CTX_DATADOG_INTERVAL_SECS optional. Counters go out as
    per-interval deltas (baseline cycle first — lifetime totals never spike
    a graph), gauges every cycle, all series tagged
    project/profile/agent_role/model/version. Runs as a background loop in
    lean-ctx dashboard.
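The baseline-then-delta behaviour described for counters above can be sketched as follows. The class name is hypothetical, and treating a shrinking total as a fresh baseline (for example after a restart) is an assumption that the notes do not state.

```python
from typing import Optional


class DeltaCounter:
    """Turns a lifetime total into per-interval deltas (hypothetical helper)."""

    def __init__(self) -> None:
        self._last: Optional[int] = None

    def next_delta(self, lifetime_total: int) -> Optional[int]:
        # First cycle: only record the baseline, submit nothing, so the
        # lifetime total never shows up as a spike.
        # Assumption: a total that went down is also treated as a new baseline.
        if self._last is None or lifetime_total < self._last:
            self._last = lifetime_total
            return None
        delta = lifetime_total - self._last
        self._last = lifetime_total
        return delta
```

A caller would skip submission whenever `next_delta` returns `None` and send the integer otherwise.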
  • Quality loop v1 — edit failures teach mode selection (GL #494):
    ctx_edit outcomes are now correlated with the last read mode of the
    file. An old_string miss after a compressed read (a) escalates the
    next auto read of that file to full (one-shot, 1 h TTL) and (b) feeds
    a per-(extension × mode) failure rate; pairs crossing the documented
    risky threshold (≥2 fails and ≥25 % fail rate, hysteresis exit <15 %)
    resolve to full until they recover. New resolver sources
    edit_fail_escalation / edit_quality_penalty, persisted in
    ~/.lean-ctx/edit_quality.json (bounded, 30 d decay), surfaced in
    ctx_metrics under "Edit quality". Contract:
    docs/contracts/quality-loop-v1.md. Golden test:
    rust/tests/quality_loop_golden.rs.
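The risky-pair rule in the Quality loop entry above behaves like a small hysteresis state machine. The sketch below is illustrative only: the type and method names are hypothetical, the 30-day decay and the one-shot escalation are not modelled, and only the thresholds come from the notes.

```python
from dataclasses import dataclass

ENTER_MIN_FAILS = 2      # from the notes: at least 2 fails
ENTER_FAIL_RATE = 0.25   # and a fail rate of at least 25 %
EXIT_FAIL_RATE = 0.15    # hysteresis exit: below 15 %


@dataclass
class PairStats:
    """Edit outcomes for one (extension, read mode) pair. Hypothetical type."""
    edits: int = 0
    fails: int = 0
    risky: bool = False

    def record(self, failed: bool) -> None:
        self.edits += 1
        if failed:
            self.fails += 1
        rate = self.fails / self.edits
        if not self.risky:
            # Enter the risky state only when both conditions hold.
            self.risky = self.fails >= ENTER_MIN_FAILS and rate >= ENTER_FAIL_RATE
        elif rate < EXIT_FAIL_RATE:
            # Leave only once the rate drops below the lower bound.
            self.risky = False

    def resolve_mode(self, requested: str) -> str:
        """Risky pairs resolve to a full read until they recover."""
        return "full" if self.risky else requested
```

The gap between the 25 % entry and 15 % exit thresholds keeps a pair from flapping between modes when its failure rate hovers near a single cut-off.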
  • ctxpkg hosted registry — client side (GL #406): lean-ctx pack publish is real — preflight (parse, ed25519 signature, scoped-name
    check) then PUT to the registry at ctxpkg.com with a ctxp_… token
    (--token/CTXPKG_TOKEN). lean-ctx pack install ns/name[@version]
    resolves, downloads, verifies the artifact SHA-256 against the index,
    runs the standard import gates, re-verifies the signature locally and
    pins the result in .lean-ctx/ctxpkg.lock. lean-ctx pack export --sign signs bundles with an auto-managed ed25519 key
    (~/.lean-ctx/keys/ctxpkg-ed25519.key, 0600). Edge: account routes for
    namespace claim + publish-token lifecycle. Contract:
    docs/contracts/ctxpkg-registry-v1.md.
  • lean-ctx policy coverage — automated partial CGB assessment
    (GL #426): statically grades a resolved policy pack against the Context
    Governance Benchmark v1.0-draft — credential fixtures vs. redaction
    patterns, regulated-identifier classes, budget cap, retention, tool
    posture, egress restriction. PASS/FAIL/INCONCLUSIVE per aspect, --json
    for CI gating (exit 1 on FAIL), and an explicit honesty line instead of a
    maturity grade: 7 of 32 controls are statically checkable, the rest need
    the manual assessment.
  • Context Governance Benchmark — spec + self-assessment (GL #426): CGB
    v1.0-draft published as its own tool-neutral spec repo
    (context-governance-benchmark): 32 measurable controls in 6 domains
    (sensitivity/redaction, provenance, budget, audit/evidence, access
    scoping, lifecycle/retention), three levels (Basic/Hardened/Audited),
    maturity grades C1–C4, CC-BY-4.0, RFC-light governance and a CI wordlist
    lint that bans product names from normative text. LeanCTX's own honest
    self-assessment lands in docs/compliance/cgb-self-assessment.md:
    C2 — Managed (Basic 96%, Hardened 80%, Audited 50%), with declared
    gaps incl. no independent redaction verification and no one-step egress
    inventory — graded down where claims couldn't be hard-verified.
  • Dashboard: one tabbed page per job area (GL #487, Redesign P2): the
    sidebar now carries six destinations: Home, one entry per four-jobs area
    (Context, Memory, Protection, Proof) and Project Map. Each area is a
    single page whose views are tabs with canonical #area/tab deep links
    (#context/triage, #proof/roi, …). Every pre-#487 hash (#live,
    #health, #graph, …) still resolves and is rewritten to its canonical
    form; the last-used tab per area is remembered. New Protection area: the
    Guards tab hosts the existing reliability view, the new Risk & Policies tab
    shows live session-risk warnings (/api/context-risk) and the OWASP
    agentic-risk coverage map served by the new /api/owasp endpoint (same data
    as lean-ctx audit). The in-component Project-Map tab bar was removed in
    favour of the area strip.
  • Dashboard: four-jobs language pass (GL #488, Redesign P3): onboarding
    modal tells the four-jobs story (decides · remembers · guards · proves)
    with token savings framed as the receipt, includes Protection, and the
    status bar links the estimated figure to the signed ledger in Proof.
  • Agent-task benchmark v1 harness (GL #493): outcome evidence instead of
    token arithmetic — does lean-ctx change task success rate and cost per
    solved task? bench/agent-task/ runs two identical Claude-Code-headless
    arms (native vs. lean-ctx MCP, fresh HOME per run, hard-pinned MCP surface
    via --strict-mcp-config) over a deterministic SWE-bench-Verified subset
    (sorted round-robin by repo, frozen as tasks.lock.json), judged by the
    official SWE-bench evaluation; usage/cost come from the runtime's own final
    report — nothing is estimated. Pre-registered protocol with numbered
    amendments (PROTOCOL.md), self-hashing result artifact ready for
    ssh-keygen -Y sign; negative results publish unchanged.
  • LoCoMo memory benchmark harness (#291): a model-free, deterministic
    retrieval-recall benchmark over LoCoMo-style long conversations — every
    turn is stored as a memory, every question recalls top-k and is scored
    against the gold answers (answer containment, token-F1, exact match,
    recalled-context vs. full-transcript tokens). Ships a committed
    reference-suite with publishable numbers (benchmark/locomo/LOCOMO.md:
    100% containment@5 at 29.4% token reduction), a locomo_bench binary for
    full-dataset runs, and a CI smoke test.
  • Context policy packs (GL #489): governance presets as code. A pack pins
    a team's context-governance expectations in reviewable TOML — default read
    mode, allowed/denied tools, named redaction regexes, audit-retention
    expectation, context-budget cap — with single inheritance (extends) whose
    semantics are security-first: denies and redaction accumulate down the
    chain, scalars override, allowlists replace deliberately. Five curated
    built-ins ship embedded (baseline, strict-redaction, finance-eu,
    healthcare, open-source); lean-ctx policy list|show|validate lists,
    resolves and lints packs (project pack: .lean-ctx/policy.toml). v1 is the
    format + tooling; runtime enforcement follows. Contract:
    docs/contracts/context-policy-packs-v1.md; guide:
    docs/guides/policy-packs.md.
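The inheritance semantics described for policy packs above (denies and redaction accumulate, scalars override, allowlists replace) can be sketched as a merge of two dictionaries. All field names here are hypothetical and do not claim to match the context-policy-packs-v1 schema; letting a child pattern win when two redaction names collide is also an assumption.

```python
def resolve_pack(parent: dict, child: dict) -> dict:
    """Resolve one level of `extends`; field names are illustrative only."""
    resolved = dict(parent)
    for key, value in child.items():
        if key == "denied_tools":
            # Denies accumulate: a child can add denies but never drop one.
            resolved[key] = sorted(set(parent.get(key, [])) | set(value))
        elif key == "redaction":
            # Named redaction patterns accumulate down the chain.
            # Assumption: on a name collision the child pattern wins.
            resolved[key] = {**parent.get(key, {}), **value}
        else:
            # Scalars override, and allowlists are replaced wholesale.
            resolved[key] = value
    return resolved
```

Replacing the allowlist rather than merging it means a child pack has to restate every tool it wants to permit, which keeps a narrower list from being widened by accident.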
  • Org audit log + retention (GL #484): a unified, append-only governance
    audit log for orgs, surfaced to the owner at /account/audit with a
    filterable table and CSV export. Every governance path now writes
    best-effort events (SSO config/verify/enforce/remove/login, invite
    create/redeem/revoke) into one org_audit_log; the retired SSO-only table
    is migrated and dropped by an idempotent boot migration. Retention is the
    owner-plan window from the billing-plane-v1 SSOT (Team 90 days, Enterprise
    ~10 years) and is enforced server-side both by a daily fleet sweep and on
    read, so an owner never sees a row older than they're entitled to keep. Reads
    are owner-only, cursor-paginated, and bounded. Contract:
    docs/contracts/org-audit-log-v1.md.
  • Org SSO (OIDC) (GL #482): self-serve single sign-on for Team and
    Enterprise orgs. Owners configure an OIDC provider (Okta, Entra ID, Google
    Workspace, any compliant OP) under Account → Billing, prove domain ownership
    via a DNS-TXT record (checked over DNS-over-HTTPS), and optionally require
    SSO for everyone — the owner stays password-exempt (break-glass). Members
    click Continue with SSO, authenticate at the IdP, and land in a normal
    session with just-in-time user + org-membership provisioning. Edge runs the
    Relying Party (Authorization Code + PKCE, discovery/JWKS cache, ID-token
    verification with nonce binding and HS*/none rejection); the control plane
    is the system of record (AEAD-sealed client secret, append-only
    billing_sso_audit). API keys never touch URLs — a single-use 60-second
    handoff code carries the session to the browser. Contract:
    docs/contracts/org-sso-oidc-v1.md; setup guide:
    docs/guides/org-sso-setup.md.
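Two of the Relying Party checks named in the Org SSO entry above are defined by public standards and can be shown in isolation. The sketch follows RFC 7636 for the PKCE S256 challenge; the function names are hypothetical, and the algorithm filter is a simplified stand-in for real ID-token verification.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """43-character URL-safe verifier from 32 random bytes (RFC 7636 minimum)."""
    return secrets.token_urlsafe(32)


def s256_challenge(verifier: str) -> str:
    """code_challenge = BASE64URL(SHA256(verifier)) without '=' padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def alg_allowed(alg: str) -> bool:
    """Reject unsigned tokens and symmetric HS* algorithms."""
    a = alg.strip().upper()
    return a not in ("", "NONE") and not a.startswith("HS")
```

The algorithm name is checked before any signature work, so a token that declares `none` or an HS* algorithm is rejected without touching key material.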
  • Team invite links (GL #385): owners mint one-time links
    (leanctx.com/join/?code=…) instead of copy-pasting tokens. Codes are
    256-bit, stored hashed, expire after 7 days, and redeem exactly once
    (atomic claim; a failed seat check releases the claim for retry). The
    public join page issues the member token once, with prefilled CLI + MCP
    setup snippets; pending invites are revocable from the dashboard like
    member tokens. Redeem endpoint is rate-limited per IP and answers every
    dead code with one neutral 404. Contract:
    docs/contracts/team-invite-links-v1.md.
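The invite-code lifecycle above (256-bit code, stored hashed, 7-day expiry, single redemption with release on a failed seat check) can be sketched with an in-memory store. The class is hypothetical, SHA-256 as the storage hash is an assumption, and a real server would need an atomic claim in its database rather than a Python dict.

```python
import hashlib
import secrets

TTL_SECONDS = 7 * 24 * 3600  # codes expire after 7 days


def _digest(code: str) -> str:
    return hashlib.sha256(code.encode("ascii")).hexdigest()


class InviteStore:
    """In-memory illustration; only hashes of codes are kept."""

    def __init__(self) -> None:
        self._by_hash: dict[str, dict] = {}

    def mint(self, now: float) -> str:
        code = secrets.token_urlsafe(32)  # 32 random bytes = 256 bits
        self._by_hash[_digest(code)] = {"expires": now + TTL_SECONDS, "claimed": False}
        return code

    def redeem(self, code: str, now: float, seat_available: bool = True) -> bool:
        rec = self._by_hash.get(_digest(code))
        # Unknown, used and expired codes all get the same neutral answer.
        if rec is None or rec["claimed"] or now >= rec["expires"]:
            return False
        rec["claimed"] = True
        if not seat_available:
            rec["claimed"] = False  # release the claim so a retry can succeed
            return False
        return True
```

Because only the digest is stored, a leaked table does not reveal usable invite codes.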
  • Device overview (GL #387): every authenticated Personal-Cloud push now
    carries an X-Device-Label header (the machine's hostname), tracked
    server-side as fire-and-forget display metadata — never auth, quota, or
    billing input. /account/cloud lists each machine with last sync, last
    surface and push count, plus a per-row Forget control
    (GET/DELETE /api/account/devices). Contract:
    docs/contracts/device-overview-v1.md.
  • Supporters wall + dashboard badge (GL #393): the public supporters wall
    is live end-to-end — Stripe checkout fields (display name, message, opt-in)
    are captured idempotently by the billing webhook, clamped to 60/140 chars,
    profanity-gated and served via the public GET /api/supporters edge;
    leanctx.com/support/ renders the wall client-side (plaintext-only,
    tier pills, newest first). Cancelling the subscription hides the entry on
    the next subscription.deleted webhook, and an internal-key moderation API
    (GET …/supporters/moderation, PATCH …/supporters/{id}) provides an
    audited kill-switch. Locally, the dashboard's support bar now swaps its ask
    for a thank-you when the machine is linked to a supporting account — served
    by the new /api/billing-badge endpoint from the cached plan only (no
    network, purely cosmetic, never gates a local capability).
  • Email digests (GL #386): the cloud server now sends a monthly Pro digest
    (tokens saved, agent actions, sessions, CEP score — from synced snapshots)
    and a weekly Team digest (net tokens, USD, actions, top model/tool — from
    the hosted server's savings summary). Idempotent per period with automatic
    catch-up and SMTP retry; silent when a period has no real data. Every email
    carries a one-click, login-free unsubscribe (hashed, rotating tokens);
    GET/PUT /api/account/digest exposes the preference to the dashboard.
    Contract: docs/contracts/email-digest-v1.md. Cloud-server CORS now allows
    PUT/PATCH (digest toggle + team settings).
  • Weekly team-ROI webhook (GL #388): team servers post a weekly savings
    summary (net tokens, USD, measured actions, 7-day window, top mover, top
    model/tool) to Slack, Discord, or any JSON webhook. Configured via
    roiWebhookUrl in team.json (https-only, validated at boot) or self-serve
    through the team dashboard's new Integrations card
    (PUT /api/account/team/settings → control plane re-renders the config).
    Posts once per ISO week with retry-on-failure; weeks without reported data
    stay silent — no synthetic numbers. Payload shape auto-detects the vendor
    (Slack text, Discord content, generic both).
  • Per-member savings drilldown (GL #389): new audit-scoped team-server
    endpoint GET /v1/savings/member/{signer} — one member's latest totals,
    model/tool breakdowns and a member-only 90-day cumulative series
    (carry-forward replay of that signer's snapshot history). Signer ids are validated
    against [A-Za-z0-9_-]{1,64} before any filesystem access; unknown signers
    are a clean 404. Proxied through the control plane
    (/api/billing/team/{id}/savings/member/{signer}) and the account edge
    (/api/account/team/savings/member/{signer}); the team dashboard's member
    rows are now clickable and open an inline drilldown panel (own series chart,
    top models, top tools). Contract: docs/contracts/billing-plane-v2.md.
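The signer-id check described above is easy to get subtly wrong, so here is a minimal sketch of the pattern as a whole-string match. The function name is hypothetical; only the character class and length bound come from the notes.

```python
import re

# Pattern from the notes: [A-Za-z0-9_-]{1,64}
SIGNER_ID = re.compile(r"[A-Za-z0-9_-]{1,64}")


def is_valid_signer(signer: str) -> bool:
    """True only if the whole string matches; run before any path is built."""
    # fullmatch avoids the pitfall of match() with '$', which would
    # accept a trailing newline.
    return SIGNER_ID.fullmatch(signer) is not None
```

Since the class excludes `/` and `.`, an id that passes cannot express path traversal when it is later used as a file name.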
  • model2vec static-embedding support (GL #452): the embedding engine now
    drives EmbeddingBag-topology ONNX graphs (model2vec exports like
    hf:minishlab/potion-base-8M) next to classic transformers. Topology is
    detected from the graph's input signature (input_ids + offsets) at load
    time; the adapter feeds flat ids + batch offsets, skips mean-pooling (the
    graph pools internally) and probes dimensions off the rank-2 output. ~500x
    faster inference at ~30 MB — built for initial indexing of large repos and
    semantic search on weak hardware. Live-verified end-to-end (256d, L2-normed,
    semantic sanity); guide section in docs/guides/custom-embeddings.md.
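The "flat ids + batch offsets" input shape mentioned above can be illustrated without any ONNX runtime. This sketch uses hypothetical function names and shows only the data layout and the input-signature check; it is not the adapter itself.

```python
def is_embedding_bag(input_names: list[str]) -> bool:
    """Topology check on the graph's input signature (input_ids + offsets)."""
    return set(input_names) == {"input_ids", "offsets"}


def to_embedding_bag_inputs(batch: list[list[int]]) -> tuple[list[int], list[int]]:
    """Flatten per-text token ids and record where each text starts."""
    flat: list[int] = []
    offsets: list[int] = []
    for ids in batch:
        offsets.append(len(flat))  # start index of this text in the flat list
        flat.extend(ids)
    return flat, offsets
```

For a batch of three texts with 3, 1 and 2 tokens, the offsets are 0, 3 and 4. No padding or attention mask is needed because each bag is delimited by its start index.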
  • Minimal org model on the cloud plane (GL #468): team checkouts now
    create an organization with the buyer as owner; memberships inherit the
    owner's best active plan at the entitlements edge (never downgrading a
    personal plan) and /api/account/entitlements carries the org
    {id, name, role} for the dashboard's new organization section.
  • Zero-knowledge Personal Cloud vaults (GL #467): knowledge and gotchas
    now sync as client-side-encrypted blobs (XChaCha20-Poly1305, domain-separated
    HKDF keys knowledge-vault-v1 / gotcha-vault-v1 derived from the account
    API key the server only stores hashed). The first vault push purges the
    account's legacy plaintext rows; dashboards read the client-declared
    entry_count from blob metadata. Contract:
    docs/contracts/personal-cloud-encryption-v1.md.
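Domain-separated key derivation, as described for the vaults above, can be sketched with HKDF (RFC 5869) using only the standard library. The two labels come from the notes; SHA-256, the default all-zero salt, the 32-byte key length and the example API key are assumptions, and the XChaCha20-Poly1305 encryption step is left out.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF extract-and-expand per RFC 5869 with HMAC-SHA256."""
    if not salt:
        salt = bytes(hashlib.sha256().digest_size)  # RFC default: HashLen zero bytes
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Hypothetical API key; the `info` label separates the two key domains.
api_key = b"example-api-key"
knowledge_key = hkdf_sha256(api_key, b"knowledge-vault-v1")
gotcha_key = hkdf_sha256(api_key, b"gotcha-vault-v1")
```

The same input key yields unrelated keys per label, so a key used for one vault gives no help in decrypting the other.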
  • Team server billing-plane endpoints (GL #463): GET /v1/storage reports
    the hosted workspace footprint (allocated-blocks sizing, hard links counted
    once, symlinks never followed, 60 s cache; camelCase per
    billing-plane-v2) and GET /v1/usage serves the unified snapshot —
    signed-ledger savings roll-up, measured toolCalls, and a snake_case
    storage block. Both audit-scope-gated like /v1/metrics; quota via
    LEANCTX_TEAM_STORAGE_QUOTA_BYTES. Unblocks the control plane's hourly
    Stripe metering job and threshold mails against real team servers.
  • lean-ctx doctor --migrate-check (GL #396): v1.0 migration-readiness
    audit — config.toml keys validated against the schema (free-form sections
    like ide_paths respected), active deprecations, data-layout writability,
    frozen-contract set. --json for fleet rollouts; exit 0 = "ready for 1.0".
    Plus the launch program docs: docs/releases/v1.0-runbook.md (RC/freeze/
    bug-bash/rollback/launch-day plan), docs/releases/migration-1.0.md
    (zero-breaking-changes guide) and marketing/launch-v1/ (Show HN + Product
    Hunt drafts with tokbench-informed Q&A prep).
  • Custom embedding models (GL #397, upstream #328): ctx_semantic_search can now
    load any HuggingFace repo with an ONNX export via model = "hf:org/repo[@revision]"
    ([embedding] in config.toml or LEAN_CTX_EMBEDDING_MODEL). Includes revision
    pinning with an unpinned-warning, automatic dimension probing from the ONNX graph
    ([embedding].dimensions as declared fallback), per-repo+revision storage isolation,
    and SHA-256 lockfiles (model.lock.json, trust-on-first-use) that reject silent
    upstream content swaps. Model or revision changes trigger the established one-shot
    re-index. New guide: docs/guides/custom-embeddings.md.
  • SDK conformance matrix (GL #395): all three first-party SDKs (leanctx
    on PyPI, @leanctx/sdk on npm, lean-ctx-client on crates.io) now cover the
    entire public /v1 surface — added context_summary, search_events,
    event_lineage and metrics to every client. The shared conformance kit
    grows from 4 to 14 lockstep checks, including two drift gates:
    route_coverage (a server route without an SDK method fails within one CI
    run) and engine_compat (SDK declares its supported http_mcp contract
    versions). New CI job sdk-conformance runs all three kits against a real
    lean-ctx serve build via scripts/sdk-conformance.sh and publishes
    docs/reference/sdk-conformance-matrix.md (current state: 3/3 SDKs,
    14/14 checks PASS). SDK majors follow the engine contract major.
    Completing the audit: live adapter smoke tests (OpenAI/LangChain/
    LlamaIndex/CrewAI run one real tool round trip each against the live
    server, optional frameworks skip cleanly) and a release gate
    (scripts/check-sdk-versions.py, first job of the release workflow):
    an engine release fails hard when an SDK cannot speak the shipped
    http_mcp contract version, and warns on >1 minor SDK-family drift.
  • Contract freeze & SemVer/deprecation policy (GL #394): all 29 contract
    docs are now classified frozen / stable / experimental in a stability
    matrix (CONTRACTS.md, SSOT core/contracts.rs::contract_docs()). Two new CI
    gates enforce the freeze: tests/contracts_frozen.rs (every doc classified;
    frozen docs content-hashed against docs/contracts/frozen-hashes.json;
    semantic changes must land as a new -v2.md file) and
    tests/openapi_stability.rs (public /v1 surface vs.
    docs/reference/openapi-v1.snapshot.json; additive diffs pass, removed or
    mutated routes fail). GET /v1/capabilities additionally returns a
    contract_status map so clients can verify stability guarantees at runtime.
    The deprecation register DEPRECATIONS.toml (compiled into the binary,
    ≥ 2 minor releases between announcement and removal) feeds a new
    lean-ctx doctor check that warns about every deprecation shipping in the
    installed build.
  • Personal-Cloud auto-push (GL #384): opt-in lean-ctx cloud autosync on
    pushes the Pro surfaces (knowledge, commands, CEP, gotchas, buddy, feedback)
    silently once per day from the background task — offline keeps the day's
    slot open for retry, a Pro gate (402) consumes it quietly (no error spam on
    Free accounts).
  • Hosted Personal Index for Pro (GL #392): lean-ctx sync index push|pull|status syncs the project's retrieval index (BM25 + embeddings)
    across devices — a fresh machine gets working ctx_semantic_search without
    a local re-index. Bundles are encrypted client-side (XChaCha20-Poly1305;
    key HKDF-derived from the account API key, which the backend stores only as
    a hash): the server holds ciphertext it cannot read. Per-account quota from
    the plan's hosted_index_mb (Pro: 1 GB; open self-hosted deployments:
    1 GB default), display-first — an over-quota push warns and blocks, it
    never bills. New backend routes PUT/GET/DELETE /api/sync/index/{project}
    and GET /api/sync/index; the Personal-Cloud dashboard payload gains a
    hosted_index block (projects, used bytes, quota). The local index is
    never gated (Local-Free Invariant; tests/local_free_invariant.rs).
    Contract: docs/contracts/hosted-personal-index-v1.md.
  • Hosted-index SLO gate (GL #391): the team server now measures every
    /v1 request in an outermost middleware and derives the three GA-gate
    signals — rolling p50/p95/p99 latency, availability (non-5xx share over the
    last 4096 requests) and index freshness (seconds since the last successful
    Index-scoped tool call). Exposed via /v1/metrics (new slo block) and
    /v1/metrics?format=prometheus (leanctx_team_* series for Datadog/
    Prometheus scrape agents). New CLI: lean-ctx team slo-report --server <url> --token <token> [--json] renders the gate and exits non-zero on
    violation (CI-friendly). SLO definitions ship in
    docs/examples/team-slos.toml; the SLO engine understands the new metrics
    team_query_p95_ms, team_availability_pct, team_index_lag_seconds.
    Runbook: docs/guides/hosted-index-slo.md.
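The rolling signals in the SLO entry above can be sketched with a bounded window. The class name is hypothetical; the 4096-request window for availability comes from the notes, while reusing the same window for latency and the nearest-rank percentile method are assumptions.

```python
import math
from collections import deque


class SloWindow:
    """Rolling window over the most recent requests (illustrative only)."""

    def __init__(self, size: int = 4096) -> None:
        self._ok = deque(maxlen=size)          # True for non-5xx responses
        self._latency_ms = deque(maxlen=size)  # assumption: same window size

    def observe(self, status: int, latency_ms: float) -> None:
        self._ok.append(status < 500)
        self._latency_ms.append(latency_ms)

    def availability_pct(self) -> float:
        """Non-5xx share of the window, in percent."""
        if not self._ok:
            return 100.0
        return 100.0 * sum(self._ok) / len(self._ok)

    def percentile(self, p: int) -> float:
        """Nearest-rank percentile over the latency window."""
        if not self._latency_ms:
            raise ValueError("no observations yet")
        ordered = sorted(self._latency_ms)
        rank = max(1, math.ceil(p * len(ordered) / 100))
        return ordered[rank - 1]
```

A `deque` with `maxlen` drops the oldest observation automatically, so one bad burst ages out after the window has filled with newer requests.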
  • Accuracy conformance checks for lossy read modes (P1, GL #441):
    lean-ctx conformance now verifies structural invariants of map,
    signatures, aggressive and entropy against a fixed Rust fixture —
    determinism, symbol retention, body stripping, and real compression. CI
    gates on regressions in the modes agents rely on for correctness.
  • Honest metering on phase-isolated / non-caching workloads (#361): lean-ctx gain now states its denominator — savings are compression on
    lean-ctx-touched traffic, not the full provider bill — via a Methodology
    line and a new injected_overhead_tokens_per_turn field in gain --json
    (net bill impact = tokens_saved − injected_overhead_tokens_per_turn × turns).
    New core::context_overhead measures the fixed per-turn prefix lean-ctx
    injects (tool schemas + server instructions + rules block). A new
    rules_injection = "off" (also none/disabled) writes no rules file —
    for hosts that supply their own steering, or phase-isolated/non-caching
    harnesses where the injected prefix is pure re-billed overhead. The
    performance-tuning journey gains a "workload fit" section documenting the proxy
    as the way to reach tool output the ctx_* tools can't wrap. Prompted by an
    independent, reproducible external benchmark.
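The net-bill formula quoted above is simple, but a worked example shows how the sign can flip on non-caching workloads. The function mirrors the formula from the notes; the numbers in the usage lines are invented for illustration and are not measurements.

```python
def net_bill_impact(tokens_saved: int,
                    injected_overhead_tokens_per_turn: int,
                    turns: int) -> int:
    """tokens_saved minus the fixed per-turn prefix re-billed on every turn."""
    return tokens_saved - injected_overhead_tokens_per_turn * turns


# Illustrative inputs only.
long_session = net_bill_impact(120_000, 1_500, 40)   # 120000 - 60000 = 60000
short_session = net_bill_impact(10_000, 1_500, 40)   # 10000 - 60000 = -50000
```

A negative result means the injected prefix cost more than compression saved, which is the case the `rules_injection = "off"` setting is meant to address.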
  • Team RBAC roles (Commercial Plane, EPIC 13.2): a TeamRole
    (viewer/member/admin/owner) layer over the existing fine-grained
    TeamScopes. A token's effective scopes are scopes ∪ role.scopes(), enforced
    by the unchanged team middleware (zero new enforcement paths). Roles are
    monotonic (viewer ⊆ member ⊆ admin = owner). New CLI:
    lean-ctx team token create --role <role> (still supports --scopes, or both).
    Additive / Team-Cloud only — never gates local. SSO/SCIM, org-shared knowledge
    graph, and audit-retention dashboards build on this and remain tracked on the
    commercial plane. Contract updated: docs/contracts/team-server-contract-v1.md.
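The role layer above reduces to a set union over a monotonic role table. In this sketch the scope names are hypothetical placeholders, not the real TeamScopes; only the union rule and the ordering viewer ⊆ member ⊆ admin = owner come from the notes.

```python
from typing import Iterable, Optional

# Hypothetical scope names, chosen so that each role contains the previous one.
ROLE_SCOPES = {
    "viewer": frozenset({"read"}),
    "member": frozenset({"read", "write"}),
    "admin": frozenset({"read", "write", "audit", "manage"}),
}
ROLE_SCOPES["owner"] = ROLE_SCOPES["admin"]  # admin = owner


def effective_scopes(explicit: Iterable[str], role: Optional[str] = None) -> frozenset:
    """Effective scopes = explicit scopes united with the role's scopes."""
    role_scopes = ROLE_SCOPES[role] if role else frozenset()
    return frozenset(explicit) | role_scopes
```

Because the result is a plain scope set, a middleware that already checks scopes needs no role-specific branch, which matches the "zero new enforcement paths" claim.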
  • Billing plane: real plans + usage metering (Commercial Plane, EPIC 13.6):
    new core::billing turns the upgrade flow into real plans (free/team/
    enterprise) with explicit Entitlements, plus usage-based metering derived
    read-only from the Ed25519-signed savings ledger (EPIC 12.20). Usage is
    privacy-preserving and only billable on a signed + intact chain. Crucially,
    entitlement_allows upholds the Local-Free Invariant — every local feature is
    allowed on every plan (incl. Free); the local binary has no entitlement
    checks (enforced by tests/local_free_invariant.rs). New CLI:
    lean-ctx billing <plans|entitlements|usage> [--json] (informational only).
    Quota semantics disambiguated: 0 = none, UNBOUNDED = unlimited. Checkout/
    provisioning are documented as a hosted control-plane concern (no fakes).
    Contract: docs/contracts/billing-plane-v1.md.
  • WASM extension runtime (Context OS, EPIC 12.8 + 12.10): a sandboxed,
    language-independent way to contribute compressors and context providers
    as plain .wasm modules — no recompile of lean-ctx. Behind the off-by-default
    wasm Cargo feature (features.wasm_runtime in /v1/capabilities), upholding
    the Local-Free Invariant (free, compile-optional). Uniform ABI v1 (memory +
    alloc(i32)->i32 + entry(i32,i32,i32)->i64 packed ptr/len); guests run
    against an empty linker (no syscalls/network/fs/clock — sandboxed by the
    runtime itself) with a fresh Store per call for thread-safety + determinism.
    WasmCompressor registers as a first-class compressor (host-enforced byte
    budget, graceful fallback on traps, conformance-checked); WasmProvider
    registers as a first-class ContextProvider (lenient result-JSON mapping).
    Opt-in discovery from LEAN_CTX_WASM_DIR (*.wasm compressors; *.wasm +
    <stem>.provider.json sidecar providers). Contract: docs/contracts/wasm-abi-v1.md.
  • Context OS guide + non-coding cookbook (Context OS, EPIC 12.18): docs/context-os/guide.md maps the whole platform — principles (Local-Free Invariant), architecture, capability discovery, the four ways to build your own tool (SDK / plugin tool / hook / extension), ingestion+extractors+personas, the savings→ROI substrate, and the plane model. docs/context-os/cookbook-non-coding.md adds four runnable, verified recipes (lead-gen, research, support, data-analysis) plus a custom-vertical template, all using real personas/extractors/SDKs/adapters.
  • Framework adapters (Context OS, EPIC 12.6): leanctx.adapters exposes the lean-ctx tool surface to popular agent frameworks — OpenAI function calling (to_openai_tools / run_openai_tool_call, a pure transform with no extra dep), LangChain (to_langchain_tools), LlamaIndex (to_llamaindex_tools), and CrewAI (to_crewai_tools). Each framework is an optional, lazily-imported dependency (leanctx[langchain|llamaindex|crewai]); all adapters share one tool normalizer and the same call_tool_text path so they behave identically. Tested with/without each framework installed.
  • Python SDK (Context OS, EPIC 12.4): new leanctx package (clients/python/) — a thin, standard-library-only client (urllib, zero runtime deps) for the HTTP /v1 contract, mirroring the TS/Rust SDKs: health, manifest, capabilities, openapi, list_tools, call_tool/call_tool_text, and subscribe_events (SSE). Structured errors (LeanCtxConfigError/TransportError/HTTPError) and the shared run_conformance kit (lockstep with the TS SDK). Ships a README, pyproject.toml, in-process HTTP-server tests, and a python-sdk CI job.
  • TypeScript SDK GA + shared conformance kit (Context OS, EPIC 12.5): @leanctx/sdk gains capabilities() and openapi() for full /v1 discovery parity, a typed CapabilitiesV1, and a new runConformance(client) kit that returns a client-side scorecard (health, capabilities shape, OpenAPI shape, tools listing). The kit mirrors the server-side lean-ctx conformance and is kept in lockstep with the Python SDK so every client proves the same contract. Adds a README and tests.
  • Non-code compression tuning (Context OS, EPIC 12.14): two new compressors tuned for non-code corpora, registered in extension-registry-v1: prose (collapse blank-line runs, strip/collapse intra-line whitespace, drop adjacent duplicate lines) and markdown (everything prose does plus strip HTML comments, drop image/badge syntax, and rewrite [text](url) links to their visible text). Both are deterministic and honor a hard byte budget (conformance-checked). Non-coding personas now default to them: research → markdown; lead-gen/support → prose.
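The three prose steps listed above can be sketched in a few lines. This is an approximation, not the shipped compressor: the function name is hypothetical, and enforcing the byte budget by truncating at a UTF-8 boundary is an assumption about how the budget is honoured.

```python
import re


def compress_prose(text: str, max_bytes: int) -> str:
    """Deterministic whitespace and duplicate cleanup under a hard byte budget."""
    out: list[str] = []
    for raw in text.splitlines():
        # Collapse runs of spaces/tabs and trim the line.
        line = re.sub(r"[ \t]+", " ", raw).strip()
        # Dropping adjacent duplicates also collapses runs of blank lines,
        # since consecutive empty lines are duplicates of each other.
        if out and line == out[-1]:
            continue
        out.append(line)
    result = "\n".join(out).strip("\n")
    # Assumption: cut at the budget and drop any split trailing code point.
    return result.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")
```

The function has no randomness or external state, so identical input and budget always give identical output.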
  • Format extractors & chunkers (Context OS, EPIC 12.13): new core::extractors (extractors-v1) turns non-code documents/data into clean LLM text + structure-aware chunks. JSON (per array element / object entry), CSV/TSV (RFC-4180-aware, header-prefixed row groups), EML (salient headers + body, text/plain from multipart), HTML (rendered Markdown paragraphs, reusing web::html_to_text), and PDF (reusing web::pdf) — with a verbatim paragraph fallback for plain text. Every extractor is total/graceful (never panics, non-empty input always yields ≥1 non-empty chunk, deterministic). The text chunkers (csv/json/eml/html) register into extension-registry-v1, so they surface in /v1/capabilities and are conformance-checked.
  • Conformance & reproducibility scorecard (Context OS, EPIC 12.17): new core::conformance + lean-ctx conformance [--json] produce a Scorecard proving an instance honors its own contracts. Checks span three categories — contracts (all machine-verified versions present), reproducibility (/v1/capabilities and /v1/openapi.json are byte-deterministic), and extensions (every registered compressor/chunker/read-mode satisfies determinism, byte-budget, UTF-8, and coverage invariants — built-in and extension-provided). Exits non-zero on failure; gated in CI via tests/conformance_suite.rs. Contract: conformance-v1.
  • Extension trust & sandbox model (Context OS, EPIC 12.3): every plugin subprocess (hooks + manifest tools) now runs under a SandboxPolicy derived from a new [trust] manifest section (extension-trust-v1). Least privilege by default: the child runs with a scrubbed environment (fixed allowlist — host secrets in env never leak) and a working-directory jail, on top of the existing per-call timeout. Plugins declare capabilities (network, fs_write = consent surface, surfaced in /v1/capabilities; env_passthrough = opt out of env scrubbing). Unknown permissions are a fail-closed manifest error. Declared permissions appear per plugin under extensions.plugins[].permissions.
  • ROI / metering substrate (Context OS, EPIC 12.20): new core::savings_ledger::roi derives a RoiReport strictly from the signed savings batch (BatchTotals + committed chain head + Ed25519 signature) — adding derived metering metrics (net tokens, USD, per-event averages, top models/tools) and provenance (chain_valid, signed, signer key). This is the minimal, privacy-preserving aggregate the Cloud plane meters on: no raw events, paths, prompts, or code — only numbers and hashes — and it is read-only w.r.t. the local ledger. Exposed via lean-ctx savings roi [--json].
  • Plane separation + Local-Free-Invariant CI gate (Context OS, EPIC 12.19): the Personal (local) plane is now a documented, machine-checked boundary. core::server_capabilities classifies every feature flag as LOCAL_ALWAYS_ON, LOCAL_OPTIONAL (compile-only), or COMMERCIAL_PLANE (additive team/cloud). A CI conformance test (tests/local_free_invariant.rs) fails the build if the default plane isn't personal, any local capability isn't unconditionally free, the planes overlap, or a local capability reacts to a LEAN_CTX_LICENSE/LEAN_CTX_PLAN/LEAN_CTX_ACCOUNT env var; a unit test fails if a new feature flag is added without classification. Contract: docs/contracts/local-free-invariant-v1.md.
  • Built-in personas + persona-aware intent/terse (Context OS, EPIC 12.16): ships four non-coding presets alongside coding: research, lead-gen (alias sales), support, data-analysis — each with its own tool surface, read-mode/compressor/chunker defaults, intent taxonomy, and sensitivity floor. The terse agent prompt is now persona-parametrized: non-coding personas append a domain vocabulary block + their intent list, while the coding persona leaves the prompt byte-for-byte unchanged (no regression). Available presets surface under presets in GET /v1/capabilities.
  • Context persona model (Context OS, EPIC 12.15): new core::persona (persona-spec-v1) — a declarative bundle that shapes the entire context surface for a domain (tool surface, default read-mode, compressor/chunker, intent taxonomy, sensitivity floor), not just coding. Personas are selectable via LEAN_CTX_PERSONA or persona = "…" in config, resolved against built-in presets then <personas_dir>/<name>.toml (override LEAN_CTX_PERSONAS_DIR). The built-in coding persona reproduces today's defaults — the tool surface still resolves to power when nothing is pinned (no regression), and explicit tool-profile settings always win. The active persona surfaces at GET /v1/capabilities under server.persona, with available presets under presets. Contract: docs/contracts/persona-spec-v1.md.
  • Native tool registration without forking (Context OS, EPIC 12.11): a plugin can declare [[tools]] in its manifest (name, description, command, timeout, JSON input schema). Enabled plugins' tools are discovered (PluginManager::tool_specs), adapted into native MCP tools (registered::plugin_tool::PluginTool), and registered dynamically in build_registry() — no fork, no code edit. They surface in GET /v1/capabilities under extensions.tools and in the agent's tool list, and run sandboxed through the shared subprocess runner (piped stdio, LEAN_CTX_PLUGIN_DIR/LEAN_CTX_TOOL env, bounded per-tool timeout). A plugin tool whose name collides with a native tool is skipped (native wins, so a plugin can never shadow core behavior). The hook executor and tool invocation now share one run_subprocess runner; an end-to-end test proves discover → register → invoke.
  • Pluggable read-modes / compressors / chunkers (Context OS, EPIC 12.9): new core::extension_registry (extension-registry-v1) exposes stable, object-safe traits — ReadMode, Compressor, Chunker — backed by a process-global registry seeded with real built-ins (full read-mode; identity/whitespace compressors; lines/paragraph chunkers) registered through the exact same public API extensions use (no special-casing). Extensions register custom transforms by name; the live registry contents now surface under extensions.{read_modes,compressors,chunkers} in GET /v1/capabilities.
  • Generic ingestion front-door (Context OS, EPIC 12.12): intake is no longer gated by is_code_file. A new core::ingestion front-door (ingestion-spec-v1) classifies every path by content kind: Code / Document / Data / Text / Binary — via an extension fast-path plus a bounded binary sniff (NUL/control-byte ratio over the first 8 KB). So any text corpus (markdown, csv, json, yaml, html, email, logs, transcripts, even unknown-but-textual files) now reaches BM25/semantic/knowledge — not just source code. Genuine binaries (images, media, archives, compiled artifacts, and binary documents like PDF/DOCX whose extractors arrive in 12.13) are excluded. The duplicate is_code_file in the CLI indexer is removed; bm25_index::is_code_file remains the single canonical code detector, now one input to the front-door. Code repositories are fully backward-compatible — everything that indexed before still indexes.
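
A bounded binary sniff of this kind might look like the Python sketch below. Only the 8 KB window comes from the entry above; the 10% threshold, the whitespace exemptions, and the name `looks_binary` are assumptions, and the real classifier may differ.

```python
SNIFF_BYTES = 8 * 1024        # "first 8 KB", as stated in the entry
CONTROL_RATIO_LIMIT = 0.10    # assumed threshold, not taken from the source

# Tab, line feed, form feed and carriage return are normal in text files.
TEXT_WHITESPACE = {0x09, 0x0A, 0x0C, 0x0D}


def looks_binary(data: bytes) -> bool:
    """Return True when the head of the data is dominated by NUL/control bytes."""
    head = data[:SNIFF_BYTES]
    if not head:
        return False  # assumption: empty input is treated as text
    control = sum(1 for b in head if b < 0x20 and b not in TEXT_WHITESPACE)
    return control / len(head) > CONTROL_RATIO_LIMIT
```

Because only the head is inspected, the cost stays constant no matter how large the file is.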
  • lean-ctx-client Rust crate — the embedding boundary (Context OS, EPIC 12.2): a thin, stable HTTP client for the /v1 contract so any program (an agent harness, a lead-gen worker, a research bot) can integrate lean-ctx over the process boundary without linking the engine. It is the Rust counterpart of the TypeScript SDK (cookbook/sdk) and speaks the same versioned contract: health, manifest, capabilities, openapi.json, paginated tools, tools/call (raw result + flattened text), and events as a blocking SSE iterator. Open-ended documents are returned as serde_json::Value so new server keys never break a client build; errors carry the stable error_code (not the human message) for branching. The crate is deliberately decoupled — it does not depend on the engine crate, re-exports no internals, and documents its non-goals (full-crate linking stays unsupported; integration = process boundary). One small dependency (ureq), blocking by design, #![forbid(unsafe_code)], and covered by a dedicated CI job (fmt + clippy -D warnings + tests against a real localhost HTTP server + docs). Lives at clients/rust/lean-ctx-client.
  • Plugin hooks are now live in the core pipeline (Context OS, EPIC 12.7): the plugin seam that previously only existed in PluginManager is wired into the running server, so a third-party plugin can finally observe the engine without forking it. pre_read and post_compress fire around the central ctx_read choke point (carrying the path and the realized original → compressed token counts), and on_session_start fires once per server process (stdio + HTTP + daemon). Every firing goes through a zero-cost guard (PluginManager::has_listener / notify): with no plugin declaring a hook — the default — the hot read path allocates nothing and spawns no thread, so users without plugins pay exactly zero. Hooks run in the background with per-plugin error isolation and a per-hook timeout (a failing or slow plugin can never block or corrupt a read). The registry is initialized exactly once per process via the existing idempotent init(). An end-to-end test proves a real ctx_read triggers an installed plugin's pre_read hook, and a new LEAN_CTX_PLUGINS_DIR override lets containers/CI/tests point the registry at an isolated plugins root (distinct from the per-hook LEAN_CTX_PLUGIN_DIR the executor exports to a plugin's own child process). All five hook points are now live: on_session_start/on_session_end bracket each server process (the end hook fires synchronously at shutdown so it always runs before exit), pre_read/post_compress wrap reads, and on_knowledge_update fires when ctx_knowledge(action="remember") writes a fact (carrying category:key).
  • OpenAPI spec — GET /v1/openapi.json (Context OS, EPIC 12.1): the public /v1 surface is now described by an OpenAPI 3.0.3 document generated from a single in-code endpoint inventory (core::openapi), so SDK/codegen tooling in any language can consume it. A drift test (openapi_contract_up_to_date) binds the inventory to the Endpoints table in http-mcp-contract-v1.md, so a new public route must update both — code and docs can't diverge. Internal/experimental routes (agent registry, A2A, .well-known, shutdown) are intentionally excluded from the published spec.
  • Capabilities discovery — GET /v1/capabilities (Context OS, EPIC 12.1): a runtime discovery document so any client — in any language — can learn what a lean-ctx instance supports and branch on real features instead of making trial calls. Reports the contract version, server name/version, deployment plane (personal/team/cloud), wire transports, built-in presets (personas), read_modes, the tools surface, a features map (always-on capabilities plus compiled Cargo features like semantic_search/team_server/cloud_server), runtime-discovered extensions (plugins), and all machine-verified contracts versions in one place — no secrets ever included. Versioned by capabilities-contract-v1 (CAPABILITIES_CONTRACT_VERSION); the documented key set is bound to the code SSOT (core::server_capabilities) by a drift test, and a formal /v1 deprecation policy is documented alongside the contract.
  • MCP Tool-Catalog Gateway — ctx_tools (the answer to "more tools → less adoption"): lean-ctx can now sit in front of any number of downstream MCP servers and expose them through a single meta-tool instead of injecting every downstream schema into the system prompt. The agent calls ctx_tools find with a natural-language need; the gateway aggregates the downstream catalogs (TTL-cached), ranks them with the same BM25 engine as ctx_search, and returns a top-N ChoiceCard shortlist (server::tool + one-line description + key params). ctx_tools call then proxies the real call to the owning server and returns its (firewall- and sensitivity-filtered) result. Net effect: unlimited downstream tools at roughly constant context cost. Transports: local stdio (spawns the server as a child process) and remote streamable HTTP (with custom headers / bearer auth) — built on the official rmcp client, no bespoke JSON-RPC. Global-only config and off by default ([gateway] / [[gateway.servers]]); spawning downstream processes can never be enabled by an untrusted project. Granular tool surface → 72.
  • Per-item sensitivity policy floor ([sensitivity]): classify every context item as public < internal < confidential < secret (path heuristics + secret/PII detection incl. Luhn-validated cards and ISO-7064 IBANs) and enforce a uniform floor before content ever reaches the model — redact (mask the spans) or drop (withhold the item). Applied uniformly to tool outputs and knowledge facts. Global-only and off by default.
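
For readers unfamiliar with Luhn validation, here is a small Python sketch of how a redact step could mask only card-like numbers that pass the checksum. The regular expression, the `[REDACTED]` placeholder, and both function names are illustrative assumptions; the entry above does not describe the detector's internals.

```python
import re

# Assumed pattern: 13 to 19 digits, optionally separated by spaces or dashes.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number: str) -> bool:
    """Standard Luhn checksum over the digits in `number`."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return bool(digits) and total % 10 == 0


def redact_cards(text: str) -> str:
    """Mask candidate spans, but only when the Luhn checksum passes."""
    def mask(match):
        span = match.group(0)
        return "[REDACTED]" if luhn_valid(span) else span
    return CARD_RE.sub(mask, text)
```

Checking the checksum before masking keeps ordinary long numbers (order ids, timestamps) from being redacted by accident.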
  • Reproducible scorecard — lean-ctx benchmark scorecard: a deterministic, machine-independent report of compression savings, retrieval recall/MRR, and latency over a synthetic, byte-reproducible corpus. The JSON and human output embed a determinism_digest, so two runs of the same code anywhere produce the same fingerprint — the artifact is self-verifying. Wired into CI as an uploaded artifact.

Changed

  • Parallel dashboard tracks consolidated (GL #476–#479, #486, #490): the
    four-jobs IA from the redesign epic and the incremental UX/data passes that
    shipped in parallel now live on one branch. The epic layout wins (slim Home,
    Proof group with ROI & Plan + Trends, Simple = Home only); the data passes
    win correctness and language — relative search scores (top hit = 100%),
    the verified-bridge line in the Home hero (estimated ⇄ signed ledger),
    Context Triage / Context Contents / Episodes labels, estimate-methodology
    tooltips, per-task episode metrics, the dead Symbols signature column
    removed and vendor noise filtered from the Compression Lab. Search keeps
    the inline ±12-line preview and gains an "Open in Lab →" handoff. On the
    Rust side ctx_search now returns a SearchOutcome that separates the
    modeled native-grep baseline (estimated stats) from raw observed tokens
    (verified ledger), so the two series can never cross-contaminate.
  • Four-jobs cockpit navigation + slim Home (GL #470/#486, phase 1): the
    sidebar now tells the same story as the website — Context (decides what
    agents read), Memory (remembers what agents learn), Proof (proves what
    you save) and Project Map (understands your codebase) — instead of 17
    flat entries. Simple mode is the 5-second answer: Home only. Home itself
    slimmed down to status strip + receipt + gauge/triage + one trend + top-3
    commands (expandable); the cost-analysis card moved to ROI & Plan (labelled
    as the estimated, all-time view next to the verified-ledger methodology)
    and the MCP-vs-shell / task-breakdown doughnuts moved to Trends. Every view
    stays reachable via Advanced mode, deep links and the command palette.
  • Large modules split by domain (P1, GL #439, #440):
    cli/dispatch/analytics.rs (1685 LOC) → analytics/{gain,savings,billing,graph},
    core/stats/format.rs (1532) → format/{util,cep,dashboard,views},
    rules_inject.rs (1542) → rules_inject/{content,targets,detect,write,skills}.
    No behavior change; entry-point visibility narrowed to the dispatch layer.

Fixed

  • Scorecard determinism restored (#211 contract): benchmark entropy
    numbers fed the scorecard's reproducibility digest through the regular
    compression path, whose opportunistic semantic redundancy filter (#544)
    kicks in as soon as the shared embedding engine finishes loading — two
    runs in the same process could disagree (e.g. entropy=0.00 vs 57.29
    on the small corpus). Benchmarks now pin the filter off via the new
    entropy_compress_deterministic, keeping the digest machine-independent
    (and cutting the determinism test from 25 min to 4 s).
  • Signed artifacts always embed the key that actually signed them: every
    signer that embeds its public key next to the signature (handoff transfer
    bundles, evidence bundles, wrapped publish) previously resolved the
    keypair twice — once to sign, once to read the public key. If the key store
    moved or the key was regenerated between the two reads (concurrent
    data-dir changes, parallel processes), the artifact carried a public key
    that could never verify its own signature. New atomic
    agent_identity::sign_with_public_key / sign_bytes_with APIs resolve the
    keypair exactly once; all three call sites migrated.
  • (pipeline red since the #551 efficiency program landed): the
    try_shared_engine_returns_none_when_not_initialized unit test asserted
    on the process-global SHARED_ENGINE OnceLock while the new #551
    background activation (triggered by any sibling test touching entropy
    compression) could load — and in CI even download — the model
    mid-suite. The test now lives in its own integration-test binary
    (tests/embeddings_shared_engine.rs, fresh process = deterministic),
    ensure_engine_background() is a no-op under cfg!(test), and CI
    exports LEAN_CTX_EMBEDDINGS_AUTO_DOWNLOAD=0 so the suite is hermetic.
    Also un-sticks the Coverage job: the silent engine load made
    run_project_benchmark("src") exceed tarpaulin's 180 s timeout.
  • ctx_shell/ctx_execute failures now set MCP isError +
    structuredContent (GitHub #389): every tool call returned
    CallToolResult::success regardless of the shell exit code — MCP clients
    (OpenCode guards, Claude Code, Cursor) had no programmatic way to detect
    failures and were forced to regex-parse the [exit:N] text footer. A new
    ShellOutcome (Exit(code) | Blocked) now flows from the shell tools
    through dispatch into the MCP result: non-zero exit sets
    isError: true + structuredContent: {"exitCode": N}, allowlist/
    validation rejections set isError: true + {"blocked": true}.
    Covered end-to-end: the degraded session-lock path (which previously
    even dropped the exit footer), the auto-checkpoint early return, the
    reference-store substitution, ctx_call chaining, and ctx_execute
    (single/batch — first failing task fails the batch — and file
    preconditions). Exit 0 stays byte-identical (no metadata churn).
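
The mapping from shell outcome to MCP result described above can be sketched in Python as follows. The field names `isError`, `structuredContent`, `exitCode` and `blocked` come from the entry; the function `to_mcp_result` and the use of `None` to stand in for the Blocked variant are assumptions for illustration.

```python
from typing import Optional


def to_mcp_result(text: str, exit_code: Optional[int]) -> dict:
    """Build a tool-call result dict; exit_code=None models a Blocked outcome."""
    result = {"content": [{"type": "text", "text": text}]}
    if exit_code is None:
        # Allowlist or validation rejection: the command never ran.
        result["isError"] = True
        result["structuredContent"] = {"blocked": True}
    elif exit_code != 0:
        result["isError"] = True
        result["structuredContent"] = {"exitCode": exit_code}
    # Exit 0 adds no metadata, so successful results keep their old shape.
    return result
```

A client can then branch on `result.get("isError")` rather than parsing the `[exit:N]` footer.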
  • OpenClaw: setup --auto re-injected the legacy mcpServers key and
    broke 2026.6.1+ hot-reload (GitHub #390): OpenClaw moved to a nested
    mcp.servers schema with strict validation; the editor-registry writer
    still wrote top-level camelCase mcpServers, so every watchdog tick
    produced config reload skipped (invalid config): Unrecognized key
    with gateway-down risk on restart if the stale block won. OpenClaw now
    has a dedicated ConfigType::OpenClaw writer: it detects the version
    via meta.lastTouchedVersion (>= 2026.6.1 or an existing mcp.servers
    block → nested schema; older → legacy camelCase), migrates our stale
    mcpServers.lean-ctx entry away (dropping the key when empty, foreign
    entries preserved), and is strictly idempotent — watchdog re-runs leave
    the file byte-identical (verified via mtime). init --agent openclaw,
    setup --auto, lean-ctx doctor (flags stale legacy blocks) and both
    uninstall paths (editor-registry + textual lean-ctx uninstall, which
    now also strips an emptied mcpServers {} leftover) share the same
    schema logic. Invalid JSON is never text-injected for openclaw.json —
    a malformed write would take the gateway down.
  • Shell parser: >| noclobber redirect treated as a pipe (GitHub #387):
    date --fsdfs >| out 2>&1 split at the |, so the redirect target
    (out) was checked against the shell allowlist as a command and
    blocked. The segment splitter now recognises >| as a redirect
    operator; file-write targets are never allowlist-checked.
  • gain --deep crash on multibyte paths/agent ids (GitHub #386):
    every display truncation helper (ctx_gain::truncate_str /
    shorten_path, stats::format::truncate_cmd, ctx_architecture
    hotspot paths) sliced at byte offsets and panicked mid-codepoint for
    umlauts/CJK/emoji; one helper could also underflow for tiny widths.
    All cuts are now char-boundary-safe (swept 0..=len+2 in tests).
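
The underlying bug class is language-independent: cutting a UTF-8 byte string at an arbitrary offset can land inside a multi-byte character. The Python sketch below shows one char-boundary-safe cut over raw bytes. It illustrates the technique only; `truncate_utf8` is a hypothetical name, not one of the Rust helpers listed above.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Truncate to at most max_bytes of UTF-8 without splitting a character."""
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return text
    cut = max(max_bytes, 0)  # guard against negative widths
    # Continuation bytes look like 0b10xxxxxx; step back to a lead byte.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut].decode("utf-8")
```

Sweeping every width from 0 to the byte length plus 2, as the tests mentioned above do, is a cheap way to cover all mid-codepoint positions.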
  • report-issue now embeds the crash log (GitHub #386 follow-up):
    the last 3 entries of <data_dir>/logs/crash.log (location, payload,
    truncated backtrace) ship with every report, so panic reports are
    actionable instead of arriving empty.
  • SIGABRT coredumps from the panic hook itself (GitHub #378): the
    process-wide panic hook used eprintln!, which panics on I/O errors —
    when a background worker's stderr was gone (terminal closed → EPIPE),
    any ordinary panic became a double panic and the runtime aborted the
    whole process (38 coredumps reported). The hook now writes its message
    best-effort (write_all, errors ignored) and wraps the crash-log write
    in catch_unwind; a panic can never escalate to SIGABRT through the
    hook anymore.
  • MCP token footprint: installers no longer force the full toolset
    (GitHub #385): every generated MCP config carried
    LEAN_CTX_FULL_TOOLS=1, advertising 69+ tool schemas (~15k tokens)
    to the client on every turn — lean-ctx showed up as one of the biggest
    token consumers in users' own usage breakdowns. New installs/refreshes
    now use the core toolset (13 tools + ctx_call/ctx_expand for
    on-demand access); opt back in via tool_profile = "power" in
    config.toml or LEAN_CTX_FULL_TOOLS=1 in the server env.
  • Pi: stale ~/.pi/agent/mcp.json entry defeated the embedded bridge
    (GitHub #361, found by the tokbench independent benchmark): Pi has no
    native MCP adapter, but init --agent pi wrote a lean-ctx mcp.json
    entry that older pi-lean-ctx versions read as "adapter configured" and
    disabled their embedded MCP bridge — the session cache silently never
    engaged. The installer no longer writes that entry anywhere
    (hooks path + editor-registry target + setup target all removed) and
    init --agent pi migrates existing configs by deleting the stale
    entry (file removed entirely when lean-ctx was its only content).
  • Uninstall: perfect-clean guarantee (GL #558, Discord report):
    lean-ctx uninstall now leaves zero artifacts behind. Backup sweep
    covers installer subdirectories (hooks/, rules/, skills/,
    steering/, VS Code User/, .gemini/antigravity-cli) and
    project-local CWD config dirs; lean-ctx-owned script backups and
    orphaned config backups are removed; {"hooks": {}, "version": 1}
    boilerplate shells are deleted instead of kept; now-empty installer
    directories are swept as the final filesystem step (non-empty dirs
    survive untouched); platform data dirs (~/Library/Application Support/lean-ctx, %LOCALAPPDATA%\lean-ctx, ~/.local/share/lean-ctx)
    are removed. Verified end-to-end: 8-agent install + proxy enable →
    uninstall → 0 lean-ctx references, 0 .bak files, 0 leftover dirs.
  • Claude rules file regression (GL #555 follow-up, GL #558):
    rules_inject still wrote the always-loaded
    ~/.claude/rules/lean-ctx.md on init --agent claude, undoing the
    token-footprint fix. Claude Code no longer gets a rules target — the
    CLAUDE.md block + on-demand skill carry the guidance.
  • Setup .bak churn (GL #558): re-running setup/init no longer
    rewrites identical hook scripts, so no backup files pile up for
    unchanged content.
  • Audit chain forked under concurrent processes (found via GL #425
    E2E): prev_hash came from a per-process cache, so two processes
    appending simultaneously both chained onto the same parent (and could
    interleave half-written lines). record() now takes an exclusive
    advisory file lock and reads the chain tail from the file itself;
    regression test runs 4 concurrent writers and demands one valid
    100-entry chain. The evidence generator additionally splits historic
    glued lines losslessly and refuses unparseable data inside an attested
    period.
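
The fix pattern (take an exclusive lock, then derive the parent hash from the file tail instead of a per-process cache) can be sketched in Python. Everything here is illustrative: the JSON-lines layout, the field names, the all-zero genesis hash, and the `record`/`verify` functions are assumptions, and `fcntl.flock` is Unix-only.

```python
import fcntl
import hashlib
import json

GENESIS = "0" * 64  # assumed parent hash for the first entry


def _entry_hash(prev: str, event: dict) -> str:
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev + body).encode("utf-8")).hexdigest()


def record(path: str, event: dict) -> None:
    """Append one chained entry while holding an exclusive advisory lock."""
    with open(path, "a+", encoding="utf-8") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            f.seek(0)
            lines = [ln for ln in f.read().splitlines() if ln.strip()]
            # Read the chain tail from the file itself, never from a cache.
            prev = json.loads(lines[-1])["hash"] if lines else GENESIS
            entry = {"prev_hash": prev, "event": event,
                     "hash": _entry_hash(prev, event)}
            f.write(json.dumps(entry, sort_keys=True) + "\n")
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


def verify(path: str) -> int:
    """Return the chain length, or raise ValueError at the first broken link."""
    prev, count = GENESIS, 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            entry = json.loads(line)
            if entry["prev_hash"] != prev or \
                    entry["hash"] != _entry_hash(prev, entry["event"]):
                raise ValueError(f"chain broken at entry {count}")
            prev, count = entry["hash"], count + 1
    return count
```

Holding the lock across both the tail read and the append is what prevents two writers from chaining onto the same parent.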
  • Claude Code: instruction footprint cut from ~12k to <500 tokens
    (GL #555): the ~/.claude/CLAUDE.md block imported the full ruleset via
    @rules/lean-ctx.md and the project AGENTS.md block via @LEAN-CTX.md.
    Claude Code expands @-imports inline at launch and loads every rules
    file without paths: frontmatter unconditionally — stacking the same
    ruleset up to three times per session (field reports: 12.3k tokens of
    memory files before the first message). The CLAUDE.md block is now
    self-contained (v3, no imports), the AGENTS.md block carries a 3-line
    inline mapping with a plain-text pointer, and the always-loaded
    lean-ctx-owned rules files (~/.claude/rules/lean-ctx.md, project
    .claude/rules/lean-ctx.md) are removed on update (marker-checked) —
    deep documentation lives in the on-demand lean-ctx skill.
  • Claude Code: compactions now actually reset the re-read cache
    (GL #555): every Claude hook payload carries session_id, so the generic
    session catch-all matched before the compaction check —
    hook_event_name: "PreCompact" was never recorded and
    sync_if_compacted() never reset full_content_delivered flags. After a
    host compaction, ctx_read kept answering [unchanged] stubs that
    pointed at evicted context, and agents recovered by switching to native
    Read for the rest of the session. PreCompact is now detected ahead of
    the catch-all (regression-tested with the real payload shape), so the
    first re-read after compaction delivers full content again.
  • Tool schemas hardened for strict validators (GL #545): 20 tool
    schemas (incl. ctx_expand) declared type: object + properties
    without an explicit required array — valid JSON Schema, but strict
    Pydantic-based backends (OpenAI, Azure, SGLang) reject it and OpenCode
    surfaces Invalid schema for function 'lean-ctx_ctx_expand': None is not of type 'array'. Every advertised schema (built-ins and plugin
    manifests) now passes normalize_for_strict_validators(): recursive
    explicit required: [] on object schemas and items on array schemas,
    at every nesting level. Regression gate:
    rust/tests/tool_schema_strictness.rs walks the whole registry.
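
As a rough illustration of the normalization described above, the Python sketch below adds an explicit empty `required` to object schemas and an `items` entry to array schemas at every nesting level. It is not the Rust `normalize_for_strict_validators()`: the set of keywords that are recursed into and the empty-schema default for `items` are assumptions.

```python
# Keywords whose values hold sub-schemas (assumed, non-exhaustive).
SUBSCHEMA_MAPS = ("properties", "patternProperties", "$defs", "definitions")
SUBSCHEMA_LISTS = ("anyOf", "oneOf", "allOf", "prefixItems")
SUBSCHEMA_SINGLE = ("items", "additionalProperties", "not")


def normalize_schema(schema):
    """Return a copy with explicit `required` / `items` at every level."""
    if not isinstance(schema, dict):
        return schema  # booleans such as additionalProperties: false pass through
    out = dict(schema)
    for key in SUBSCHEMA_MAPS:
        if isinstance(out.get(key), dict):
            out[key] = {name: normalize_schema(sub)
                        for name, sub in out[key].items()}
    for key in SUBSCHEMA_LISTS:
        if isinstance(out.get(key), list):
            out[key] = [normalize_schema(sub) for sub in out[key]]
    for key in SUBSCHEMA_SINGLE:
        if key in out:
            out[key] = normalize_schema(out[key])
    if out.get("type") == "object":
        out.setdefault("required", [])
    if out.get("type") == "array":
        out.setdefault("items", {})  # assumed default: the empty schema
    return out
```

Recursing only through schema-bearing keywords avoids rewriting data values such as `default` or `enum` entries that merely look like schemas.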
  • Windows: proxy/daemon survive AI-client MCP recycling (GL #545):
    the auto-started proxy and daemon were spawned as plain child processes.
    On Windows they inherit the parent's console and Job object; AI clients
    (OpenCode, Codex, Claude Code) run MCP servers inside kill-on-close Jobs,
    so recycling the MCP process silently killed the proxy mid-flight —
    observed as Cannot connect to API: The socket connection was closed unexpectedly, cold-start latency and agents falling back to native
    tools. Background spawns now use ipc::process::spawn_detached()
    (DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP | CREATE_BREAKAWAY_FROM_JOB, graceful fallback when the Job denies
    breakaway). No behaviour change on macOS/Linux.
  • Proxy history pruning defeated provider prompt caching (GL #534): the
    Anthropic/OpenAI proxy handlers summarized everything older than the last 6
    messages on every request. That rolling boundary rewrote a
    previously-stable message each turn, so the provider's prefix-matching
    prompt cache (Anthropic cache_control, OpenAI automatic caching) missed
    from that point on — users saw uncached input jump from ~2–10k to 80–100k+
    tokens per turn (cache writes at 1.25× instead of reads at 0.1×). History
    is now pruned at a frozen, cache-aware compaction boundary that only
    advances in deterministic 16-message strides (≥8 recent messages always
    intact): between jumps the request prefix stays byte-identical and the
    prompt cache keeps hitting; a jump costs one re-write, then caching resumes
    on the smaller history. Pruning is content-deterministic and preserves
    cache_control breakpoints; tool-result compression is prefix-stable and
    unchanged. New [proxy].history_mode config key /
    LEAN_CTX_PROXY_HISTORY_MODE env: cache-aware (default), rolling
    (legacy max-savings), off. Invariant locked by a byte-stability test
    simulating 80 growing turns.
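
One way to picture a frozen boundary that advances in fixed strides is the arithmetic below. The stride of 16 and the minimum of 8 recent messages come from the entry; the exact formula and the name `compaction_boundary` are assumptions and may not match the proxy's real rule.

```python
STRIDE = 16      # boundary advances in 16-message strides (from the entry)
MIN_RECENT = 8   # at least 8 recent messages always stay intact (from the entry)


def compaction_boundary(message_count: int) -> int:
    """Index of the first message kept verbatim; earlier ones are summarized.

    Assumed rule: the largest multiple of STRIDE that still leaves
    MIN_RECENT messages untouched.
    """
    if message_count <= MIN_RECENT:
        return 0
    return ((message_count - MIN_RECENT) // STRIDE) * STRIDE
```

Under this rule the boundary sits at 0 for up to 23 messages, jumps to 16 at 24 messages, and jumps again to 32 at 40, so the request prefix is rewritten once per jump instead of on every turn.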
  • ctx_edit evidence diff corrupted by terse post-processing (GH #382):
    the evidence (diff) block embeds verbatim source lines, but the generic
    terse stage still ran over ctx_edit output — dictionary abbreviation
    (return 0 → ret 0), blank-line stripping and line-score filtering
    silently dropped/mangled diff lines, making agents conclude a correct edit
    went wrong (the file on disk was always right). Two-layer fix: ctx_edit
    joins the read family in the terse exemption, and the terse pipeline itself
    is now fence-aware — content inside ``` / ~~~ fences passes through
    byte-exact while surrounding prose still compresses, protecting every
    current and future tool that embeds code blocks.
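
A fence-aware pass of the kind described above can be sketched in Python: lines inside a backtick or tilde fence are copied through untouched, and everything else goes to the compressor. The function name `terse_fence_aware` and the per-line compressor callback are assumptions for illustration, not the actual terse pipeline.

```python
from typing import Callable

# Fence markers built programmatically: three backticks or three tildes.
FENCES = ("`" * 3, "~" * 3)


def terse_fence_aware(text: str, compress_line: Callable[[str], str]) -> str:
    """Apply compress_line to prose lines only; fenced content stays verbatim."""
    out = []
    open_fence = None  # the marker that opened the current fence, if any
    for line in text.split("\n"):
        stripped = line.lstrip()
        if open_fence is None:
            marker = next((f for f in FENCES if stripped.startswith(f)), None)
            if marker is not None:
                open_fence = marker
                out.append(line)           # the opening fence line is kept as-is
            else:
                out.append(compress_line(line))
        else:
            out.append(line)               # inside a fence: byte-exact
            if stripped.startswith(open_fence):
                open_fence = None          # matching marker closes the fence
    return "\n".join(out)
```

With a toy compressor such as `lambda l: l.replace("return", "ret")`, prose lines are abbreviated while a `-    return 0` diff line inside a fence is preserved exactly.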
  • CI green again across all three OS runners: the billing-catalog golden
    fixture now normalizes CRLF before comparing (Windows autocrlf checkouts),
    the path_resolve CWD-independence test canonicalizes both sides before
    comparing (macOS /var symlink, Windows 8.3 short names), the
    team_billing module doc no longer intra-doc-links a private const
    (rustdoc -D warnings), and the six new org/cloud contract docs are
    classified (Experimental) in contract_docs(). The frozen
    team-server-contract-v1.md is restored byte-exact; its additive
    storageQuotaBytes/roiWebhookUrl keys moved to a new
    team-server-contract-v2.md (Stable), per the contract-file rule.
    Second wave: the CLI fidelity/pipe-guard integration tests pin
    LEAN_CTX_ALLOWLIST_WARN_ONLY=1 (they assert compression behavior, not
    enforcement — on CI stderr is no TTY, so the new agent-mode allowlist
    blocked their for/while test scripts with exit 126), the
    ISSUER_CACHE/ATTEMPTS statics are documented in LOCK_ORDERING.md
    (L45/L46), and docs/reference/generated/mcp-tools.md is regenerated
    for the ctx_agent brief/return actions and ctx_knowledge as_of.
  • Cockpit backlog triple (GL #454, #455, #456): the Routes view now
    understands axum — .route("/path", get(handler)) incl. chained methods
    (get(a).post(b)), qualified forms (axum::routing::post) and module-path
    handlers — plus hand-rolled "/api/…" => match routers, taking this
    codebase from 0 to 136 detected routes. The Call Graph starts framed: an
    initial zoom-to-fit runs once the force layout settles (manual pan/zoom is
    never overridden) and link opacity now fades with edge density, so 150-node
    graphs stop rendering as an over-zoomed hairball. And when token auth is on
    but the browser has none, the first 401 swaps the page for a single
    centered token prompt (validates against /api/health, stores in
    sessionStorage, reloads) instead of two dozen raw unauthorized cards.
  • Dashboard polish from the function audit (GL #478): the Explorer tree is
    now a real WAI-ARIA tree — role=tree/treeitem/group, aria-expanded,
    roving tabindex and full keyboard support (arrows expand/collapse/navigate,
    Enter/Space toggle, Home/End jump) with a visible focus ring. Search results
    stopped pretending: clicking a hit opens an inline file preview (±12 lines
    around the match, hit line highlighted) served by the existing
    compression-demo endpoint, with full keyboard access. Procedures now
    auto-learn: every recorded episode re-runs workflow detection
    (procedural_memory::auto_detect_from_episodes), so recurring tool
    sequences appear on the Memory page without anyone calling detect by
    hand. The status-bar daemon indicator finally explains itself — the tooltip
    describes what green/red means and how to recover (lean-ctx serve -d).
  • Data truthfulness (GL #479): the dashboard now tells the whole story
    behind its savings numbers. The verified ledger covers measured shell and
    search compression (cli_shell, ctx_shell, ctx_search events with raw,
    unmultiplied baselines) instead of only ctx_read — closing the unexplained
    24x gap between Home and the ROI view. The 2.5x native-grep counterfactual
    used by the estimated stats is now a documented, named constant
    (NATIVE_SEARCH_BASELINE_FACTOR), surfaced in the Home tooltips and in a
    new "Methodology: verified vs. estimated" card on the ROI view. Inferred
    agent activity no longer shows negative ages on UTC+N machines (event
    timestamps are local wall-clock and are now interpreted as such).
  • No more WARN noise when scanning project subdirectories (P1, GL #438):
    graph_index now walks ancestors for project markers, so repo/rust/src
    inside ~/Documents is a legitimate scan root (the .git lives two levels
    up). Marker-less trees under blocked home dirs stay refused.
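    The ancestor walk can be pictured with a minimal sketch like the one
    below. It assumes a small, hypothetical marker list and a hypothetical
    find_project_root helper; the markers graph_index actually checks are
    not listed in these notes.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical marker list; the real graph_index may look for others.
const PROJECT_MARKERS: &[&str] = &[".git", "Cargo.toml", "package.json"];

/// Returns the nearest directory (starting at `start` itself and moving
/// upward) that contains at least one project marker.
fn find_project_root(start: &Path) -> Option<PathBuf> {
    start
        .ancestors()
        .find(|dir| PROJECT_MARKERS.iter().any(|m| dir.join(m).exists()))
        .map(Path::to_path_buf)
}
```

    With a layout such as repo/.git plus repo/rust/src, calling the helper
    on repo/rust/src yields repo, which is why the subdirectory counts as a
    legitimate scan root.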
  • Windows symlink parity at every security boundary (P1, GL #442):
    pathjail, ctx_edit, config_io and read_file_nofollow now reject NTFS
    junctions and all other reparse points (not just symlinks) via the shared
    pathutil::is_symlink_or_reparse check; non-Unix read_file_nofollow
    previously followed links without any check.
  • Stale cache stubs can no longer mislead the agent (P0-7, GL #419):
    staleness now treats any mtime change as stale (backward mtimes from
    git checkout previously read as fresh) and verifies the content hash before
    serving an [unchanged] stub when the mtime claims no change (same-second
    writes, restored timestamps). Opt out: LEAN_CTX_CACHE_VERIFY=0.
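    A minimal sketch of the two-step staleness rule described above. The
    CacheEntry type, the is_stale signature and the use of the standard
    library's DefaultHasher are assumptions made for illustration; the hash
    function lean-ctx really uses is not stated here.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::time::SystemTime;

/// Hypothetical cache record: what was known when the file was last served.
struct CacheEntry {
    mtime: SystemTime,
    content_hash: u64,
}

/// Stand-in content hash (assumption: any stable hash works for the sketch).
fn hash_bytes(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// `verify` mirrors the LEAN_CTX_CACHE_VERIFY switch (false = opted out).
fn is_stale(entry: &CacheEntry, current_mtime: SystemTime, current: &[u8], verify: bool) -> bool {
    // Step 1: any mtime difference, forward or backward, means stale.
    if current_mtime != entry.mtime {
        return true;
    }
    // Step 2: the mtime claims "no change", so confirm via the content hash.
    verify && hash_bytes(current) != entry.content_hash
}
```

    The second step is what catches same-second writes and restored
    timestamps, at the cost of hashing the file before an [unchanged] stub
    is served.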
  • Panics are now diagnosable after the fact (P0-8, GL #420, upstream #378):
    every panic appends thread, location, payload and backtrace to
    ~/.lean-ctx/logs/crash.log (0o600, size-rotated) — stderr-only reporting was
    lost for daemon/LaunchAgent/MCP-child processes.
  • Copilot CLI hooks work on Windows (#381): the generated hook entries
    carried only a bash command — but Copilot CLI runs the powershell field on
    Windows, so the hooks had no runnable command there, errored, and made the CLI
    reject every tool call. Entries now carry both fields, each with a quoted
    binary path (bash gets the MSYS-style conversion; powershell uses the call
    operator — Windows install paths routinely contain spaces). Also, global hooks
    were written to ~/.github/hooks/hooks.json, a location Copilot never reads:
    they now go to the documented user-level ~/.copilot/hooks/hooks.json
    (honoring COPILOT_HOME), existing pre-#381 configs are upgraded in place
    (missing-powershell detection), and lean-ctx entries are migrated out of the
    stale legacy file (deleted when it was ours alone, foreign hooks preserved).
  • Dashboard "ROI & Plan" view is live, not a frozen snapshot (user-reported):
    the view fetched /api/roi exactly once per navigation — the cockpit's 10 s
    poll only refreshed the status footer, and the lctx:refresh event was only
    dispatched by the manual ↻ button. Sitting next to the live-updating footer,
    the static ROI numbers looked broken. The ROI view now re-fetches on the same
    10 s cadence while it is the active view, flicker-free (the "Loading…"
    placeholder renders only before the first payload; background refreshes swap
    content in place, guarded against overlapping fetches) and shows a muted
    "Updated HH:MM:SS · auto-refreshes every 10 s" line so liveness is visible.
    Drive-by: the Commander view's lctx:refresh listener was the only one
    without an active-view guard (and was never removed on disconnect) — it now
    follows the standard guarded pattern.
  • proxy enable no longer breaks Claude Pro/Max subscriptions (community-reported): the proxy forwards the caller's credential upstream but never injects one, so it can only compress Claude traffic in API-key (pay-as-you-go) mode. A Claude Pro/Max subscription authenticates via OAuth directly against api.anthropic.com, and that token is rejected by any custom ANTHROPIC_BASE_URL — so unconditionally pointing ~/.claude/settings.json (and the shell ANTHROPIC_BASE_URL export) at the local proxy produced a login loop / 401 the moment Claude Code started, while OpenAI-compatible backends (Ollama, Codex) kept working. proxy enable now detects whether an Anthropic API key is available (ANTHROPIC_API_KEY/ANTHROPIC_AUTH_TOKEN in the environment, or an apiKeyHelper/key in ~/.claude/settings.json) via anthropic_api_key_available() and, when none is found, skips the Claude redirect (leaving Claude Code on Anthropic directly), omits the ANTHROPIC_BASE_URL shell export (OpenAI/Gemini exports are unaffected), and repairs any pre-existing stale local redirect. It prints a clear explanation and points subscription users to the ctx_* MCP tools for savings; --force overrides for keys stored where we can't probe (e.g. a keychain). lean-ctx doctor gained a check that flags an enabled proxy still routing Claude through the proxy without an API key, with the exact fix (proxy disable, or export a key + re-enable). Documented in docs/reference/05-advanced.md.
  • Shell-output redirected to a file is always byte-faithful — compression never corrupts cmd > out: when compression was forced (the agent shell hook runs lean-ctx -c, and the hook deliberately bypasses its own [ ! -t 1 ] pipe guard for agents), the compressed digest was written into a real file on a redirect — so git status --short > files.txt, git diff > patch.txt, cmd >> log, etc. landed an abbreviated/deduplicated summary instead of the exact bytes, producing contradictory diffs and silently dropped lines for any downstream tool that re-read the file. exec() now detects when stdout is a regular file (fstat/handle metadata via std, no new deps) and passes the output through verbatim even under LEAN_CTX_COMPRESS/-c. This is enforced at the single exec choke point, so it holds for every caller (shell hook, direct CLI, Pi/MCP bridges) and every redirect form; pipes (an agent's captured stdout) and TTYs are unaffected and keep compressing. Regression-tested both ways: a redirect-to-file is byte-identical to the raw command while the same command + env stays compressed when piped.
  • Agent hooks always use an absolute binary path (#367): generated hook commands (Codex, Cursor, Claude, Gemini, Antigravity, …) emitted a bare lean-ctx, which fails with exit 127 when the host runs the hook under a non-login shell whose PATH lacks the install dir. resolve_binary_path() now always resolves to the absolute path (matching MCP setup / doctor); stale bare-command configs are rewritten on the next init / doctor.
  • Proxy forwards the OpenAI-Project header (#366): project-scoped OpenAI keys carry their scope via OpenAI-Project (sent by OpenCode and the OpenAI SDK on the Responses API). The proxy's request-header whitelist dropped it, so the upstream rejected the call with Missing scopes: api.responses.write. openai-project (and openai-organization) are now forwarded verbatim.
  • gemini setup installs the Antigravity CLI plugin hooks (#284): lean-ctx init --agent gemini configured the Antigravity CLI MCP target but never wrote its plugin hooks, so hooks landed only in the legacy ~/.gemini/settings.json that agy ignores. The gemini path now also installs the agy plugin (~/.gemini/config/plugins/lean-ctx); auto-detect already covers the standalone antigravity-cli target.
  • The Antigravity CLI plugin is a self-contained, spec-"compliant" bundle (#284): the agy plugin lean-ctx writes (~/.gemini/config/plugins/lean-ctx/) now ships its own mcp_config.json next to plugin.json + hooks/hooks.json, so the ctx_* tools travel with the plugin and it validates clean under agy plugin validate (✔ mcpServers, ✔ hooks). This was verified against the real agy binary, which stages plugins to exactly this path and shape via agy plugin install — i.e. the reporter's documented ~/.gemini/antigravity-cli/plugins/<name>/ + root hooks.json layout is what the docs say, but agy v1.0.x actually reads ~/.gemini/config/plugins/<name>/ with hooks/hooks.json (the doc's own "global plugins" section agrees). The profile copy (~/.gemini/antigravity-cli/mcp_config.json) is kept for back-compat; agy keys MCP servers by name, so the dual definition is harmless. Root-cause note for the "hooks still not firing" reports: hook execution in agy is gated by its own server-side feature flag enable_json_hooks (a proto field applied via applyFeatureProviderJSONHooksConfig; experiment json-hooks-enabled) and cannot be forced from a local ~/.gemini/config/config.json (verified). lean-ctx therefore installs the hooks in the precise location/format agy expects and they light up automatically once that flag reaches the account — note agy -p print mode bypasses the hook subsystem entirely (hooks run in interactive sessions only). lean-ctx doctor integrations now verifies the full bundle (plugin.json + hooks/hooks.json + plugin-local mcp_config.json) so install and doctor stay in lockstep and doctor --fix repairs any drift.
  • CEP meter counts cache hits and sessions for long-lived servers (#361): cep.sessions and total_cache_hits could stay 0 even with confirmed cache activity — the meter only recorded on an auto_checkpoint that a short workload may never reach, and repeated snapshots within one process dropped the cumulative cache-hit/read delta (only the first snapshot's value was kept). CEP is now recorded on the live-stats cadence (so even brief sessions register) and accumulates per-snapshot deltas, so lean-ctx gain reflects real cache savings.
  • Pi: no envelope overhead on tiny reads (#361): a ctx_read of a very small file appended a "Compressed N → N tokens (0%)" footer even when nothing was saved, making the payload larger than the source. The footer is now suppressed when there is no actual saving (compression stats are still recorded for telemetry); cached re-reads and genuinely compressed reads keep their footer.
  • ctx_smells dead-code no longer flags instantiated classes (#365): added an end-to-end regression test (build graph → scan) confirming imported-and-instantiated Python classes are not reported as dead code while a never-referenced class still is — locking in the symbol-level call/import edges the graph builder creates.
  • ctx_read is byte-faithful — the terse layer no longer mangles file reads (reported via a community A/B code-review evaluation): the server's post-dispatch terse stage (prose dictionary return → ret, string → str, … plus line-score filtering) was skipped for reads only when the read had already saved tokens. A verbatim mode="full" (or lines:) read saves 0 tokens, so it was silently routed through the prose compressor — abbreviating keywords and dropping repeated lines. This violated the full contract ("guaranteed complete content"), corrupted source the agent edits against, and could drop the exact cross-file lines needed for data-flow review. skip_terse now skips the whole read family (ctx_read, ctx_multi_read, ctx_smart_read, ctx_compress, ctx_overview) unconditionally; reads keep only their own mode-aware, structure-preserving compression (map/signatures/aggressive).
  • An explicit read always returns content, never a stored-reference stub (same report): the ephemeral context firewall already exempts file reads, but the opt-in reference_results path did not — enabling it turned a large ctx_read into an [Reference: …] Output stored … preview the agent could not edit against. A single firewall::is_protected_read predicate is now the source of truth for "an explicit read returns content," honoured by both the firewall and the reference-results path, so ctx_read/ctx_multi_read/ctx_smart_read are never stubbed regardless of config.
  • Generated artifacts always reference the running build — autostart / MCP / hooks can't diverge (#2444): resolve_portable_binary() (which backs the daemon + proxy autostart plists, the daemon spawn, the MCP server command, agent + shell hooks, and the update scheduler) resolved which lean-ctx first, so the baked path depended on ambient PATH ordering at generation time. On a machine with both a Homebrew and a ~/.local/bin install this was non-deterministic — the daemon LaunchAgent captured the stale Homebrew copy while the proxy/MCP config captured ~/.local/bin, silently running two different builds at once. The decision is now a pure, unit-tested choose_binary_path() that prefers the currently-running executable (current_exe()), falling back to PATH only when the running binary lives in a transient Cargo build dir (cargo run -- setup, where the installed copy is the intended target). Keeps generated hook commands absolute (#367).
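    The decision described above can be sketched as a pure function. The
    signature below and the is_cargo_build_dir heuristic (a target component
    followed by debug or release) are assumptions for illustration, not the
    actual choose_binary_path() implementation.

```rust
use std::path::{Path, PathBuf};

/// Assumption: a transient Cargo build dir is any path that contains a
/// `target` component directly followed by `debug` or `release`.
fn is_cargo_build_dir(exe: &Path) -> bool {
    let parts: Vec<String> = exe
        .components()
        .map(|c| c.as_os_str().to_string_lossy().into_owned())
        .collect();
    parts
        .windows(2)
        .any(|w| w[0] == "target" && (w[1] == "debug" || w[1] == "release"))
}

/// Prefers the running executable; falls back to the PATH lookup only when
/// the running binary lives in a Cargo build dir or cannot be determined.
fn choose_binary_path(current_exe: Option<&Path>, path_lookup: Option<&Path>) -> Option<PathBuf> {
    match (current_exe, path_lookup) {
        // `cargo run -- setup`: the installed copy is the intended target.
        (Some(exe), Some(installed)) if is_cargo_build_dir(exe) => Some(installed.to_path_buf()),
        (Some(exe), _) => Some(exe.to_path_buf()),
        (None, found) => found.map(Path::to_path_buf),
    }
}
```

    Because both inputs are passed in rather than read from the environment,
    the function stays deterministic and easy to unit-test, which is the
    property the fix relies on.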
  • MCP server can no longer go dark — every tool handler runs under a watchdog (#271): the recurring TypeError: Cannot read properties of undefined (reading 'invoke') was the client losing its tool handles after the server stopped replying. Root cause: handlers were dispatched via tokio::task::block_in_place, which pins one of the few core async workers and — being synchronous — cannot be interrupted by a tokio::time::timeout on the same task, so a handler that blocked (e.g. the nested block_in_place inside ctx_multi_read exhausting the blocking pool under concurrent reads) silently swallowed the JSON-RPC response. Every handler now runs on the dedicated blocking pool via spawn_blocking, awaited under a watchdog deadline (LEAN_CTX_TOOL_TIMEOUT_SECS, default 120s; ctx_shell/ctx_execute exempt): core workers stay free for the stdio loop and on timeout/panic the server returns a clean error instead of dropping the reply. The specific nested block_in_place in ctx_multi_read is also removed at the source (now bounded_lock + panic guard). Covered by a 16-way concurrency stress test through the full dispatch path plus timeout/panic unit tests.
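    The watchdog idea can be illustrated with a std-only analogue: run the
    handler on a separate thread and wait for its result under a deadline.
    This sketch uses std::thread and an mpsc channel instead of tokio's
    spawn_blocking and timeout, and run_with_watchdog and ToolError are
    hypothetical names.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum ToolError {
    Timeout,
    Panicked,
}

/// Runs `handler` off the calling thread and waits at most `deadline`.
fn run_with_watchdog<T, F>(deadline: Duration, handler: F) -> Result<T, ToolError>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // If the handler panics, `tx` is dropped without sending anything.
        let _ = tx.send(handler());
    });
    match rx.recv_timeout(deadline) {
        Ok(value) => Ok(value),
        Err(mpsc::RecvTimeoutError::Timeout) => Err(ToolError::Timeout),
        // A disconnected channel means the worker died before replying.
        Err(mpsc::RecvTimeoutError::Disconnected) => Err(ToolError::Panicked),
    }
}
```

    As with any deadline around blocking work, a timed-out handler keeps
    running in the background; the point is that the caller always gets a
    reply and can return a clean error.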
  • SIGABRT crash in the background indexer — deep ASTs no longer overflow the stack (#378): graph indexing aborted the whole daemon on files with deeply nested syntax (machine-generated source, deep C/C++ headers, long call chains). The release profile is panic = "unwind", so a worker panic can't SIGABRT — the crash was a stack overflow, whose handler calls abort() and which catch_unwind cannot intercept. Every tree-sitter AST walk recursed once per node depth on a ~2 MiB worker stack. New core::ast_walk provides iterative, heap-stack pre-order traversal (for_each_descendant, for_each_descendant_pruned, find_descendant_by_kind) — depth is now bounded by the heap, not the call stack, with identical pre-order semantics; every recursive walk on the indexing path (deep_queries, cyclomatic, swift signature params) was converted. Defense-in-depth: the indexer runs on a named leanctx-index thread with a 16 MiB stack + graceful spawn-failure handling, and ModeGuard::drop is now panic-free (try_borrow_mut) to remove a latent double-panic → abort path. Guarded by 20k-deep and 12k-deep overflow regression tests.
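    A minimal sketch of iterative pre-order traversal with an explicit heap
    stack. The arena-style Tree type is a toy stand-in for a tree-sitter AST
    and walk_preorder is a hypothetical name; the real core::ast_walk
    signatures are not shown in these notes.

```rust
/// Toy arena tree: children[i] holds the child node ids of node i.
struct Tree {
    children: Vec<Vec<usize>>,
}

/// Visits `root` and every descendant in pre-order without recursion.
fn walk_preorder(tree: &Tree, root: usize, mut visit: impl FnMut(usize)) {
    // The pending-node stack lives on the heap, so depth is limited by
    // available memory rather than by the thread's call stack.
    let mut stack = vec![root];
    while let Some(node) = stack.pop() {
        visit(node);
        // Push children in reverse so the first child is visited first.
        stack.extend(tree.children[node].iter().rev().copied());
    }
}
```

    Pushing children in reverse order is what preserves the same visit order
    a recursive walk would produce.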
  • ctx_read no longer panics on UTF-8 files with multibyte characters (#379): the structural-hint and shell-result extractors in core::auto_findings truncated labels with raw byte slices (&s[..s.len().min(N)]), so a cut that landed inside a multibyte codepoint (e.g. a Cyrillic #//// comment near byte 70) panicked with "byte index N is not a char boundary" — surfacing to the MCP client as a -32603 error and an empty read. All nine truncation sites now use str::floor_char_boundary, which snaps the cut down to a valid boundary while preserving the byte budget. Guarded by multibyte regression tests across every layer (content hint, failed-command/test-result shell paths, and the dedup key).
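    The boundary-snapping behaviour can be sketched with stable std APIs
    only. floor_boundary below is a hand-written, hypothetical helper that
    mimics what str::floor_char_boundary does, and truncate_label is an
    illustrative name rather than a function from core::auto_findings.

```rust
/// Largest index <= `max` that lies on a char boundary of `s`.
fn floor_boundary(s: &str, max: usize) -> usize {
    let mut idx = max.min(s.len());
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(idx) {
        idx -= 1;
    }
    idx
}

/// Truncates `s` to at most `max_bytes` bytes without splitting a codepoint.
fn truncate_label(s: &str, max_bytes: usize) -> &str {
    &s[..floor_boundary(s, max_bytes)]
}
```

    A raw slice such as &s[..3] on "# Привет" would panic because byte 3
    falls inside the two-byte "П"; the helper snaps the cut down to byte 2
    instead, so the result never exceeds the byte budget.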

Security

  • Dashboard: attribute-safe HTML escaping everywhere (CodeQL #61–#65):
    the central LctxFmt.esc used a textContent/innerHTML round-trip that
    escapes &<> but not quotes, and cexpEsc in the explorer did the same —
    a " in a file path, symbol name or knowledge value could break out of
    title="…" / aria-label="…" attributes (DOM XSS). All escape helpers
    (central + every per-component fallback, 35 sites across 15 files) now
    escape & < > " ' via numeric entities; the dangerous identity fallbacks
    (F.esc || String) are gone. Verified by a functional breakout test.
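    The dashboard helpers themselves are JavaScript; the sketch below only
    restates the escaping rule in Rust, for consistency with the rest of the
    codebase. escape_attr is a hypothetical name.

```rust
/// Escapes the five characters that matter in both element and attribute
/// context, using numeric character references.
fn escape_attr(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '&' | '<' | '>' | '"' | '\'' => out.push_str(&format!("&#{};", ch as u32)),
            _ => out.push(ch),
        }
    }
    out
}
```

    With quotes escaped, a value such as a" onmouseover="x can no longer
    terminate a title="…" attribute early.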
  • CLI shell allowlist is now enforced for agents (P0-1, GL #413):
    lean-ctx -c blocks allowlist violations (exit 126) whenever the caller is
    non-interactive (stderr is not a TTY) or in hook-child mode — the CLI path is
    no longer weaker than the MCP path. Humans at a terminal keep the warn-only
    behavior; LEAN_CTX_ALLOWLIST_WARN_ONLY=1 is the explicit opt-out. The block
    message explains the one-line fix (lean-ctx allow <cmd>).
  • Cloud credentials are written 0o600, atomically (P0-2, GL #414):
    ~/.lean-ctx/cloud/credentials.json is created owner-only (dir 0o700) via
    tmp+rename; pre-existing world-readable files are tightened on load.
  • Deterministic path resolution (P0-3, GL #415): relative tool paths are
    never resolved against the process CWD anymore (daemon CWD ≠ project);
    resolution is strictly project_root → shell_cwd → jail_root.
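    One way to read that order is "use the first base that is known". The
    sketch below follows that reading; the ResolveCtx type and
    resolve_tool_path are hypothetical, and the assumption that project_root
    and shell_cwd may be absent is mine.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical resolution context; only jail_root is assumed to always exist.
struct ResolveCtx {
    project_root: Option<PathBuf>,
    shell_cwd: Option<PathBuf>,
    jail_root: PathBuf,
}

/// Resolves a relative tool path without ever consulting the process CWD.
fn resolve_tool_path(ctx: &ResolveCtx, input: &Path) -> PathBuf {
    if input.is_absolute() {
        return input.to_path_buf();
    }
    // Fixed precedence: project_root, then shell_cwd, then jail_root.
    let base = ctx
        .project_root
        .as_deref()
        .or(ctx.shell_cwd.as_deref())
        .unwrap_or(ctx.jail_root.as_path());
    base.join(input)
}
```

    Since std::env::current_dir() never appears, the result is the same
    whether the call is served by the daemon or by a CLI process started
    inside the project.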
  • Proxy can no longer start unauthenticated (P0-4, GL #416):
    start_proxy_with_token(None) now auto-resolves the session token instead of
    disabling auth. Provider routes still accept provider API keys, so IDE
    clients need no setup.
  • Postgres provider validates schema identifiers (P0-5, GL #417): the
    agent-controlled schema param is restricted to [A-Za-z_][A-Za-z0-9_$]*
    (max 63 chars) before SQL interpolation — closes an injection vector.
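    The identifier rule is small enough to sketch without a regex crate.
    is_valid_schema_ident is a hypothetical name; the pattern and the
    63-character limit are taken from the entry above.

```rust
/// Accepts only names matching [A-Za-z_][A-Za-z0-9_$]* with length <= 63.
fn is_valid_schema_ident(name: &str) -> bool {
    let mut chars = name.chars();
    // First character: ASCII letter or underscore (rejects the empty string).
    let first_ok = matches!(chars.next(), Some(c) if c.is_ascii_alphabetic() || c == '_');
    first_ok
        && name.len() <= 63
        // Remaining characters: ASCII letters, digits, underscore or dollar.
        && chars.all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '$')
}
```

    Anything containing whitespace, quotes, semicolons or dots fails the
    check, so the value is safe to interpolate as a bare schema name.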
  • ctx_edit rejects symlinks (P0-6, GL #418): reads open with O_NOFOLLOW
    (plus an lstat pre-check on all platforms) and writes refuse symlink
    destinations — closes a TOCTOU window where a link planted inside the jail
    could read or overwrite files outside it.
  • Cloud/infra CLIs removed from the default shell allowlist (P0-9, GL #421):
    terraform, ansible, kubectl, helm, az, aws, gcloud, firebase, heroku, vercel,
    netlify, fly, wrangler, pulumi now require explicit opt-in
    (lean-ctx allow <cmd>) — they mutate remote infrastructure with ambient
    credentials. Dev-essential tools (git, cargo, rm, psql, …) are unchanged.
  • Home-level IDE config dirs are jail-opt-in (P0-10, GL #422): ~/.cursor,
    ~/.claude & co. are no longer automatically reachable through the PathJail
    (they expose foreign projects' sessions, MCP configs and tokens). Opt in via
    allow_ide_config_dirs = true or LEAN_CTX_ALLOW_IDE_DIRS=1; ~/.lean-ctx
    stays allowed.

Upgrade

lean-ctx update                 # recommended (auto-downloads + refreshes shell hooks)
cargo install lean-ctx          # or
npm update -g lean-ctx-bin      # or
brew upgrade lean-ctx

Note: After upgrading via cargo/npm/brew, run lean-ctx setup to refresh shell aliases. lean-ctx update does this automatically.

Full Changelog: v3.8.0...v3.8.0
