The Web & Research release. lean-ctx reaches beyond the codebase: the new
`ctx_url_read` tool pulls web pages, PDFs and YouTube videos into context as
compressed, citation-backed text — research, docs and transcripts without
leaving the agent loop. Alongside it ship three field-reported fixes: background
scans never hydrate cloud placeholders (#363), the proxy stops 401-ing
OpenAI-compatible provider keys (#362), and the Pi extension's session cache
finally engages (#361).
Added
- `ctx_url_read`: the web & research layer (the web counterpart of `ctx_read`). Fetch a public web page, PDF, or YouTube video and get back compressed, citation-backed context. HTML pages and PDFs are parsed to clean Markdown/text; a YouTube URL is resolved to its transcript and flattened into compact, quotable text. Seven distillation modes (`auto|markdown|text|links|facts|quotes|transcript`): the `facts` and `quotes` modes return discrete claims, each carrying a confidence score and the source URL it came from, so web research is auditable. Extractive, relevance-ranked research compression distils a whole page down to a token budget (`max_tokens`, default 6000; `max_items` caps `facts`/`quotes`, default 12), and an optional `query` focuses extraction on what you actually need. Fetching is SSRF-guarded: only `http`/`https`, with private, loopback and link-local addresses blocked and revalidated after every redirect. Ships with the binary and is exposed automatically wherever lean-ctx runs as an MCP server (granular tool surface → 69).
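The SSRF guard described above can be sketched as follows. This is a minimal illustration in Python, not lean-ctx's actual implementation: the function names (`is_blocked_address`, `check_url`, `validate_redirect_chain`), the injectable `resolve` callback and the `max_hops` limit are all assumptions made for the example. It shows the two ideas from the text: allow only `http`/`https`, and re-run the address check on every redirect hop.

```python
import ipaddress
import socket
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}


def is_blocked_address(ip_text):
    """Return True for private, loopback, or link-local addresses."""
    ip = ipaddress.ip_address(ip_text)
    return ip.is_private or ip.is_loopback or ip.is_link_local


def _dns_resolve(host):
    """Default resolver: every address the host name currently maps to."""
    infos = socket.getaddrinfo(host, None)
    return sorted({info[4][0] for info in infos})


def check_url(url, resolve=None):
    """Raise ValueError if the URL must not be fetched.

    `resolve` (hypothetical hook) maps a host name to a list of IP strings,
    so the check can be exercised without real DNS lookups.
    """
    parts = urlparse(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URL has no host")
    resolver = resolve or _dns_resolve
    for ip_text in resolver(parts.hostname):
        if is_blocked_address(ip_text):
            raise ValueError(f"blocked address: {ip_text}")


def validate_redirect_chain(start_url, next_location, resolve=None, max_hops=5):
    """Re-validate every hop of a redirect chain.

    `next_location(url)` (hypothetical hook) returns the redirect target,
    or None when the response is final. Returns the final URL.
    """
    url = start_url
    for _ in range(max_hops + 1):
        check_url(url, resolve)  # the check runs again after each redirect
        target = next_location(url)
        if target is None:
            return url
        url = target
    raise ValueError("too many redirects")
```

Checking only the first URL would let a public page redirect the fetcher to an internal address, which is why the sketch validates inside the loop. A production fetcher would presumably also connect to the exact address it validated, so that a second DNS lookup cannot return something different.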
Changed
- Pi: the embedded MCP bridge is on by default, and every read is cached through it (#361): the bridge that holds the persistent session cache was opt-in, and even when connected only a plain `ctx_read` was routed through it. Line-range reads (`offset`/`limit`) and the `grep`/`ls`/`find` tools always spawned a fresh one-shot CLI, so the ~13-token cached re-read essentially never happened on Pi (an independent benchmark measured `cep.sessions: 0` even with the bridge connected). The bridge now starts by default (opt out with `LEAN_CTX_PI_ENABLE_MCP=0` / `"enableMcp": false`), and all `ctx_read` variants, including `lines:N-M` ranges, route through it with a CLI fallback, so unchanged re-reads are cheap and register as real CEP sessions. The #168 steering ("Prefer over native …") is now also carried by the Pi extension's own tool descriptions, and `PI_AGENTS.md` plus the setup output steer agents to the `ctx_*` tools instead of the uncompressed native `read`/`bash`/`grep` (which are not routed through lean-ctx in additive mode).
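The "bridge first, CLI fallback" routing can be pictured with a small sketch. It is illustrative only: `read_with_fallback`, `bridge_read` and `cli_read` are hypothetical names, and the real Pi extension is not written in Python. The point is that every read variant, ranged or not, takes the same path through the persistent bridge, and the one-shot CLI is used only when the bridge is unavailable.

```python
def read_with_fallback(path, bridge_read, cli_read, lines=None):
    """Route a read through the persistent bridge, else a one-shot CLI.

    `bridge_read` and `cli_read` are hypothetical callables taking
    (path, lines); `bridge_read` may be None when the bridge is disabled.
    Returns (content, source) so callers can see which path served it.
    """
    if bridge_read is not None:
        try:
            return bridge_read(path, lines), "bridge"
        except (ConnectionError, TimeoutError):
            pass  # bridge unreachable: fall through to the CLI
    return cli_read(path, lines), "cli"


def demo():
    """Exercise both paths with stand-in callables."""
    def healthy_bridge(path, lines):
        return f"cached:{path}:{lines}"

    def broken_bridge(path, lines):
        raise ConnectionError("bridge not running")

    def cli(path, lines):
        return f"fresh:{path}:{lines}"

    return [
        read_with_fallback("src/main.rs", healthy_bridge, cli, lines="10-20"),
        read_with_fallback("src/main.rs", broken_bridge, cli, lines="10-20"),
        read_with_fallback("src/main.rs", None, cli),
    ]
```

Because ranged reads go through the same function as plain reads, they reach the session cache held by the bridge instead of bypassing it.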
Fixed
- Background scans never hydrate cloud placeholders (OneDrive / iCloud) (#363): starting an agent in, or above, a cloud-synced folder made lean-ctx's directory walks read every file to index it, forcing OneDrive "Files On-Demand" (and iCloud "dataless" files) to download. That is slow, burns quota, and pops OneDrive sync warnings. A new metadata-only `core::cloud_files` check (Windows `FILE_ATTRIBUTE_OFFLINE` / `RECALL_ON_OPEN` / `RECALL_ON_DATA_ACCESS`, macOS `SF_DATALESS`) is now a `filter_entry` predicate on every walker (resident search index, `ctx_search`, graph, BM25, `ctx_tree`), so a placeholder file or folder is pruned before it is ever opened. Detection reads attributes only and never triggers a download. The resident search index also gained the `is_safe_scan_root` guard the graph/BM25 builders already had (so it never auto-indexes `$HOME`), and the common cloud roots (`OneDrive`, `Dropbox`, `Google Drive`) are blocked as scan roots.
- Proxy stops 401-ing OpenCode's OpenAI-compatible provider keys (#362): the proxy's loopback auth gate only accepted `Authorization: Bearer sk-…` / `gsk_…` as a provider credential, so OpenCode (`@ai-sdk/openai`) pointed at an OpenAI-compatible upstream (Azure, OpenRouter, Groq, a local vLLM/Ollama gateway, or a project/service key) was rejected with `401 Unauthorized — lean-ctx proxy requires authentication`, even though #353 had already fixed the bare-`/responses` routing. On a provider route the gate now accepts any non-empty credential (the proxy binds to loopback only and forwards the header verbatim, so the real upstream still validates the key); a missing, empty, or bare-scheme `Authorization` is still rejected.
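The placeholder pruning from #363 can be sketched in a few lines. This is an illustration in Python rather than the `core::cloud_files` code itself: `is_cloud_placeholder` and `walk_resident` are hypothetical names, and the attribute values are the ones published in the Windows SDK and macOS `sys/stat.h` headers. The sketch only inspects `stat` metadata and skips a placeholder before opening it or descending into it.

```python
import os

# Windows file attributes (winnt.h)
FILE_ATTRIBUTE_OFFLINE = 0x00001000
FILE_ATTRIBUTE_RECALL_ON_OPEN = 0x00040000
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS = 0x00400000
WINDOWS_PLACEHOLDER_MASK = (
    FILE_ATTRIBUTE_OFFLINE
    | FILE_ATTRIBUTE_RECALL_ON_OPEN
    | FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS
)

# macOS file flag (sys/stat.h)
SF_DATALESS = 0x40000000


def is_cloud_placeholder(st):
    """Metadata-only test on a stat result; never reads file contents."""
    attrs = getattr(st, "st_file_attributes", 0)  # present on Windows only
    flags = getattr(st, "st_flags", 0)            # present on macOS/BSD only
    return bool(attrs & WINDOWS_PLACEHOLDER_MASK) or bool(flags & SF_DATALESS)


def walk_resident(root):
    """Yield paths of locally resident files under `root`.

    Placeholder files and folders are pruned before they are opened
    or entered, mirroring a filter predicate on a directory walker.
    """
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                st = entry.stat(follow_symlinks=False)
                if is_cloud_placeholder(st):
                    continue  # pruned: not opened, not descended into
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                else:
                    yield entry.path
```

Applying the check while walking, rather than after collecting paths, is what keeps a placeholder directory from being listed at all.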
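The relaxed auth gate from #362 comes down to one rule: on a provider route, any non-empty credential passes, while a missing, empty, or bare-scheme header does not. The following Python sketch shows that rule under stated assumptions: `has_provider_credential` is a hypothetical name, and treating a single token with no scheme as a bare scheme is a choice made for this example, not something the release notes specify.

```python
def has_provider_credential(authorization):
    """Return True if the header carries a non-empty credential.

    `authorization` is the raw Authorization header value, or None
    when the header is absent. The key format is deliberately not
    inspected: the upstream provider remains the one validating it.
    """
    if authorization is None:
        return False
    # Split into "<scheme> <credential>"; a bare scheme yields one part.
    parts = authorization.strip().split(None, 1)
    return len(parts) == 2 and parts[1].strip() != ""


def gate_status(authorization):
    """Map the check to the HTTP status a loopback gate might return."""
    return 200 if has_provider_credential(authorization) else 401
```

Dropping the prefix check (`sk-`, `gsk_`) is what lets keys from other OpenAI-compatible providers through.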
Upgrade
lean-ctx update # recommended (auto-downloads + refreshes shell hooks)
cargo install lean-ctx # or
npm update -g lean-ctx-bin # or
brew upgrade lean-ctx
Note: After upgrading via cargo/npm/brew, run `lean-ctx setup` to refresh shell aliases. `lean-ctx update` does this automatically.
Full Changelog: v3.7.5...v3.7.5