lightonai/next-plaid v1.3.0 on GitHub

Install colgrep 1.3.0

Install prebuilt binaries via shell script

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/lightonai/next-plaid/releases/download/v1.3.0/colgrep-installer.sh | sh

Install prebuilt binaries via powershell script

powershell -ExecutionPolicy Bypass -c "irm https://github.com/lightonai/next-plaid/releases/download/v1.3.0/colgrep-installer.ps1 | iex"

Download colgrep 1.3.0

File	Platform	Checksum
colgrep-aarch64-apple-darwin.tar.xz	Apple Silicon macOS	checksum
colgrep-x86_64-apple-darwin.tar.xz	Intel macOS	checksum
colgrep-x86_64-pc-windows-msvc.zip	x64 Windows	checksum
colgrep-x86_64-unknown-linux-gnu.tar.xz	x64 Linux	checksum

What's new in 1.3.0

Retrieval overhaul (+0.13 NDCG@10 on the semble benchmark)

End-to-end rework of ColGREP's hybrid retrieval and ranking. On the 1,251-query / 63-repo semble code-search benchmark, file-level NDCG@10 climbs from 0.693 (v1.2.0 baseline) to 0.825 — a +0.132 gain with no model change. Two layers stack:

Identifier-aware BM25. The FTS5 backend switches from the trigram tokenizer to a new IdentifierAware tokenizer that splits camelCase / PascalCase / snake_case identifiers (parseRequest → parserequest parse request) at both index time and query time. Queries are now joined with FTS5 OR instead of AND-quoted whole words, and fusion switches from rank-only RRF to min-max relative-score fusion at α=0.60. BM25 recall@200 jumps from 26.5 % to 99.6 %; raw BM25 NDCG@10 from 0.367 to 0.667.
Post-fusion ranking pipeline. The top-200 fused pool is re-scored by four signals applied in order: a multiplicative penalty for test/compat/example files (language-aware suffix regex, skipped when the query itself mentions test/spec/benchmark); a file-stem boost when query identifier tokens match the file name (full match +0.40·max, prefix match +0.20·max); a definition boost when a query token matches a code unit's name (+0.25·max); and a file-coherence boost that rewards files where multiple units score highly (+0.20·max scaled by file share). Results are then collapsed by file. Every threshold is overridable via COLGREP_* env vars, and COLGREP_TRACE=1 emits per-stage JSONL traces to stderr.
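The identifier splitting and the relative-score fusion described above can be sketched roughly as follows. This is an illustrative Python approximation, not ColGREP's Rust implementation: the function names are hypothetical, the handling of snake_case "whole" tokens is a guess, and the note does not say which side α weights, so the sketch assumes it weights the semantic score.

```python
import re

# Zero-width boundaries: lower/digit -> Upper, and the end of an acronym run.
_CAMEL_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")


def identifier_tokens(identifier: str) -> list[str]:
    """Lowercased whole identifier followed by its sub-words (hypothetical helper)."""
    parts: list[str] = []
    for chunk in re.split(r"[^A-Za-z0-9]+", identifier):  # snake_case, punctuation
        parts.extend(p.lower() for p in _CAMEL_BOUNDARY.split(chunk) if p)
    # Only emit the joined form when the identifier actually split.
    return [identifier.lower(), *parts] if len(parts) > 1 else parts


def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1]; a constant list maps to 1.0 everywhere."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {k: (v - lo) / span if span > 0 else 1.0 for k, v in scores.items()}


def fuse(semantic: dict[str, float], bm25: dict[str, float],
         alpha: float = 0.60) -> dict[str, float]:
    """Relative-score fusion. ASSUMPTION: alpha weights the semantic side."""
    s, b = min_max(semantic), min_max(bm25)
    return {doc: alpha * s.get(doc, 0.0) + (1 - alpha) * b.get(doc, 0.0)
            for doc in s.keys() | b.keys()}
```

Unlike rank-only RRF, this keeps the size of the score gaps within each list, so a document that clearly leads one retriever is not flattened into "rank 1".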

Indexes built before v1.3.0 are auto-detected as using the old tokenizer and are rebuilt on the next colgrep init. The next-plaid-api default (unicode61) is unchanged.

GPU dynamic batching no longer scrambles embeddings

tokenize_documents_in_batches sorted documents by token length for the GPU bucketed path, but the original input order was discarded — callers (notably colgrep's indexer) then paired every code unit with the wrong embedding. Indexing succeeded and search ran, but results were unrelated to the query: on axios with LateOn-Code-edge, "request and response interceptors" returned helpers/speedometer.js instead of core/InterceptorManager.js. The full semble benchmark collapsed from ~0.69 to ~0.16 NDCG@10 on the first repos. Original input positions are now tracked through sort and bucket and restored before return. CPU and non-bucketed paths are unaffected.
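The shape of the fix, sorting by length while remembering each document's original slot, looks roughly like this. It is a minimal Python sketch, not the Rust code: embed_in_length_buckets and embed_batch are hypothetical names, and whitespace splitting stands in for real tokenization.

```python
from typing import Callable, Sequence


def embed_in_length_buckets(
    docs: Sequence[str],
    embed_batch: Callable[[list[str]], list],
    batch_size: int = 2,
) -> list:
    """Embed docs in length-sorted batches, returning results in input order."""
    # Sort *indices* rather than documents so the original position survives.
    order = sorted(range(len(docs)), key=lambda i: len(docs[i].split()))
    out: list = [None] * len(docs)
    for start in range(0, len(order), batch_size):
        idxs = order[start:start + batch_size]
        vectors = embed_batch([docs[i] for i in idxs])
        # Write each vector back to the slot of the document it came from.
        for i, vec in zip(idxs, vectors):
            out[i] = vec
    return out
```

Dropping the write-back step and returning vectors in batch order reproduces the bug: every embedding is valid, but it is attached to the wrong document.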

Per-model index scoping

Indexes are now keyed by (project, model) instead of project alone. Switching from a 128-dim model (lightonai/LateOn) to a 48-dim one (lightonai/LateOn-Code-edge) previously panicked inside ndarray because the existing index's shape was incompatible. With this change, each model gets its own directory, colgrep set-model no longer wipes the existing index, and switching models back reuses the prior index with no re-indexing. colgrep clear is now scoped to the active model (--all still wipes everything); colgrep status surfaces the model each index was built with. Legacy project.json files without a model field are treated as orphaned and cleanable with colgrep clear --all.
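As an illustration of (project, model) keying, the sketch below derives one directory per pair. The layout, hashing scheme, and index_dir name are assumptions made for the example; the actual on-disk layout is not described in these notes.

```python
import hashlib
from pathlib import Path


def index_dir(data_root: Path, project: Path, model: str) -> Path:
    """Hypothetical layout: <data_root>/<project hash>/<sanitized model id>."""
    project_key = hashlib.sha256(str(project).encode("utf-8")).hexdigest()[:16]
    model_key = model.replace("/", "--")  # keep the model id filesystem-safe
    return data_root / project_key / model_key
```

Because the model is part of the key, a 128-dim index and a 48-dim index for the same project never share files, and switching back to a model finds its earlier directory intact.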

Faster auto-indexing on the hot path (10×–1000×)

Two helpers in the try_index path were materializing every row of the METADATA table — including the full code column — just to inspect distinct file paths or find orphan IDs. A new filtering::get_distinct_strings(index_path, column) pushes the work into SQL with a validated identifier and a single SELECT DISTINCT, and reconcile_document_counts now uses where_condition to fetch orphan IDs directly. On a 95 k-unit index, cleanup_orphaned_entries (which runs on every search) drops from 342 ms to 32 ms, and reconcile_document_counts from 330 ms to 0.6 ms — eliminating the minute-long stalls users reported after the desync warning.
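The idea of pushing the distinct-value scan into SQL can be sketched with Python's sqlite3 module. The function below is a stand-in for illustration only: it takes an open connection rather than an index path, and the validation regex and the file column name are assumptions.

```python
import re
import sqlite3

# Column names cannot be bound as SQL parameters, so validate before interpolating.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def distinct_strings(conn: sqlite3.Connection, column: str) -> list[str]:
    """Return distinct non-NULL values of one METADATA column."""
    if not _IDENTIFIER.fullmatch(column):
        raise ValueError(f"invalid column name: {column!r}")
    query = f'SELECT DISTINCT "{column}" FROM METADATA WHERE "{column}" IS NOT NULL'
    # Only the requested column crosses the SQL boundary; large columns
    # such as the code text are never materialized.
    return [row[0] for row in conn.execute(query)]
```

A usage example under the same assumptions: distinct_strings(conn, "file") returns each indexed path once, regardless of how many code units the file contains.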

CSS and QML language support

.css files are now parsed by tree-sitter-css: each top-level rule set, @media, @keyframes, and @supports block becomes one Class-kind unit named by its selector list (e.g. .btn-primary:hover, :root, @media (max-width: 768px), @keyframes spin), and single-line at-rules (@import, @charset, @namespace) become Constant-kind units. .qml files are parsed by tree-sitter-qmljs and index object definitions, inline components, properties, signals, and handler bindings — including Quickshell-style grouped bindings and nested handlers. File-level comments and gaps are still swept up by fill_raw_code_gaps for full file coverage.

ONNX Runtime dylib validation

colgrep walks a candidate list (managed cache → conda → venv → Homebrew → system) for the ONNX Runtime dylib. The only prior admission check was path.exists(), so on Apple Silicon with Homebrew installed the walk could pick a libonnxruntime.dylib whose transitive dependencies were broken, and indexer panics during "📂 Building index…" surfaced as a bare exit 101 with no backtrace. Candidates are now validated before adoption, each rejected candidate is warned about exactly once per session, and indexer panics now propagate with their backtrace instead of being swallowed.
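These notes do not say how candidates are validated. One plausible approach, sketched here purely as an assumption, is to attempt a real load of each library, which makes the dynamic linker resolve its transitive dependencies. The first_loadable helper and the warn-once set are hypothetical.

```python
import ctypes
from pathlib import Path
from typing import Callable, Iterable, Optional

_warned: set[str] = set()  # paths already reported during this process


def first_loadable(
    candidates: Iterable[Path],
    load: Callable[[str], object] = ctypes.CDLL,
    warn: Callable[[str], None] = print,
) -> Optional[Path]:
    """Return the first candidate that exists AND loads, else None."""
    for path in candidates:
        if not path.exists():
            continue
        try:
            # Loading fails with OSError if a dependency cannot be resolved.
            load(str(path))
        except OSError as err:
            if str(path) not in _warned:
                _warned.add(str(path))
                warn(f"skipping {path}: {err}")
            continue
        return path
    return None
```

An existence check alone would have accepted the broken candidate here; the load attempt lets the walk continue to the next entry in the list.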

Auto runtime-aware defaults resolved after ORT init

In auto mode, parallel and batch-size defaults were being concretized from Config::get_*() before ONNX Runtime / cuDNN had initialized — so on CUDA builds, is_cudnn_available() was still false and runs got frozen to CPU defaults (parallel=16, batch-size=1) even when the GPU was about to come online. The configured-override and effective-value concepts are now separated: configured_parallel_sessions() / configured_batch_size() keep None until IndexBuilder resolves defaults post-init, and legacy malformed 0 values are normalized through the same path.
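The split between configured and effective values can be illustrated with a small sketch. The CPU defaults are the ones quoted above; the GPU defaults, the Configured type, and the function names are hypothetical placeholders chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

CPU_DEFAULTS = (16, 1)   # (parallel, batch_size) as quoted in the note
GPU_DEFAULTS = (1, 64)   # HYPOTHETICAL values, for illustration only


@dataclass
class Configured:
    """User overrides; None means 'auto', to be decided after runtime init."""
    parallel: Optional[int] = None
    batch_size: Optional[int] = None


def normalize(value: Optional[int]) -> Optional[int]:
    """Treat legacy 0 (or negative) values as 'not configured'."""
    return value if value is not None and value > 0 else None


def resolve(cfg: Configured, gpu_available: bool) -> tuple[int, int]:
    """Call only AFTER the runtime has initialized and GPU status is known."""
    default_parallel, default_batch = GPU_DEFAULTS if gpu_available else CPU_DEFAULTS
    parallel = normalize(cfg.parallel)
    batch = normalize(cfg.batch_size)
    return (
        parallel if parallel is not None else default_parallel,
        batch if batch is not None else default_batch,
    )
```

Keeping None in the configuration until resolve runs is what prevents an early, CPU-flavoured guess from being frozen into the run.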

Codex sandbox fix

Codex's default workspace-write sandbox blocks writes outside the active project, so colgrep's per-project indexes under $XDG_DATA_HOME/colgrep/indices were silently failing — falling back to a degraded shell-grep mode. colgrep --install-codex now registers the data dir in ~/.codex/config.toml under [sandbox_workspace_write].writable_roots using toml_edit (preserving user comments and ordering), is idempotent on re-run, and falls back to a copy-pasteable snippet when the user's config is unparseable. --uninstall-codex strips the entry, drops the table when empty, and deletes the file if colgrep was its sole occupant.

Hermes installer

New --install-hermes / --uninstall-hermes flags, alongside the existing Claude / OpenCode / Codex / Droid integrations. Install writes a marked ColGREP section into ~/.hermes/AGENTS.md; uninstall removes only that section.
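A marked-section install/uninstall of this kind can be sketched as below. The marker strings and function names are invented for the example and are not the ones colgrep writes.

```python
# HYPOTHETICAL markers; the real section delimiters may differ.
BEGIN = "<!-- colgrep:begin -->"
END = "<!-- colgrep:end -->"


def uninstall(text: str) -> str:
    """Remove only the marked section, leaving the user's content untouched."""
    start = text.find(BEGIN)
    end = text.find(END)
    if start == -1 or end == -1 or end < start:
        return text  # nothing installed, or markers damaged: leave as-is
    rest = text[end + len(END):]
    return text[:start] + rest.removeprefix("\n")


def install(text: str, body: str) -> str:
    """Append (or refresh) the marked section; safe to run repeatedly."""
    base = uninstall(text)
    sep = "" if base == "" or base.endswith("\n") else "\n"
    return f"{base}{sep}{BEGIN}\n{body.strip()}\n{END}\n"
```

Running install twice yields the same file as running it once, and uninstall restores the file the user had before.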

Rerank input hardening

Non-finite similarity scores now sort behind finite results consistently in next-plaid, and rerank handlers share a single embedding-decoding path that rejects malformed or non-finite inputs with explicit errors instead of propagating NaN through the pipeline.
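Both behaviours, ordering non-finite scores last and rejecting bad embeddings up front, are easy to show in a few lines. This is a Python sketch with hypothetical function names, not the next-plaid code.

```python
import math
from typing import Sequence


def rank(scored: Sequence[tuple[str, float]]) -> list[tuple[str, float]]:
    """Sort by descending score, with NaN/inf results consistently at the end."""
    def key(item: tuple[str, float]) -> tuple[bool, float]:
        score = item[1]
        finite = math.isfinite(score)
        # False sorts before True, so finite scores come first; non-finite
        # entries share one key and keep their input order (stable sort).
        return (not finite, -score if finite else 0.0)
    return sorted(scored, key=key)


def validate_embedding(values: Sequence[float], dim: int) -> list[float]:
    """Reject malformed or non-finite embeddings with an explicit error."""
    if len(values) != dim:
        raise ValueError(f"expected {dim} values, got {len(values)}")
    if not all(math.isfinite(v) for v in values):
        raise ValueError("embedding contains NaN or infinite values")
    return list(values)
```

Sorting on raw floats is unreliable once NaN appears, because every comparison against NaN is false; the explicit key makes the order deterministic.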

License

All crate and Python package metadata switched from MIT to Apache-2.0.

Contributors

Thanks to Raphael Bozydar (@RBozydar) for QML language support and the auto runtime-aware default fix; Jeff Coggshall (@voxmenthe) for pushing the auto-index hot-path filters into SQL with reproducible benchmarks; Anton Chen (@antonchen) for hardening non-finite score ordering and rerank embedding validation; and 0xArchiviste (@0xArchiviste) for the Hermes installer integration.