Introduces a modular search pipeline with hybrid (neural + keyword) retrieval, query interpretation/expansion, recency bias, and optional LLM reranking/completions, backed by a new Qdrant BM25 sparse index. Also refactors entity models to mark embeddable fields and unify system metadata for better chunking and indexing.
New Features
New pipeline (SearchServiceV2) with pluggable operations: query interpretation, query expansion, embeddings (dense + sparse), vector search, recency bias, LLM reranking, and completion generation.
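The pluggable-operation idea can be sketched as a list of functions that each transform a shared context. This is a minimal illustration, not the actual SearchServiceV2 implementation; the `SearchContext` fields and operation names are assumptions for the example.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class SearchContext:
    """Shared state threaded through every pipeline operation (illustrative)."""
    query: str
    expanded_queries: list = field(default_factory=list)
    results: list = field(default_factory=list)

# Each operation takes and returns the context, so steps such as query
# expansion, reranking, or completion generation can be enabled, disabled,
# or reordered independently.
Operation = Callable[[SearchContext], SearchContext]

def expand_query(ctx: SearchContext) -> SearchContext:
    # Toy expansion; the real pipeline would call an LLM or synonym source.
    ctx.expanded_queries = [ctx.query, ctx.query + " tutorial"]
    return ctx

def vector_search(ctx: SearchContext) -> SearchContext:
    # Toy retrieval; the real pipeline queries Qdrant here.
    ctx.results = [f"hit for {q!r}" for q in ctx.expanded_queries or [ctx.query]]
    return ctx

def run_pipeline(ctx: SearchContext, ops: list[Operation]) -> SearchContext:
    for op in ops:
        ctx = op(ctx)
    return ctx

ctx = run_pipeline(SearchContext(query="qdrant hybrid search"),
                   [expand_query, vector_search])
```

Because every step shares one signature, opt-in knobs on the API side can map directly to which operations get appended to the list.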
Hybrid search in Qdrant: adds a BM25 sparse vector index alongside neural vectors, with automatic fallback to neural-only when keyword vectors are missing.
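One common way to merge a dense ranking with a BM25 ranking is reciprocal rank fusion (RRF), which Qdrant's Query API supports server-side. The sketch below shows the fusion math and the fallback behavior described above; it is a standalone illustration, not the service's code, and `k=60` is just the conventional RRF constant.

```python
def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of doc ids via reciprocal rank fusion."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # Earlier ranks contribute more; k damps the head of each list.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

dense_ranking = ["d3", "d1", "d2"]
bm25_ranking = ["d3", "d1", "d4"]  # empty when keyword vectors are missing

# Fallback: if no keyword vectors were indexed, search is neural-only.
rankings = [dense_ranking] + ([bm25_ranking] if bm25_ranking else [])
fused = rrf_fuse(rankings)
```

In Qdrant itself this corresponds to prefetching both the dense and the sparse (BM25) query and combining them with an RRF fusion query, so the fallback simply drops the sparse prefetch.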
Time decay/recency bias: dynamic decay config computed per query and applied in Qdrant formula scoring.
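The effect of the decay can be shown with a simple exponential half-life applied to similarity scores. This is a sketch of the scoring idea only; the half-life parameter and function name are assumptions, and the real values come from the per-query decay config applied in Qdrant's formula scoring.

```python
import time

def decayed_score(similarity: float, doc_ts: float, now: float,
                  half_life_days: float = 30.0) -> float:
    """Bias scores toward recent documents via exponential decay (illustrative)."""
    age_days = (now - doc_ts) / 86400.0
    # After one half-life, a document's score is halved.
    return similarity * 0.5 ** (age_days / half_life_days)

now = time.time()
fresh = decayed_score(0.9, now, now)               # brand-new doc: no decay
month_old = decayed_score(0.9, now - 30 * 86400, now)  # one half-life old
```

Computing the decay parameters per query lets a query like "latest release" use a short half-life while evergreen queries keep a long one.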
API: the collections search endpoint now uses the new service and enhanced request/response schemas; basic requests continue to work unchanged, and the advanced knobs are opt-in.
Added BM25Text2Vec via fastembed to generate sparse embeddings; Qdrant destination updated for multi-vector support and keyword index checks.
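A BM25 sparse embedding is not a fixed-length vector but a mapping of term ids to weights. The toy function below mimics that shape with standard BM25 term saturation (usual defaults k1=1.2, b=0.75); it is illustrative only, whereas the codebase produces these vectors with fastembed (its `SparseTextEmbedding` with the `Qdrant/bm25` model yields the same indices/values structure).

```python
from collections import Counter

def bm25_sparse(text: str, avg_len: float = 10.0,
                k1: float = 1.2, b: float = 0.75) -> dict[int, float]:
    """Toy BM25-style sparse vector: {term_id: weight} (illustrative)."""
    tokens = text.lower().split()
    tf = Counter(tokens)
    # Length normalization: longer-than-average docs get damped weights.
    norm = k1 * (1 - b + b * len(tokens) / avg_len)
    # Term id via hashing; real implementations use a stable token hash.
    return {hash(t) & 0xFFFFFFFF: (f * (k1 + 1)) / (f + norm)
            for t, f in tf.items()}

vec = bm25_sparse("hybrid search with sparse and dense vectors")
```

Because only the terms present in the text get entries, these vectors are cheap to store next to the dense ones, which is what makes the multi-vector Qdrant points practical.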
Migration
No breaking changes for existing search calls; defaults preserve current behavior.
To benefit from hybrid search on existing data, run a re-sync to populate the keyword (sparse) vectors; newly synced data is indexed with both vector types automatically.
Install new dependency: fastembed.