⭐️ Highlights
🔍 Combine Retrievers with MultiRetriever and TextEmbeddingRetriever
Two new retriever components make it easier to build hybrid search pipelines. MultiRetriever runs multiple text retrievers in parallel and merges their results into a single deduplicated list, ranked by reciprocal rank fusion by default. You can selectively enable or disable individual retrievers at runtime using the active_retrievers parameter. This is useful when you want to skip the embedding retriever for short or keyword-only queries, for example.
TextEmbeddingRetriever wraps an embedding-based retriever together with a text embedder into a single component, making it compatible with MultiRetriever by implementing the TextRetriever protocol. Here's how to combine BM25 and embedding retrieval in a single component:
```python
from haystack.components.retrievers import MultiRetriever, TextEmbeddingRetriever
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever, InMemoryEmbeddingRetriever
from haystack.components.embedders import SentenceTransformersTextEmbedder

retriever = MultiRetriever(
    retrievers={
        "bm25": InMemoryBM25Retriever(document_store=doc_store),
        "embedding": TextEmbeddingRetriever(
            retriever=InMemoryEmbeddingRetriever(document_store=doc_store),
            text_embedder=SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2"),
        ),
    },
    top_k=3,
)

# Run all retrievers
result = retriever.run(query="green energy sources")

# Run only the BM25 retriever
result = retriever.run(query="green energy sources", active_retrievers=["bm25"])
```

⬆️ Upgrade Notes
- `LLM.run` and `LLM.run_async` no longer accept `messages` and `streaming_callback` as positional arguments; they must now be passed as keyword arguments. Update any direct calls accordingly:

  ```python
  # Before
  llm.run([message], my_callback)

  # After
  llm.run(messages=[message], streaming_callback=my_callback)
  ```
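If you are unsure whether your code still passes these arguments positionally, the keyword-only pattern is easy to reproduce and test in isolation. The sketch below is illustrative only (not `LLM`'s actual signature): a bare `*` in a Python parameter list is what makes arguments keyword-only, so positional calls fail fast with a `TypeError`.

```python
# Illustrative sketch of a keyword-only signature; names are hypothetical,
# not Haystack's actual LLM.run implementation.
def run(*, messages, streaming_callback=None):
    # The bare `*` above forces every argument to be passed by keyword.
    return {"replies": [f"echo: {m}" for m in messages]}

run(messages=["hi"])  # OK

try:
    run(["hi"])  # positional call is rejected
except TypeError:
    print("positional arguments are no longer accepted")
```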
🚀 New Features
- Add `run_async` to `CacheChecker`, enabling it to be used in `AsyncPipeline` without blocking the event loop.
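A common way to give a synchronous component a non-blocking `run_async` is to offload its blocking work to a worker thread. The sketch below shows that general pattern only; the function names and return shape are hypothetical, not `CacheChecker`'s actual implementation.

```python
import asyncio

# Hypothetical stand-in for a blocking component method, e.g. one that
# queries a document store synchronously.
def run(url: str) -> dict:
    return {"hits": [], "misses": [url]}

# Generic sketch of the run_async pattern: delegate the blocking call to a
# thread via asyncio.to_thread so the event loop stays responsive.
async def run_async(url: str) -> dict:
    return await asyncio.to_thread(run, url)

result = asyncio.run(run_async("https://example.com"))
print(result)  # → {'hits': [], 'misses': ['https://example.com']}
```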
⚡️ Enhancement Notes
- Document the input ordering behavior of auto-promoted lazy variadic sockets in `Pipeline.connect()`. When multiple senders are connected to the same list-typed receiver socket, ordering depends on the pipeline class. With `Pipeline`, items are ordered alphabetically by sender component name (because `Pipeline.run()` schedules components in alphabetical order for deterministic execution), not by the order of `connect()` calls. With `AsyncPipeline`, no ordering is guaranteed, since components in different branches may run in parallel. The docstrings now point users to a dedicated joiner component when they need explicit ordering.
- Add `join_mode` parameter to the experimental `MultiRetriever` component, supporting `"reciprocal_rank_fusion"` (default) and `"concatenate"`. Reciprocal Rank Fusion merges the ranked result lists from all retrievers into a single deduplicated list ordered by RRF score. The underlying RRF logic is extracted into a shared utility `_reciprocal_rank_fusion` in `haystack.utils.misc`, which is now also used by `DocumentJoiner`.
- `LLM` now supports two usage modes:
  - Template-variable mode: provide a `user_prompt` with Jinja2 variables (e.g. `{{ query }}`). Those variables become pipeline inputs and `messages` is optional. The rendered `user_prompt` is always appended after any `messages` provided at runtime.
  - Pass-through mode: omit `user_prompt` or provide one with no template variables. `messages` becomes a required input, allowing a fully-constructed list of `ChatMessage`s to be passed from upstream.
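To make the default `join_mode` concrete, here is a standalone sketch of Reciprocal Rank Fusion. It is not Haystack's internal `_reciprocal_rank_fusion`, just the standard algorithm: each document scores the sum of `1 / (k + rank)` across all result lists, with `k = 60` as in the original RRF formulation.

```python
# Standalone sketch of Reciprocal Rank Fusion (not Haystack's internal utility).
# k=60 is the constant proposed in the original RRF paper.
def reciprocal_rank_fusion(ranked_lists, k=60):
    scores = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first; duplicates are merged by doc_id.
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["d1", "d2", "d3"]
embedding = ["d3", "d1", "d4"]
print(reciprocal_rank_fusion([bm25, embedding]))  # → ['d1', 'd3', 'd2', 'd4']
```

Note how `d1` (ranked 1st and 2nd) edges out `d3` (ranked 3rd and 1st): appearing near the top of several lists beats a single top rank, which is why RRF works well for hybrid BM25-plus-embedding retrieval.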
🐛 Bug Fixes
- Fixed a bug in `NamedEntityExtractor` where the spaCy/Thinc device state was not correctly restored after execution, potentially affecting the device configuration of other spaCy components in the same process.
- Preserve resumable snapshots when some inputs or outputs are non-serializable. Haystack now omits only the failing top-level fields (for example non-serializable callbacks or runtime objects) instead of replacing the whole payload with an empty dictionary. This applies both to agent sub-component inputs (`chat_generator` and `tool_invoker`) and to pipeline-level `inputs`, `original_input_data`, and `pipeline_outputs` captured by `_create_pipeline_snapshot`. When every field fails to serialize, the snapshot still stores a structurally valid empty payload (`{"serialization_schema": {"type": "object", "properties": {}}, "serialized_data": {}}`) so that resuming the snapshot does not raise `DeserializationError`, for example when resuming from a `ToolBreakpoint` where the sub-component's inputs are not strictly required.
- Fixed `tools_strict=True` in `OpenAIChatGenerator` to recursively apply `additionalProperties: false` and `required` to all nested objects in tool parameter schemas. Previously only the top-level object was transformed, causing OpenAI's strict mode to reject tools with nested parameters.
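The `tools_strict` fix comes down to walking the whole JSON schema rather than stopping at the root. The following is a hypothetical helper, not Haystack's actual code, sketching the recursive transformation that OpenAI's strict mode requires: every nested `object` gets `additionalProperties: false` and a `required` list covering all of its properties.

```python
# Hypothetical sketch of a strict-mode schema transformation (not Haystack's
# internal implementation): recurse into nested objects and array items
# instead of transforming only the top-level object.
def make_strict(schema: dict) -> dict:
    if schema.get("type") == "object":
        props = schema.get("properties", {})
        schema["additionalProperties"] = False
        schema["required"] = list(props)
        for sub in props.values():
            make_strict(sub)
    elif schema.get("type") == "array" and "items" in schema:
        make_strict(schema["items"])
    return schema

# A tool schema with a nested object, the case the old code mishandled.
schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "window": {
            "type": "object",
            "properties": {"start": {"type": "string"}, "end": {"type": "string"}},
        },
    },
}
make_strict(schema)
print(schema["properties"]["window"]["additionalProperties"])  # → False
```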
💙 Big thank you to everyone who contributed to this release!
@Aftabbs, @albertodiazdurana, @anakin87, @ArkaD171717, @bilgeyucel, @bogdankostic, @davidsbatista, @FuturMix, @julian-risch, @kacperlukawski, @ritikraj2425, @saivedant169, @shaun0927, @sjrl, @SyedShahmeerAli12