⭐️ Highlights
🔍 Combine Retrievers with MultiRetriever and TextEmbeddingRetriever
Two new retriever components make it easier to build hybrid search pipelines. MultiRetriever runs multiple text retrievers in parallel and merges their results into a single deduplicated list, ranked by reciprocal rank fusion by default. You can selectively enable or disable individual retrievers at runtime using the active_retrievers parameter. This is useful when you want to skip the embedding retriever for short or keyword-only queries, for example.
TextEmbeddingRetriever wraps an embedding-based retriever together with a text embedder into a single component, making it compatible with MultiRetriever by implementing the TextRetriever protocol. Here's how to combine BM25 and embedding retrieval in a single component:
```python
from haystack.components.retrievers import MultiRetriever, TextEmbeddingRetriever
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever, InMemoryEmbeddingRetriever
from haystack.components.embedders import SentenceTransformersTextEmbedder

retriever = MultiRetriever(
    retrievers={
        "bm25": InMemoryBM25Retriever(document_store=doc_store),
        "embedding": TextEmbeddingRetriever(
            retriever=InMemoryEmbeddingRetriever(document_store=doc_store),
            text_embedder=SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2"),
        ),
    },
    top_k=3,
)

# Run all retrievers
result = retriever.run(query="green energy sources")

# Run only the BM25 retriever
result = retriever.run(query="green energy sources", active_retrievers=["bm25"])
```

⬆️ Upgrade Notes
- `LLM.run` and `LLM.run_async` no longer accept `messages` and `streaming_callback` as positional arguments; they must now be passed as keyword arguments. Update any direct calls accordingly:

  ```python
  # Before
  llm.run([message], my_callback)

  # After
  llm.run(messages=[message], streaming_callback=my_callback)
  ```
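If you are unsure whether your code still passes these arguments positionally, the keyword-only pattern is easy to reproduce and test in isolation. The sketch below is illustrative only (not `LLM`'s actual signature): a bare `*` in a Python parameter list is what makes arguments keyword-only, so positional calls fail fast with a `TypeError`.

```python
# Illustrative sketch of a keyword-only signature; names are hypothetical,
# not Haystack's actual LLM.run implementation.
def run(*, messages, streaming_callback=None):
    # The bare `*` above forces every argument to be passed by keyword.
    return {"replies": [f"echo: {m}" for m in messages]}

run(messages=["hi"])  # OK

try:
    run(["hi"])  # positional call is rejected
except TypeError:
    print("positional arguments are no longer accepted")
```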
🚀 New Features
- Add `run_async` to `CacheChecker`, enabling it to be used in `AsyncPipeline` without blocking the event loop.
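A common way to give a synchronous component a non-blocking `run_async` is to offload its blocking work to a worker thread. The sketch below shows that general pattern only; the function names and return shape are hypothetical, not `CacheChecker`'s actual implementation.

```python
import asyncio

# Hypothetical stand-in for a blocking component method, e.g. one that
# queries a document store synchronously.
def run(url: str) -> dict:
    return {"hits": [], "misses": [url]}

# Generic sketch of the run_async pattern: delegate the blocking call to a
# thread via asyncio.to_thread so the event loop stays responsive.
async def run_async(url: str) -> dict:
    return await asyncio.to_thread(run, url)

result = asyncio.run(run_async("https://example.com"))
print(result)  # → {'hits': [], 'misses': ['https://example.com']}
```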
⚡️ Enhancement Notes
- Document the input ordering behavior of auto-promoted lazy variadic sockets in `Pipeline.connect()`. When multiple senders are connected to the same list-typed receiver socket, ordering depends on the pipeline class. With `Pipeline`, items are ordered alphabetically by sender component name (because `Pipeline.run()` schedules components in alphabetical order for deterministic execution), not by the order of `connect()` calls. With `AsyncPipeline`, no ordering is guaranteed, since components in different branches may run in parallel. The docstrings now point users to a dedicated joiner component when they need explicit ordering.
- Add `join_mode` parameter to the experimental `MultiRetriever` component, supporting `"reciprocal_rank_fusion"` (default) and `"concatenate"`. Reciprocal Rank Fusion merges the ranked result lists from all retrievers into a single deduplicated list ordered by RRF score. The underlying RRF logic is extracted into a shared utility `_reciprocal_rank_fusion` in `haystack.utils.misc`, which is now also used by `DocumentJoiner`.
- `LLM` now supports two usage modes:
  - Template-variable mode: provide a `user_prompt` with Jinja2 variables (e.g. `{{ query }}`). Those variables become pipeline inputs and `messages` is optional. The rendered `user_prompt` is always appended after any `messages` provided at runtime.
  - Pass-through mode: omit `user_prompt` or provide one with no template variables. `messages` becomes a required input, allowing a fully-constructed list of `ChatMessage`s to be passed from upstream.
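To make the default `join_mode` concrete, here is a standalone sketch of Reciprocal Rank Fusion. It is not Haystack's internal `_reciprocal_rank_fusion`, just the standard algorithm: each document scores the sum of `1 / (k + rank)` across all result lists, with `k = 60` as in the original RRF formulation.

```python
# Standalone sketch of Reciprocal Rank Fusion (not Haystack's internal utility).
# k=60 is the constant proposed in the original RRF paper.
def reciprocal_rank_fusion(ranked_lists, k=60):
    scores = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first; duplicates are merged by doc_id.
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["d1", "d2", "d3"]
embedding = ["d3", "d1", "d4"]
print(reciprocal_rank_fusion([bm25, embedding]))  # → ['d1', 'd3', 'd2', 'd4']
```

Note how `d1` (ranked 1st and 2nd) edges out `d3` (ranked 3rd and 1st): appearing near the top of several lists beats a single top rank, which is why RRF works well for hybrid BM25-plus-embedding retrieval.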
🐛 Bug Fixes
- Fixed a bug in `NamedEntityExtractor` where the spaCy/Thinc device state was not correctly restored after execution, potentially affecting the device configuration of other spaCy components in the same process.
- Preserve resumable snapshots when some inputs or outputs are non-serializable. Haystack now omits only the failing top-level fields (for example non-serializable callbacks or runtime objects) instead of replacing the whole payload with an empty dictionary. This applies both to agent sub-component inputs (`chat_generator` and `tool_invoker`) and to pipeline-level `inputs`, `original_input_data`, and `pipeline_outputs` captured by `_create_pipeline_snapshot`. When every field fails to serialize, the snapshot still stores a structurally valid empty payload (`{"serialization_schema": {"type": "object", "properties": {}}, "serialized_data": {}}`) so that resuming the snapshot does not raise `DeserializationError`, for example when resuming from a `ToolBreakpoint` where the sub-component's inputs are not strictly required.
- Fixed `tools_strict=True` in `OpenAIChatGenerator` to recursively apply `additionalProperties: false` and `required` to all nested objects in tool parameter schemas. Previously only the top-level object was transformed, causing OpenAI's strict mode to reject tools with nested parameters.
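The `tools_strict` fix comes down to walking the whole JSON schema rather than stopping at the root. The following is a hypothetical helper, not Haystack's actual code, sketching the recursive transformation that OpenAI's strict mode requires: every nested `object` gets `additionalProperties: false` and a `required` list covering all of its properties.

```python
# Hypothetical sketch of a strict-mode schema transformation (not Haystack's
# internal implementation): recurse into nested objects and array items
# instead of transforming only the top-level object.
def make_strict(schema: dict) -> dict:
    if schema.get("type") == "object":
        props = schema.get("properties", {})
        schema["additionalProperties"] = False
        schema["required"] = list(props)
        for sub in props.values():
            make_strict(sub)
    elif schema.get("type") == "array" and "items" in schema:
        make_strict(schema["items"])
    return schema

# A tool schema with a nested object, the case the old code mishandled.
schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "window": {
            "type": "object",
            "properties": {"start": {"type": "string"}, "end": {"type": "string"}},
        },
    },
}
make_strict(schema)
print(schema["properties"]["window"]["additionalProperties"])  # → False
```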
💙 Big thank you to everyone who contributed to this release!
@Aftabbs, @albertodiazdurana, @anakin87, @ArkaD171717, @bilgeyucel, @bogdankostic, @davidsbatista, @FuturMix, @julian-risch, @kacperlukawski, @ritikraj2425, @saivedant169, @shaun0927, @sjrl, @SyedShahmeerAli12