github deepset-ai/haystack v2.20.0

⭐️ Highlights

Support for OpenAI's Responses API

Haystack now integrates OpenAI's Responses API through the new OpenAIResponsesChatGenerator and AzureOpenAIResponsesChatGenerator components.

This unlocks several advanced capabilities like:

  • Retrieving concise summaries of the model’s reasoning process.
  • Using native OpenAI or MCP tool formats alongside Haystack Tool objects and Toolset instances.

Example with reasoning and a web search tool:

from haystack.components.generators.chat import AzureOpenAIResponsesChatGenerator, OpenAIResponsesChatGenerator
from haystack.dataclasses import ChatMessage

# with `OpenAIResponsesChatGenerator`
chat_generator = OpenAIResponsesChatGenerator(
    model="o3-mini",
    generation_kwargs={"reasoning": {"summary": "auto", "effort": "low"}},
    tools=[{"type": "web_search"}],
)
response = chat_generator.run(messages=[ChatMessage.from_user("What's a positive news story from today?")])

# with `AzureOpenAIResponsesChatGenerator`
chat_generator = AzureOpenAIResponsesChatGenerator(
    azure_endpoint="https://example-resource.azure.openai.com/",
    azure_deployment="gpt-5-mini",
    generation_kwargs={"reasoning": {"effort": "low", "summary": "auto"}},
)
response = chat_generator.run(messages=[ChatMessage.from_user("What's Natural Language Processing?")])

print(response["replies"][0].text)

🚀 New Features

  • Added the AzureOpenAIResponsesChatGenerator, a new component that integrates Azure OpenAI's Responses API into Haystack.
  • Added the OpenAIResponsesChatGenerator, a new component that integrates OpenAI's Responses API into Haystack.
  • If logprobs are enabled in the generation kwargs, OpenAIChatGenerator and OpenAIResponsesChatGenerator now return them in ChatMessage.meta (see the first sketch after this list).
  • Added an extra field to ToolCall and ToolCallDelta to store provider-specific information.
  • Updated serialization and deserialization of PipelineSnapshots to work with pydantic BaseModels.
  • Added async support to SentenceWindowRetriever with a new run_async() method, allowing the retriever to be used in async pipelines and workflows.
  • Added a warm_up() method to all ChatGenerator components (OpenAIChatGenerator, AzureOpenAIChatGenerator, HuggingFaceAPIChatGenerator, HuggingFaceLocalChatGenerator, and FallbackChatGenerator) to properly initialize tools that require warm-up before pipeline execution. The warm_up() method is idempotent and follows the same pattern used in the Agent and ToolInvoker components. This enables proper tool initialization in pipelines that use ChatGenerators with tools but without an Agent component (see the second sketch after this list).
  • The AnswerBuilder component now exposes a new parameter return_only_referenced_documents (default: True) that controls whether only documents referenced in the replies are returned. Returned documents include two new fields in the meta dictionary:
    • source_index: the 1-based index of the document in the input list
    • referenced: a boolean indicating whether the document was referenced in the replies (only present if the reference_pattern parameter is provided).
    These additions make it easier to display references and other sources within a RAG pipeline (see the third sketch after this list).
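
A minimal sketch of the logprobs feature, assuming an OPENAI_API_KEY in the environment; the model name and the exact meta key ("logprobs") are assumptions for illustration:

from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage

# Enable logprobs via generation kwargs; they are then surfaced in ChatMessage.meta.
chat_generator = OpenAIChatGenerator(
    model="gpt-4o-mini",
    generation_kwargs={"logprobs": True, "top_logprobs": 2},
)
result = chat_generator.run(messages=[ChatMessage.from_user("Say hello in one word.")])
reply = result["replies"][0]
print(reply.text)
print(reply.meta.get("logprobs"))  # token-level log probabilities, if the provider returned them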
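
A sketch of the new warm_up() call on ChatGenerators, assuming OPENAI_API_KEY and SERPERDEV_API_KEY are set; the ComponentTool-wrapped SerperDevWebSearch is only an example tool and may not itself require warm-up:

from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.websearch import SerperDevWebSearch
from haystack.tools import ComponentTool

# Wrap a component as a tool; some tools need initialization before first use.
search_tool = ComponentTool(component=SerperDevWebSearch(), name="web_search")

chat_generator = OpenAIChatGenerator(model="gpt-4o-mini", tools=[search_tool])
chat_generator.warm_up()  # idempotent; initializes tools that require warm-up
# ...then call run() or use the generator in a pipeline as usual.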
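
A sketch of the new AnswerBuilder behavior; the documents, reply text, and bracket-citation reference_pattern are made up for illustration:

from haystack.components.builders import AnswerBuilder
from haystack.dataclasses import Document

docs = [
    Document(content="Paris is the capital of France."),
    Document(content="Berlin is the capital of Germany."),
]

# With return_only_referenced_documents=True (the default) and a reference_pattern,
# only documents cited in the reply (here "[1]") are attached to the answer.
builder = AnswerBuilder(reference_pattern=r"\[(\d+)\]")
result = builder.run(
    query="What is the capital of France?",
    replies=["The capital of France is Paris [1]."],
    documents=docs,
)
for doc in result["answers"][0].documents:
    print(doc.meta["source_index"], doc.meta["referenced"], doc.content)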

⚡️ Enhancement Notes

  • Added generation_kwargs to the Agent component, allowing finer-grained control over chat generation at run time.
  • Added a revision parameter to all Sentence Transformers embedder components (SentenceTransformersDocumentEmbedder, SentenceTransformersTextEmbedder, SentenceTransformersSparseDocumentEmbedder, and SentenceTransformersSparseTextEmbedder) to allow users to specify a model revision/version from the Hugging Face Hub, enabling pinning to a particular model version for reproducibility and stability.
  • Updated the Agent, LLMMetadataExtractor, LLMMessagesRouter, and LLMDocumentContentExtractor components to automatically call self.warm_up() at runtime if they have not been warmed up yet, so they are ready for use without an explicit warm-up call. Previously, warm-up had to be invoked manually before use; otherwise, a RuntimeError was raised.
  • Improved log-trace correlation for DatadogTracer by using the official ddtrace.tracer.get_log_correlation_context() method.
  • Improved the Toolset warm-up architecture for better encapsulation. The base Toolset.warm_up() method now warms up all tools by default, while subclasses can override it to customize initialization (e.g., setting up shared resources instead of warming individual tools). The warm_up_tools() utility function has been simplified to delegate to Toolset.warm_up() (see the sketch after this list).
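
A hypothetical sketch of the customization point described above: a Toolset subclass that overrides warm_up() to set up a shared resource instead of warming each tool individually. The MyApiToolset class, its shared httpx client, and the base URL are illustrative, not part of Haystack:

import httpx

from haystack.tools import Toolset


class MyApiToolset(Toolset):
    """Hypothetical Toolset whose tools share one HTTP client."""

    def warm_up(self):
        # Instead of the default behavior (warming up each tool individually),
        # set up a shared resource that all tools in this set can use.
        self._client = httpx.Client(base_url="https://api.example.com")


toolset = MyApiToolset(tools=[])  # tools omitted for brevity
toolset.warm_up()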

🐛 Bug Fixes

  • Fixed deserialization of state schema when it is None in Agent.from_dict.

  • Fixed a bug where components explicitly listed in include_outputs_from would not appear in the pipeline results if they returned an empty dictionary. Now, any component specified in include_outputs_from is included in the results regardless of whether its output is empty (the first sketch after this list shows basic usage).

  • Fixed type compatibility issue where passing list[Tool] to components with a tools parameter (such as ToolInvoker) caused static type checker errors.
    In version 2.19, the ToolsType was changed to Union[list[Union[Tool, Toolset]], Toolset] to support mixing Tools and Toolsets. However, due to Python's list invariance, list[Tool] was no longer considered compatible with list[Union[Tool, Toolset]], breaking type checking for the common pattern of passing a list of Tool objects.

    The fix explicitly lists all valid type combinations in ToolsType: Union[list[Tool], list[Toolset], list[Union[Tool, Toolset]], Toolset]. This preserves backward compatibility for existing code while still supporting the new functionality of mixing Tools and Toolsets.

    Users who encountered type errors like "Argument of type 'list[Tool]' cannot be assigned to parameter 'tools'" should no longer see these errors after upgrading. No code changes are required on the user side (the second sketch after this list illustrates the pattern).

  • When creating a pipeline snapshot, component inputs are now copied with _deepcopy_with_exceptions to avoid deep-copying items such as components and tools, which often contain attributes that cannot be deep-copied.
    For example, LinkContentFetcher holds an httpx.Client attribute that raises an error when deep-copied.
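
A sketch of include_outputs_from usage related to the fix above, assuming an OPENAI_API_KEY in the environment; the two-component pipeline is illustrative:

from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator

pipeline = Pipeline()
pipeline.add_component("prompt_builder", PromptBuilder(template="Answer briefly: {{question}}"))
pipeline.add_component("llm", OpenAIGenerator(model="gpt-4o-mini"))
pipeline.connect("prompt_builder", "llm")

# Components listed in include_outputs_from now always appear in the results,
# even when they return an empty dictionary.
result = pipeline.run(
    data={"prompt_builder": {"question": "What is Haystack?"}},
    include_outputs_from={"prompt_builder"},
)
print(result["prompt_builder"]["prompt"])
print(result["llm"]["replies"][0])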
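
A sketch illustrating the typing fix: a plain list[Tool] passed to a component's tools parameter now satisfies static type checkers; the add tool is made up for illustration:

from haystack.components.tools import ToolInvoker
from haystack.tools import Tool


def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


add_tool = Tool(
    name="add",
    description="Add two integers.",
    parameters={
        "type": "object",
        "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
        "required": ["a", "b"],
    },
    function=add,
)

# list[Tool] is again compatible with the ToolsType annotation in 2.20.
tools: list[Tool] = [add_tool]
invoker = ToolInvoker(tools=tools)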
