deepset-ai/haystack v2.25.0-rc1

Pre-release · 10 hours ago

⭐️ Highlights

🛠️ Dynamic Tool Discovery with SearchableToolset

For applications with large tool catalogs, we’ve added the SearchableToolset. Instead of exposing all tools upfront, agents start with a single search_tools function and dynamically discover relevant tools using BM25-based keyword search.

This is particularly useful when connecting MCP servers via MCPToolset, where many tools may be available. By combining the two, agents can load only the tools they actually need at runtime, reducing context usage and improving tool selection.

from haystack.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.tools import Tool, SearchableToolset

# Create a catalog of tools
catalog = [
    Tool(name="get_weather", description="Get weather for a city", ...),
    Tool(name="search_web", description="Search the web", ...),
    # ... 100s more tools
]
toolset = SearchableToolset(catalog=catalog)

agent = Agent(chat_generator=OpenAIChatGenerator(), tools=toolset)

# The agent is initially provided only with the search_tools tool and will use it to find relevant tools.
result = agent.run(messages=[ChatMessage.from_user("What's the weather in Milan?")])

📝 Reusable Prompt Templates for Agents

Agents now natively support Jinja2-templated user prompts. By defining a user_prompt and required_variables at initialization or at runtime, you can invoke the Agent with dynamic variables without manually building ChatMessage objects for every invocation. The rendered prompt can also be appended directly to prior conversation messages.

from haystack.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage

agent = Agent(
    chat_generator=OpenAIChatGenerator(),
    tools=tools,
    system_prompt="You are a helpful translation assistant.",
    user_prompt="""{% message role="user" %}
        Now summarize the conversation in {{ language }}.
    {% endmessage %}""",
    required_variables=["language"],
)

result = agent.run(
    messages=[
        ChatMessage.from_user("What are the main benefits of renewable energy?"),
        ChatMessage.from_assistant("Renewable energy reduces greenhouse gas emissions, decreases dependence on fossil fuels, and can lower long-term energy costs."),
    ],
    language="Spanish",
)

⬆️ Upgrade Notes

  • Removed the deprecated PipelineTemplate and PredefinedPipeline classes, along with the Pipeline.from_template() method. Users should migrate to Pipeline YAML files for similar functionality. See the [Serialization documentation](https://docs.haystack.deepset.ai/docs/serialization) for details on using YAML-based pipeline definitions.

  • Default Hugging Face pipeline task updated to text-generation

    The default task used by HuggingFaceLocalGenerator has been changed from text2text-generation to text-generation and the default model has been changed from "google/flan-t5-base" to "Qwen/Qwen3-0.6B".

    In transformers v5+, text2text-generation is no longer available as a valid pipeline task (see: huggingface/transformers#43256). While parts of the implementation still exist internally, it is no longer supported as a straightforward pipeline option.

    How to know if you are affected

    • You are using transformers>=5.0.0.
    • You explicitly set task="text2text-generation" in HuggingFaceLocalGenerator or HuggingFaceLocalChatGenerator.

    How to handle this change

    • Replace task="text2text-generation" with task="text-generation".
    • Ensure that the selected model is compatible with the text-generation pipeline (for example, causal language models).
    • If you rely on older behavior, pin transformers<5.
    • text2text-generation is now considered deprecated in Haystack and may be removed in a future release.

🚀 New Features

  • Added link_format parameter to PPTXToDocument and XLSXToDocument converters, allowing extraction of hyperlink addresses from PPTX and XLSX files.

    Supported formats:

    • "markdown": [text](url)
    • "plain": text (url)
    • "none" (default): Only text is extracted, link addresses are ignored.

    This follows the same pattern already available in DOCXToDocument.
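    The three formats can be sketched as follows; render_link is a hypothetical helper for illustration, not the converters' internal API:

    ```python
    def render_link(text: str, url: str | None, link_format: str = "none") -> str:
        """Illustrative sketch of the three link_format styles described above."""
        if url is None or link_format == "none":
            # "none" (and missing URLs) keep only the anchor text.
            return text
        if link_format == "markdown":
            return f"[{text}]({url})"
        if link_format == "plain":
            return f"{text} ({url})"
        raise ValueError(f"Unknown link_format: {link_format}")

    print(render_link("Haystack", "https://haystack.deepset.ai", "markdown"))
    # [Haystack](https://haystack.deepset.ai)
    ```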

  • Added a new LLM component (haystack.components.generators.chat.LLM) that provides a simplified interface for text generation powered by a large language model. The LLM component is a streamlined version of the Agent that focuses solely on single-turn text generation without tool usage. It supports system prompts, templated user prompts with required variables, streaming callbacks, and both synchronous (run) and asynchronous (run_async) execution.

    Usage example:

    from haystack.components.generators.chat import LLM
    from haystack.components.generators.chat import OpenAIChatGenerator
    from haystack.dataclasses import ChatMessage
    
    llm = LLM(
        chat_generator=OpenAIChatGenerator(),
        system_prompt="You are a helpful translation assistant.",
        user_prompt="""{% message role="user" %}
    Summarize the following document: {{ document }}
    {% endmessage %}""",
        required_variables=["document"],
    )
    
    result = llm.run(document="The weather is lovely today and the sun is shining. ")
    print(result["last_message"].text)
  • Added SearchableToolset to the haystack.tools module. This new toolset enables agents to dynamically discover tools from large catalogs using keyword-based (BM25) search. Instead of exposing all tools upfront (which can overwhelm LLMs with large tool definitions), agents start with a single search_tools function and progressively discover relevant tools as needed. For smaller catalogs, it operates in passthrough mode, exposing all tools directly.

    Key features include configurable search threshold for automatic passthrough mode and top-k result limiting.
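    As an illustration of the discovery idea only (the actual implementation uses BM25 ranking), a toy keyword scorer over a tool catalog might look like this; ToolSpec and this standalone search_tools are hypothetical stand-ins, not the Haystack API:

    ```python
    from dataclasses import dataclass

    @dataclass
    class ToolSpec:
        # Hypothetical stand-in for a Haystack Tool: just a name and description.
        name: str
        description: str

    def search_tools(catalog: list[ToolSpec], query: str, top_k: int = 3) -> list[str]:
        """Toy keyword scorer: count query terms appearing in name/description.
        The real SearchableToolset ranks with BM25; this only sketches the idea."""
        terms = query.lower().split()
        scored = []
        for tool in catalog:
            text = f"{tool.name} {tool.description}".lower()
            score = sum(1 for t in terms if t in text)
            if score > 0:
                scored.append((score, tool.name))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [name for _, name in scored[:top_k]]

    catalog = [
        ToolSpec("get_weather", "Get weather for a city"),
        ToolSpec("search_web", "Search the web"),
        ToolSpec("convert_currency", "Convert an amount between currencies"),
    ]
    print(search_tools(catalog, "weather in Milan"))  # ['get_weather']
    ```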

  • Added user_prompt and required_variables parameters to the Agent component. You can now define a reusable Jinja2-templated user prompt at initialization or at runtime, so the Agent can be invoked with different inputs without manually constructing ChatMessage objects each time.

    from haystack.components.agents import Agent
    from haystack.components.generators.chat import OpenAIChatGenerator
    
    agent = Agent(
        chat_generator=OpenAIChatGenerator(),
        tools=tools,
        system_prompt="You are a helpful translation assistant.",
        user_prompt="""{% message role="user" %}
        Translate the following document to {{ language }}: {{ document }}
        {% endmessage %}""",
        required_variables=["language", "document"],
    )
    
    result = agent.run(language="French", document="The weather is lovely today.")

    When you combine messages with user_prompt, the rendered user prompt is appended to the provided messages. This is useful for passing prior conversation context alongside a new templated query.

  • Added the FileToFileContent component, which converts local files into FileContent objects. These can be embedded into ChatMessage to pass to an LLM.

  • Added document_comparison_field parameter to DocumentMRREvaluator, DocumentMAPEvaluator, and DocumentRecallEvaluator.

    This allows users to compare documents using fields other than content, such as id or metadata keys (via meta.<key> syntax).

    Previously, all three evaluators hardcoded doc.content for comparison, which did not work well when documents were chunked or when ground truth was identified by custom metadata fields.
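    The field-resolution rule can be illustrated with a hypothetical resolver over a plain dict (resolve_field is not part of the Haystack API):

    ```python
    def resolve_field(doc: dict, field: str):
        """Mirror the comparison-field syntax described above: plain names like
        "content" or "id" read top-level fields, while "meta.<key>" reads from
        the metadata dictionary."""
        if field.startswith("meta."):
            return doc["meta"][field[len("meta."):]]
        return doc[field]

    doc = {"id": "42", "content": "chunk text", "meta": {"source_id": "report-7"}}
    print(resolve_field(doc, "meta.source_id"))  # report-7
    ```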

⚡️ Enhancement Notes

  • The LLMDocumentContentExtractor now extracts both content and metadata from image-based documents. When the LLM returns JSON, document_content fills the document body and the other keys are merged into metadata; plain text is still used as content. The field content_extraction_error is no longer used; when an error occurs, the field extraction_error is added to metadata with the error message.

  • Improved the deserialization error message for pipeline components to be more actionable and human-readable. The component data dictionary is now pretty-printed as formatted JSON, and the underlying error that caused the failure is explicitly surfaced, making it easier to quickly diagnose deserialization issues.
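    The shape of the improved message can be sketched with the standard library; the helper name and exact wording here are illustrative, not Haystack's actual output:

    ```python
    import json

    def deserialization_error(component_data: dict, cause: Exception) -> str:
        """Sketch of the improved message shape: pretty-printed component data
        plus the underlying cause surfaced explicitly."""
        pretty = json.dumps(component_data, indent=2)
        return f"Failed to deserialize component:\n{pretty}\nCaused by: {cause!r}"

    msg = deserialization_error(
        {"type": "my.Module.MyComponent", "init_parameters": {}},
        KeyError("model"),
    )
    print(msg)
    ```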

  • EmbeddingBasedDocumentSplitter and MultiQueryEmbeddingRetriever now automatically invoke warm_up() when run() is called if they have not been warmed up yet.
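    The lazy warm-up pattern can be sketched as follows (LazyComponent is illustrative only; the real components load models or build indexes in warm_up):

    ```python
    class LazyComponent:
        """Minimal sketch of the auto-warm-up behavior: run() triggers
        warm_up() once if it has not been called explicitly."""

        def __init__(self):
            self._warmed_up = False

        def warm_up(self):
            # Stand-in for expensive setup work.
            self._warmed_up = True

        def run(self, text: str) -> str:
            if not self._warmed_up:
                self.warm_up()
            return text.upper()

    component = LazyComponent()
    print(component.run("hello"))  # HELLO — no explicit warm_up() needed
    ```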

  • Improved ComponentTool to correctly handle components whose run method parameters are declared as top-level Optional types such as list[ChatMessage] | None. The optional wrapper is now unwrapped before checking for a from_dict method on the underlying type. As a result, when a parameter is typed as list[ChatMessage] | None and receives a list of dictionaries, ComponentTool will automatically coerce the input into a list of ChatMessage objects using ChatMessage.from_dict. If the provided value is None, the parameter is preserved as None.
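    The unwrapping step can be illustrated with the standard typing utilities (unwrap_optional is a hypothetical helper, not Haystack's internal function):

    ```python
    import types
    import typing

    def unwrap_optional(tp):
        """For a top-level optional like list[X] | None, return list[X];
        all other types pass through unchanged."""
        if typing.get_origin(tp) in (typing.Union, types.UnionType):
            args = [a for a in typing.get_args(tp) if a is not type(None)]
            if len(args) == 1:
                return args[0]
        return tp

    print(unwrap_optional(list[str] | None))  # list[str]
    ```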

  • Haystack now emits a Warning when dataclass instances (e.g. Document, ChatMessage, StreamingChunk, ByteStream, SparseEmbedding) are mutated in place. Modifying shared instances can cause unexpected behavior in other parts of the pipeline. Use dataclasses.replace to safely create updated copies instead.

    Instead of modifying attributes in place:

    from haystack.dataclasses import Document
    
    doc = Document(content="old text", meta={"key": "value"})
    
    # Not recommended: can affect other parts of the pipeline
    doc.content = "new text"

    Use dataclasses.replace to create a new instance with the updated values:

    from dataclasses import replace
    from haystack.dataclasses import Document
    
    doc = Document(content="old text", meta={"key": "value"})
    
    # Recommended: creates a new Document with updated content
    doc = replace(doc, content="new text")

🐛 Bug Fixes

  • Fixed an issue in OpenAIChatGenerator and OpenAIResponsesChatGenerator where passing a FileContent object without a filename would raise an error. A fallback filename is now automatically used instead.
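    The fallback can be sketched as follows; the helper name and default value are assumptions for illustration, not Haystack's actual internals:

    ```python
    def filename_or_fallback(filename: str | None, default: str = "file") -> str:
        """Use the provided filename when present, otherwise a fallback,
        so downstream APIs that require a filename do not fail."""
        return filename if filename else default

    print(filename_or_fallback(None))       # file
    print(filename_or_fallback("img.png"))  # img.png
    ```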

  • Ensure type display works correctly for parameterized generics when tracing is enabled. Previously, the haystack.component.input_spec and haystack.component.output_spec tag would strip the arguments present within a container type (e.g. list[str] would become "list"). Now we properly keep the arguments representation (e.g. list[str] becomes "list[str]").
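    The difference can be reproduced with plain Python: __name__ strips type arguments, while the full string form keeps them. type_repr below is an illustrative helper, not the tracing code itself:

    ```python
    def type_repr(tp) -> str:
        """Prefer the full string form when type arguments are present;
        __name__ would strip them (list[str] -> "list")."""
        if getattr(tp, "__args__", None):
            return str(tp)
        return getattr(tp, "__name__", str(tp))

    print(type_repr(list[str]))  # list[str]
    print(type_repr(str))        # str
    ```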

  • Previously, flexible pipeline connections were not robust when the sender component returned a list containing a union of types. In flexible pipeline connections, if the receiver type is str or ChatMessage, the first element of the list sent by the sender component is extracted and converted to the receiver type.

    In the previous version, list[str | int] was considered compatible with str, which should not be the case. In fact, the sender component can legitimately return a list where the first element has type int, which the receiver cannot handle.

    This is now fixed by ensuring that all possible element types of the sender list can be converted to the receiver type using the same conversion strategy.
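    The stricter check can be sketched with typing introspection (list_elements_compatible is illustrative, not Haystack's internal code):

    ```python
    import typing

    def list_elements_compatible(sender_tp, receiver_tp) -> bool:
        """A list sender type counts as convertible only when every element
        type in its (possibly union) element annotation matches the receiver."""
        if typing.get_origin(sender_tp) is not list:
            return False
        (element_tp,) = typing.get_args(sender_tp)
        # For a union element type, get_args yields its members; otherwise
        # treat the single element type as a one-member tuple.
        element_types = typing.get_args(element_tp) or (element_tp,)
        return all(t is receiver_tp for t in element_types)

    print(list_elements_compatible(list[str], str))        # True
    print(list_elements_compatible(list[str | int], str))  # False
    ```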

  • Improved device handling when loading Hugging Face models in TransformersSimilarityRanker and ExtractiveReader.

    In recent transformers versions, hf_device_map is no longer always present; it is only set when mixed-device loading is explicitly configured. The code has been updated to:

    • Check whether hf_device_map is available.
    • Fall back to the standard device attribute when it is not.

    This prevents attribute errors and ensures compatibility across different transformers configurations.
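    The fallback logic can be sketched as follows (FakeModel and resolve_device are illustrative stand-ins for a transformers model and the updated lookup):

    ```python
    class FakeModel:
        """Stand-in for a transformers model that only exposes .device."""
        device = "cpu"

    def resolve_device(model):
        # Prefer hf_device_map when it is set (mixed-device loading),
        # otherwise fall back to the plain device attribute.
        device_map = getattr(model, "hf_device_map", None)
        if device_map:
            return device_map
        return model.device

    print(resolve_device(FakeModel()))  # cpu
    ```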

  • Updated failing unit tests to align with recent mocking and transformers behavior changes.

  • PipelineRuntimeError exceptions raised by Agent now provide clearer ownership by explicitly surfacing the Agent as the failing pipeline component.

    As a result:

    • component_name now resolves to the name of the Agent in the pipeline, instead of the underlying chat_generator or tool_invoker.
    • component_type now resolves to haystack.components.agents.agent.Agent instead of the concrete generator class such as haystack.components.generators.chat.openai.OpenAIChatGenerator.
    • The error message now includes an additional outer section for the Agent component.

    Example of new error message:

    The following component failed to run:
    Component name: 'agent'
    Component type: 'Agent'
    Error: The following component failed to run:
    Component name: 'chat_generator'
    Component type: 'OpenAIChatGenerator'
    Error: Error code: 404 - {'error': {'message': 'The model ``gpt-4.2-mini`` does not exist or you do not have access to it.', 'type': 'invalid_request_error', 'param': None, 'code': 'model_not_found'}}
    

💙 Big thank you to everyone who contributed to this release!

@agnieszka-m, @Amanbig, @anakin87, @bilgeyucel, @bogdankostic, @davidsbatista, @edwiniac, @julian-risch, @kacperlukawski, @marc-mrt, @OGuggenbuehl, @OiPunk, @sjrl, @srini047, @vblagoje, @yaowubarbara
