github deepset-ai/haystack v2.18.0

15 hours ago

⭐️ Highlights

🔁 Pipeline Error Recovery with Snapshots

Pipelines now capture a snapshot of the last successful step when a run fails, including intermediate outputs. This lets you diagnose issues (e.g., failed tool calls), fix them, and resume from the checkpoint instead of restarting the entire run. Snapshots are currently supported for the synchronous Pipeline and Agent, but not yet for AsyncPipeline.

The snapshot is attached to the PipelineRuntimeError raised when the pipeline run fails, so wrap your pipeline.run() call in a try-except block to access it.

from haystack.core.errors import PipelineRuntimeError

try:
    pipeline.run(data=input_data)
except PipelineRuntimeError as exc:
    # The snapshot of the last successful step is attached to the exception
    snapshot = exc.pipeline_snapshot
    intermediate_outputs = snapshot.pipeline_state.pipeline_outputs

# The snapshot can be used to resume the Pipeline by passing it to run() via the snapshot argument
pipeline.run(data={}, snapshot=snapshot)

🧠 Structured Outputs for OpenAI/Azure OpenAI

OpenAIChatGenerator and AzureOpenAIChatGenerator support structured outputs via response_format (Pydantic model or JSON schema).

from pydantic import BaseModel
from haystack.components.generators.chat.openai import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage

class CalendarEvent(BaseModel):
    event_name: str
    event_date: str
    event_location: str

generator = OpenAIChatGenerator(generation_kwargs={"response_format": CalendarEvent})

message = "The Open NLP Meetup is going to be in Berlin at deepset HQ on September 19, 2025"
result = generator.run([ChatMessage.from_user(message)])
print(result["replies"][0].text)

# {"event_name":"Open NLP Meetup","event_date":"September 19","event_location":"deepset HQ, Berlin"}
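Because the reply comes back as a JSON string, it usually needs to be parsed before use. The following is a minimal sketch using only the standard library, with the reply text copied from the example output above. The EventRecord dataclass is a hypothetical stand-in for CalendarEvent; with Pydantic v2, CalendarEvent.model_validate_json(reply_text) would validate the same string in one step.

```python
import json
from dataclasses import dataclass


@dataclass
class EventRecord:
    """Hypothetical stand-in mirroring the fields of CalendarEvent."""

    event_name: str
    event_date: str
    event_location: str


# Reply text copied from the example output above.
reply_text = (
    '{"event_name":"Open NLP Meetup","event_date":"September 19",'
    '"event_location":"deepset HQ, Berlin"}'
)

# json.loads turns the reply into a dict; unpacking it into the dataclass
# raises TypeError if a field is missing or an unexpected one is present.
event = EventRecord(**json.loads(reply_text))
print(event.event_location)
```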

🛠️ Convert Pipelines into Tools with PipelineTool

The new PipelineTool lets you expose entire Haystack Pipelines as LLM-compatible tools. It simplifies the previous SuperComponent + ComponentTool pattern into a single abstraction and directly exposes input_mapping and output_mapping for fine-grained control.

from haystack import Pipeline
from haystack.tools import PipelineTool

retrieval_pipeline = Pipeline()
# retrieval_pipeline.add_component...
# ...

retrieval_tool = PipelineTool(
    pipeline=retrieval_pipeline,
    input_mapping={"query": ["bm25_retriever.query"]},
    output_mapping={"ranker.documents": "documents"},
    name="retrieval_tool",
    description="Use to retrieve documents",
)

🗺️ Runtime System Prompt for Agents

Agent’s system_prompt can now be updated dynamically at runtime for more flexible behavior.

🚀 New Features

  • OpenAIChatGenerator and AzureOpenAIChatGenerator now support structured outputs through the response_format parameter, which is passed in generation_kwargs. For non-streaming responses, response_format can be a Pydantic model or a JSON schema. For streaming responses, it must be a JSON schema. Example usage of the response_format parameter:

    from pydantic import BaseModel
    from haystack.components.generators.chat import OpenAIChatGenerator
    from haystack.dataclasses import ChatMessage
    
    class NobelPrizeInfo(BaseModel):
        recipient_name: str
        award_year: int
        category: str
        achievement_description: str
        nationality: str
    
    client = OpenAIChatGenerator(
        model="gpt-4o-2024-08-06",
        generation_kwargs={"response_format": NobelPrizeInfo}
    )
    
    response = client.run(messages=[
        ChatMessage.from_user("In 2021, American scientist David Julius received the Nobel Prize in"
        " Physiology or Medicine for his groundbreaking discoveries on how the human body"
        " senses temperature and touch.")
    ])
    print(response["replies"][0].text)
    # {"recipient_name":"David Julius","award_year":2021,"category":"Physiology or Medicine","achievement_description":"David Julius was awarded for his transformative findings regarding the molecular mechanisms underlying the human body's sense of temperature and touch. Through innovative experiments, he identified specific receptors responsible for detecting heat and mechanical stimuli, ranging from gentle touch to pain-inducing pressure.","nationality":"American"}
  • Added PipelineTool, a new tool wrapper that allows Haystack Pipelines to be exposed as LLM-compatible tools.

    • Previously, this was achievable by first wrapping a pipeline in a SuperComponent and then passing it to ComponentTool.
    • PipelineTool streamlines that pattern into a dedicated abstraction. It uses the same approach under the hood but directly exposes input_mapping and output_mapping so users can easily control which pipeline inputs and outputs are made available.
    • Automatically generates input schemas for LLM tool calling from pipeline inputs.
    • Extracts descriptions from underlying component docstrings for better tool documentation.
    • Can be passed directly to an Agent, enabling seamless integration of full pipelines as tools in multi-step reasoning workflows.
  • Added a reasoning field to StreamingChunk that optionally takes a ReasoningContent dataclass. This provides a structured way to pass reasoning content in streaming chunks.

  • If an error occurs during the execution of a pipeline, the pipeline raises a PipelineRuntimeError containing an error message and the component outputs up to the point of failure. This allows you to inspect and debug the pipeline up to that point.

  • LinkContentFetcher: add request_headers to allow custom per-request HTTP headers. Header precedence: httpx client defaults → component defaults → request_headers → rotating User-Agent. Also make HTTP/2 handling import-safe: if h2 isn’t installed, fall back to HTTP/1.1 with a warning. Thanks @xoaryaa. (Fixes #9064)

  • When an error occurs during a Pipeline run, a snapshot of the last successful step is now attached to the raised exception. The caller can catch the exception, inspect the snapshot to find the likely cause of the crash, and use it to resume the pipeline execution from that point onwards.

  • Add exclude_subdomains parameter to SerperDevWebSearch component. When set to True, this parameter restricts search results to only the exact domains specified in allowed_domains, excluding any subdomains. For example, with allowed_domains=["example.com"] and exclude_subdomains=True, results from "blog.example.com" or "shop.example.com" will be filtered out, returning only results from "example.com". The parameter defaults to False to maintain backward compatibility with existing behavior.
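
The exclude_subdomains behavior described in the last item can be illustrated with a small standard-library sketch. This is not SerperDevWebSearch's actual implementation: filter_by_domain is a hypothetical helper, and the assumption that matching is done on the URL's hostname is ours.

```python
from urllib.parse import urlparse


def filter_by_domain(urls, allowed_domains, exclude_subdomains=False):
    """Keep URLs whose host matches an allowed domain (hypothetical helper)."""
    kept = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        for domain in allowed_domains:
            domain = domain.lower()
            is_exact = host == domain
            is_subdomain = host.endswith("." + domain)
            # Subdomains are accepted only when exclude_subdomains is False.
            if is_exact or (is_subdomain and not exclude_subdomains):
                kept.append(url)
                break
    return kept


urls = [
    "https://example.com/docs",
    "https://blog.example.com/post",
    "https://shop.example.com/cart",
    "https://other.org/",
]

# Only the exact domain survives when subdomains are excluded.
strict = filter_by_domain(urls, ["example.com"], exclude_subdomains=True)
# The default keeps the exact domain and its subdomains.
lenient = filter_by_domain(urls, ["example.com"])
```

In this sketch a host such as "www.example.com" counts as a subdomain; whether the real component treats it that way is not stated in these notes.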

⚡️ Enhancement Notes

  • Added system_prompt to agent run parameters to enhance customization and control over agent behavior.
  • The internal Agent logic was refactored to improve readability and maintainability. This should help developers understand and extend the internal Agent logic moving forward.

🐛 Bug Fixes

  • Reintroduce verbose error message when deserializing a ChatMessage with invalid content parts. While LLMs may still generate messages in the wrong format, this error provides guidance on the expected structure, making retries easier and more reliable during agent runs. The error message was unintentionally removed during a previous refactoring.
  • The English and German abbreviation files used by the SentenceSplitter are now included in the distribution. They were previously missing due to a config in the .gitignore file.
  • Preserve explicit lambda_threshold=0.0 in SentenceTransformersDiversityRanker instead of overriding it with 0.5 due to short-circuit evaluation.
  • Fix MetaFieldGroupingRanker to still work when subgroup_by values are unhashable types like list. We handle this by stringifying the contents of doc.meta[subgroup_by] in the same way we do for values of doc.meta[group_by].
  • Fixed missing trace parentage for tools executed via the synchronous ToolInvoker path. Updated ToolInvoker.run() to propagate contextvars into ThreadPoolExecutor workers, ensuring all tool spans (ComponentTool, Agent wrapped in ComponentTool, or custom tools) are correctly linked to the outer Agent's trace instead of starting new root traces. This improves end-to-end observability across the entire tool execution chain.
  • Fixed the from_dict method of MetadataRouter so the output_type parameter introduced in Haystack 2.17 is now optional when loading from YAML. This ensures compatibility with older Haystack pipelines.
  • In OpenAIChatGenerator, improved the logic to exclude unsupported custom tool calls. The previous implementation caused compatibility issues with the Mistral Haystack core integration, which extends OpenAIChatGenerator.
  • Fixed parameter schema generation in ComponentTool when using inputs_from_state. Previously, parameters were only removed from the schema if the state key and parameter name matched exactly. For example, inputs_from_state={"text": "text"} removed text as expected, but inputs_from_state={"state_text": "text"} did not. This is now resolved, and such cases work as intended.
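
The inputs_from_state fix in the last item can be sketched in plain Python. This is an illustration under assumptions, not ComponentTool's code: visible_parameters is a hypothetical helper, and the schema is reduced to a dict of parameter names. The point is that inputs_from_state maps a state key to the parameter it feeds, so the parameters to hide are the mapping's values rather than its keys.

```python
def visible_parameters(schema_properties, inputs_from_state):
    """Drop parameters that are filled from state (hypothetical helper)."""
    # The mapping is {state_key: parameter_name}; hide by parameter name.
    fed_from_state = set(inputs_from_state.values())
    return {
        name: spec
        for name, spec in schema_properties.items()
        if name not in fed_from_state
    }


properties = {"text": {"type": "string"}, "top_k": {"type": "integer"}}

# State key and parameter name match: "text" is hidden.
same_name = visible_parameters(properties, {"text": "text"})
# State key differs from the parameter name: "text" is still hidden.
renamed = visible_parameters(properties, {"state_text": "text"})
```

Filtering on the keys instead would hide "text" only in the first case, which matches the behavior the release note describes as the previous bug.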

💙 Big thank you to everyone who contributed to this release!

@Amnah199, @Ujjwal-Bajpayee, @abdokaseb, @anakin87, @davidsbatista, @dfokina, @rigved-telang, @sjrl, @tstadel, @vblagoje, @xoaryaa
