deepset-ai/haystack v2.19.0

⭐️ Highlights

🛡️ Try Multiple LLMs with FallbackChatGenerator

Introduced FallbackChatGenerator, a resilient chat generator that runs multiple LLMs sequentially and automatically falls back when one fails. It tries each generator in order until one succeeds, handling errors like timeouts, rate limits, or server issues. Ideal for building robust, production-grade chat systems that stay responsive across providers.

from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.google_genai import GoogleGenAIChatGenerator
from haystack_integrations.components.generators.anthropic import AnthropicChatGenerator
from haystack.components.generators.chat.openai import OpenAIChatGenerator
from haystack.components.generators.chat.fallback import FallbackChatGenerator

anthropic_generator = AnthropicChatGenerator(model="claude-sonnet-4-5", timeout=1) # force failure with low timeout
google_generator = GoogleGenAIChatGenerator(model="gemini-2.5-flashy") # force failure with typo in model name
openai_generator = OpenAIChatGenerator(model="gpt-4o-mini") # success

chat_generator = FallbackChatGenerator(chat_generators=[anthropic_generator, google_generator, openai_generator])
response = chat_generator.run(messages=[ChatMessage.from_user("What is the plot twist in Shawshank Redemption?")])

print("Successful ChatGenerator: ", response["meta"]["successful_chat_generator_class"])
print("Response: ", response["replies"][0].text)

Output:

WARNING:haystack.components.generators.chat.fallback:ChatGenerator AnthropicChatGenerator failed with error: Request timed out or interrupted...
WARNING:haystack.components.generators.chat.fallback:ChatGenerator GoogleGenAIChatGenerator failed with error: Error in Google Gen AI chat generation: 404 NOT_FOUND...
Successful ChatGenerator:   OpenAIChatGenerator
Response:  In "The Shawshank Redemption," ....

🛠️ Mix Tool and Toolset in Agents

You can now combine both Tool and Toolset objects in the same tools list for Agent and ToolInvoker components. This update brings more flexibility, letting you organize tools into logical groups while still adding standalone tools in one go.

from haystack.components.agents import Agent
from haystack.tools import Tool, Toolset

math_toolset = Toolset([add_tool, multiply_tool])
weather_toolset = Toolset([weather_tool, forecast_tool])

agent = Agent(
    chat_generator=generator,
    tools=[math_toolset, weather_toolset, calendar_tool],  # ✨ Now supported!
)

⚙️ Faster Agents with Tool Warmup

Tool and Toolset objects can now perform initialization during Agent or ToolInvoker warmup. This allows setup tasks such as connecting to databases, loading models, or initializing connection pools before the first use.

from haystack.tools import Toolset
from haystack.components.agents import Agent

# Custom toolset with initialization needs
# (query_tool, update_tool, and create_connection_pool are assumed
# to be defined elsewhere)
class DatabaseToolset(Toolset):
    def __init__(self, connection_string):
        self.connection_string = connection_string
        self.pool = None
        super().__init__([query_tool, update_tool])

    def warm_up(self):
        # Initialize the connection pool once, before the first tool call
        self.pool = create_connection_pool(self.connection_string)
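To make the warm-up contract concrete, here is a minimal, self-contained sketch of the flow described above. The `SketchToolset` and `SketchInvoker` classes are simplified stand-ins for illustration only, not Haystack's actual `Toolset` or `ToolInvoker`:

```python
# Stand-in for an expensive resource such as a database connection pool.
class FakePool:
    def __init__(self, conn_str):
        self.conn_str = conn_str

def create_connection_pool(conn_str):
    return FakePool(conn_str)

class SketchToolset:
    def __init__(self, connection_string):
        self.connection_string = connection_string
        self.pool = None

    def warm_up(self):
        # Expensive setup happens once, before the first tool call.
        self.pool = create_connection_pool(self.connection_string)

class SketchInvoker:
    def __init__(self, tools):
        self.tools = tools
        self._warmed = False

    def warm_up(self):
        # Mirrors the invoker calling warm_up() on each tool
        # during its own warm-up phase.
        if not self._warmed:
            for tool in self.tools:
                tool.warm_up()
            self._warmed = True

toolset = SketchToolset("postgres://localhost/db")
invoker = SketchInvoker([toolset])
invoker.warm_up()
print(toolset.pool is not None)  # True
```

The point of the pattern is that resource acquisition is deferred out of `__init__` (which must stay cheap and serializable) into an explicit, idempotent warm-up step.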

🚀 New Features

  • Updated serialization and deserialization of PipelineSnapshots to work with Python Enum classes.

  • Added FallbackChatGenerator, which automatically tries different chat generators and returns the first successful response, along with detailed information about which providers were tried.

  • Added pipeline_snapshot and pipeline_snapshot_file_path parameters to BreakpointException to provide more context when a pipeline breakpoint is triggered.
    Added pipeline_snapshot_file_path parameter to PipelineRuntimeError to include a reference to the stored pipeline snapshot so it can be easily found.

  • Added a new RegexTextExtractor component, which extracts text from chat messages or string inputs based on a custom regex pattern.
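Conceptually, the extraction works like the following sketch, written with Python's re module directly; this illustrates the idea only and is not RegexTextExtractor's actual interface:

```python
import re

# Illustrative pattern: pull the text between <answer> tags
# out of a model reply.
pattern = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)

def extract(text: str) -> str:
    match = pattern.search(text)
    return match.group(1) if match else ""

reply = "Some chain of reasoning... <answer>42</answer>"
print(extract(reply))  # 42
```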

  • CSVToDocument: added conversion_mode='row' with an optional content_column. Each row becomes a Document, with the remaining columns stored in meta; the default 'file' mode is preserved.
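The row-mode behavior described above can be sketched with the stdlib csv module; this mimics the idea (one document per row, content from content_column, the rest as metadata) and is not CSVToDocument's implementation:

```python
import csv
import io

def rows_to_documents(csv_text: str, content_column: str):
    # Each CSV row becomes one document-like dict: the chosen column
    # supplies the content, remaining columns become metadata.
    docs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        content = row.pop(content_column)
        docs.append({"content": content, "meta": row})
    return docs

data = "title,body\nGreeting,Hello world\nFarewell,Goodbye"
docs = rows_to_documents(data, content_column="body")
print(docs[0])  # {'content': 'Hello world', 'meta': {'title': 'Greeting'}}
```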

  • Added the ability to resume an Agent from an AgentSnapshot while specifying a new breakpoint in the same run call. This allows stepwise debugging and precise control over chat generator inputs and tool inputs before execution, improving flexibility when inspecting intermediate states. This addresses a previous limitation where passing both a snapshot and a breakpoint simultaneously would throw an exception.

  • Introduced SentenceTransformersSparseTextEmbedder and SentenceTransformersSparseDocumentEmbedder components. These components embed text and documents using sparse embedding models compatible with Sentence Transformers. Sparse embeddings are interpretable, efficient when used with inverted indexes, combine classic information retrieval with neural models, and are complementary to dense embeddings. Currently, the produced SparseEmbedding objects are compatible with the QdrantDocumentStore.

    Usage example:

    from haystack.components.embedders import SentenceTransformersSparseTextEmbedder
    
    text_embedder = SentenceTransformersSparseTextEmbedder()
    text_embedder.warm_up()
    
    print(text_embedder.run("I love pizza!"))
    # {'sparse_embedding': SparseEmbedding(indices=[999, 1045, ...], values=[0.918, 0.867, ...])}
  • Added a warm_up() function to the Tool dataclass, allowing tools to perform resource-intensive initialization before execution. Tools and Toolsets can now override the warm_up() method to establish connections to remote services, load models, or perform other preparatory operations. The ToolInvoker and Agent automatically call warm_up() on their tools during their own warm-up phase, ensuring tools are ready before use.

  • Fixed a serialization issue related to function objects in a pipeline; they are now converted to type None, since functions cannot be serialized. This issue was preventing breakpoints from being successfully set in agents and used as resume points. If an error occurs during Agent execution, for instance during tool calling, a snapshot of the last successful step is raised, allowing the caller to catch it, inspect the possible cause of the crash, and resume pipeline execution from that point onwards.

⚡️ Enhancement Notes

  • Added a tools parameter to the Agent's run method to enhance the agent's flexibility. Users can now choose a subset of the configured tools at runtime by providing a list of tool names, or supply an entirely new set by passing Tool objects or a Toolset.
  • Enhanced the tools parameter across all tool-accepting components (Agent, ToolInvoker, OpenAIChatGenerator, AzureOpenAIChatGenerator, HuggingFaceAPIChatGenerator, HuggingFaceLocalChatGenerator) to accept either a mixed list of Tool and Toolset objects or just a Toolset object. Previously, components required either a list of Tool objects OR a single Toolset, but not both in the same list. Now users can organize tools into logical Toolsets while also including standalone Tool objects, providing greater flexibility in tool organization. For example: Agent(chat_generator=generator, tools=[math_toolset, weather_toolset, standalone_tool]). This change is fully backward compatible and preserves structure during serialization/deserialization, enabling proper round-trip support for mixed tool configurations.
  • Refactored _save_pipeline_snapshot to consolidate try-except logic and added a raise_on_failure option to control whether save failures raise an exception or are logged. _create_pipeline_snapshot now wraps _serialize_value_with_schema in try-except blocks to prevent failures from non-serializable pipeline inputs.
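The runtime tool selection described in the first enhancement above amounts to filtering the configured tools by name. A minimal stand-alone sketch (SketchTool and select_tools are illustrative stand-ins, not Haystack's Tool dataclass or Agent internals):

```python
class SketchTool:
    def __init__(self, name):
        self.name = name

# Tools configured on the agent at construction time.
configured = [SketchTool("add"), SketchTool("multiply"), SketchTool("weather")]

def select_tools(tools, names):
    # Keep only the configured tools whose names were requested at run time.
    by_name = {tool.name: tool for tool in tools}
    return [by_name[name] for name in names]

chosen = select_tools(configured, ["add", "weather"])
print([tool.name for tool in chosen])  # ['add', 'weather']
```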

🐛 Bug Fixes

  • Fixed the Agent run_async method to correctly handle async streaming callbacks; previously, a bug caused these to raise errors.
  • Prevent duplication of the last assistant message in the chat history when initializing from an AgentSnapshot.
  • We were setting response_format to None in OpenAIChatGenerator by default which doesn't follow the API spec. We now omit the variable if response_format is not passed by the user.
  • Ensure that the OpenAIChatGenerator is properly serialized when response_format in generation_kwargs is provided as a dictionary (for example, {"type": "json_object"}). Previously, this caused serialization errors.
  • Fixed parameter schema generation in ComponentTool when using inputs_from_state. Previously, parameters were only removed from the schema if the state key and parameter name matched exactly. For example, inputs_from_state={"text": "text"} removed text as expected, but inputs_from_state={"state_text": "text"} did not. This is now resolved, and such cases work as intended.
  • Refactored SentenceTransformersEmbeddingBackend to ensure unique embedding IDs by incorporating all relevant arguments.
  • Fixed Agent to correctly raise a BreakpointException when a ToolBreakpoint with a specific tool_name is provided and the assistant chat message contains multiple tool calls.
  • The OpenAIChatGenerator implementation uses ChatCompletionMessageCustomToolCall, which is only available in OpenAI client >=1.99.2. We now require openai>=1.99.2.
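The inputs_from_state fix above can be illustrated with a stand-alone sketch: a parameter is dropped from the tool's schema whenever state supplies it, keyed by the mapped parameter name rather than the state key. Here prune_schema is a hypothetical helper for illustration, not ComponentTool's actual code:

```python
def prune_schema(properties: dict, inputs_from_state: dict) -> dict:
    # inputs_from_state maps state keys to parameter names; any parameter
    # fed from state should not appear in the LLM-facing schema, even when
    # the state key differs from the parameter name.
    supplied = set(inputs_from_state.values())
    return {name: spec for name, spec in properties.items() if name not in supplied}

schema = {"text": {"type": "string"}, "limit": {"type": "integer"}}
print(prune_schema(schema, {"state_text": "text"}))
# {'limit': {'type': 'integer'}}
```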

💙 Big thank you to everyone who contributed to this release!

@anakin87, @bilgeyucel, @davidsbatista, @dfokina, @Ryzhtus, @sjrl, @srini047, @tstadel, @vblagoje, @xoaryaa
