⭐️ Highlights
🛡️ Try Multiple LLMs with FallbackChatGenerator
Introduced `FallbackChatGenerator`, a resilient chat generator that runs multiple LLMs sequentially and automatically falls back when one fails. It tries each generator in order until one succeeds, handling errors like timeouts, rate limits, or server issues. Ideal for building robust, production-grade chat systems that stay responsive across providers.
```python
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.google_genai import GoogleGenAIChatGenerator
from haystack_integrations.components.generators.anthropic import AnthropicChatGenerator
from haystack.components.generators.chat.openai import OpenAIChatGenerator
from haystack.components.generators.chat.fallback import FallbackChatGenerator

anthropic_generator = AnthropicChatGenerator(model="claude-sonnet-4-5", timeout=1)  # force failure with low timeout
google_generator = GoogleGenAIChatGenerator(model="gemini-2.5-flashy")  # force failure with typo in model name
openai_generator = OpenAIChatGenerator(model="gpt-4o-mini")  # success

chat_generator = FallbackChatGenerator(chat_generators=[anthropic_generator, google_generator, openai_generator])
response = chat_generator.run(messages=[ChatMessage.from_user("What is the plot twist in Shawshank Redemption?")])

print("Successful ChatGenerator: ", response["meta"]["successful_chat_generator_class"])
print("Response: ", response["replies"][0].text)
```
Output:

```
WARNING:haystack.components.generators.chat.fallback:ChatGenerator AnthropicChatGenerator failed with error: Request timed out or interrupted...
WARNING:haystack.components.generators.chat.fallback:ChatGenerator GoogleGenAIChatGenerator failed with error: Error in Google Gen AI chat generation: 404 NOT_FOUND...
Successful ChatGenerator: OpenAIChatGenerator
Response: In "The Shawshank Redemption," ....
```
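Under the hood, a sequential fallback of this kind comes down to trying each generator in order and catching failures until one succeeds. A minimal stdlib sketch of that loop (illustrative only, not Haystack's actual implementation):

```python
# Minimal sketch of a sequential fallback loop (illustrative, not
# Haystack's actual implementation). Each "generator" is any callable
# that either returns a reply or raises an exception.
def run_with_fallback(generators, prompt):
    errors = []
    for gen in generators:
        try:
            return {"reply": gen(prompt), "successful_generator": gen.__name__}
        except Exception as exc:  # timeout, rate limit, bad model name, ...
            errors.append(f"{gen.__name__}: {exc}")
    raise RuntimeError("All generators failed: " + "; ".join(errors))

def flaky(prompt):
    raise TimeoutError("Request timed out")

def healthy(prompt):
    return f"echo: {prompt}"

result = run_with_fallback([flaky, healthy], "hello")
print(result["successful_generator"])  # healthy
```

The first generator raises, the error is recorded, and the loop moves on; only if every generator fails does the caller see an exception.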
🛠️ Mix Tool and Toolset in Agents
You can now combine both `Tool` and `Toolset` objects in the same `tools` list for `Agent` and `ToolInvoker` components. This update brings more flexibility, letting you organize tools into logical groups while still adding standalone tools in one go.
```python
from haystack.components.agents import Agent
from haystack.tools import Tool, Toolset

# add_tool, multiply_tool, weather_tool, forecast_tool, and calendar_tool
# are Tool objects defined elsewhere.
math_toolset = Toolset([add_tool, multiply_tool])
weather_toolset = Toolset([weather_tool, forecast_tool])

agent = Agent(
    chat_generator=generator,
    tools=[math_toolset, weather_toolset, calendar_tool],  # ✨ Now supported!
)
```
⚙️ Faster Agents with Tool Warmup
`Tool` and `Toolset` objects can now perform initialization during `Agent` or `ToolInvoker` warmup. This allows setup tasks such as connecting to databases, loading models, or initializing connection pools before the first use.
```python
from haystack.tools import Toolset
from haystack.components.agents import Agent

# Custom toolset with initialization needs.
# query_tool, update_tool, and create_connection_pool are placeholders
# defined elsewhere.
class DatabaseToolset(Toolset):
    def __init__(self, connection_string):
        self.connection_string = connection_string
        self.pool = None
        super().__init__([query_tool, update_tool])

    def warm_up(self):
        # Initialize the connection pool before first use
        self.pool = create_connection_pool(self.connection_string)
```
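The warm-up lifecycle itself is simple: the invoker calls `warm_up()` on every tool once, before anything is executed. A minimal stdlib sketch of that contract (illustrative only; the class names here are invented and not part of Haystack):

```python
# Illustrative sketch of the warm-up lifecycle (not Haystack's actual code).
class SketchTool:
    def __init__(self, name):
        self.name = name
        self.ready = False

    def warm_up(self):
        # Expensive one-time setup would go here (connections, models, ...).
        self.ready = True

    def invoke(self):
        assert self.ready, "warm_up() must run before first use"
        return f"{self.name} ok"

class SketchInvoker:
    def __init__(self, tools):
        self.tools = tools

    def warm_up(self):
        # The invoker warms up all of its tools during its own warm-up phase.
        for tool in self.tools:
            tool.warm_up()

invoker = SketchInvoker([SketchTool("query"), SketchTool("update")])
invoker.warm_up()  # all tools are initialized once, up front
results = [tool.invoke() for tool in invoker.tools]
print(results)  # ['query ok', 'update ok']
```

This is why the setup cost is paid once at warm-up time rather than on the first tool call.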
🚀 New Features
- Updated serialization and deserialization of `PipelineSnapshot` objects to work with Python `Enum` classes.
- Added `FallbackChatGenerator`, which automatically retries different chat generators and returns the first successful response, along with detailed information about which providers were tried.
- Added `pipeline_snapshot` and `pipeline_snapshot_file_path` parameters to `BreakpointException` to provide more context when a pipeline breakpoint is triggered. Also added a `pipeline_snapshot_file_path` parameter to `PipelineRuntimeError` to include a reference to the stored pipeline snapshot so it can be easily found.
- Added a new component, `RegexTextExtractor`, which extracts text from chat messages or string inputs based on a custom regex pattern.
- `CSVToDocument`: added `conversion_mode='row'` with an optional `content_column`. Each row becomes a `Document`, with the remaining columns stored in `meta`; the default `'file'` mode is preserved.
- Added the ability to resume an `Agent` from an `AgentSnapshot` while specifying a new breakpoint in the same run call. This allows stepwise debugging and precise control over chat generator inputs and tool inputs before execution, improving flexibility when inspecting intermediate states. It addresses a previous limitation where passing both a snapshot and a breakpoint simultaneously would raise an exception.
- Introduced the `SentenceTransformersSparseTextEmbedder` and `SentenceTransformersSparseDocumentEmbedder` components, which embed text and documents using sparse embedding models compatible with Sentence Transformers. Sparse embeddings are interpretable, efficient when used with inverted indexes, combine classic information retrieval with neural models, and are complementary to dense embeddings. Currently, the produced `SparseEmbedding` objects are compatible with the `QdrantDocumentStore`. Usage example:

  ```python
  from haystack.components.embedders import SentenceTransformersSparseTextEmbedder

  text_embedder = SentenceTransformersSparseTextEmbedder()
  text_embedder.warm_up()
  print(text_embedder.run("I love pizza!"))
  # {'sparse_embedding': SparseEmbedding(indices=[999, 1045, ...], values=[0.918, 0.867, ...])}
  ```

- Added a `warm_up()` method to the `Tool` dataclass, allowing tools to perform resource-intensive initialization before execution. Tools and Toolsets can now override `warm_up()` to establish connections to remote services, load models, or perform other preparatory operations. `ToolInvoker` and `Agent` automatically call `warm_up()` on their tools during their own warm-up phase, ensuring tools are ready before use.
- Fixed a serialization issue with function objects in a pipeline: functions cannot be serialized, so they are now converted to `None`. This issue was preventing breakpoints from being set in agents and used as resume points. If an error occurs during agent execution, for instance during tool calling, a snapshot of the last successful step is raised so the caller can catch it, inspect the likely cause of the crash, and use it to resume the pipeline from that point onward.
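Sparse embeddings like the `SparseEmbedding(indices=..., values=...)` objects above pair naturally with inverted indexes because similarity is a dot product over the indices two vectors share. A minimal sketch of that scoring step (illustrative only, not Haystack's or Qdrant's implementation):

```python
# Illustrative sketch of sparse-vector scoring (not Haystack's or
# Qdrant's actual code). Each vector is a dict with parallel
# "indices" and "values" lists, like SparseEmbedding above.
def sparse_dot(query, doc):
    # Only indices present in both vectors contribute to the score.
    doc_map = dict(zip(doc["indices"], doc["values"]))
    return sum(
        value * doc_map[index]
        for index, value in zip(query["indices"], query["values"])
        if index in doc_map
    )

query = {"indices": [999, 1045], "values": [0.9, 0.8]}
doc = {"indices": [10, 999], "values": [0.5, 1.0]}
print(sparse_dot(query, doc))  # only index 999 overlaps: 0.9 * 1.0 = 0.9
```

Because most entries are zero, an inverted index only needs to visit documents that share at least one active index with the query, which is what makes sparse retrieval efficient.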
⚡️ Enhancement Notes
- Added `tools` to agent run parameters to enhance the agent's flexibility. Users can now choose a subset of tools for the agent at runtime by providing a list of tool names, or supply an entirely new set by passing `Tool` objects or a `Toolset`.
- Enhanced the `tools` parameter across all tool-accepting components (`Agent`, `ToolInvoker`, `OpenAIChatGenerator`, `AzureOpenAIChatGenerator`, `HuggingFaceAPIChatGenerator`, `HuggingFaceLocalChatGenerator`) to accept either a mixed list of `Tool` and `Toolset` objects or a single `Toolset`. Previously, components required either a list of `Tool` objects or a single `Toolset`, but not both in the same list. Now users can organize tools into logical Toolsets while also including standalone `Tool` objects, for example: `Agent(chat_generator=generator, tools=[math_toolset, weather_toolset, standalone_tool])`. This change is fully backward compatible and preserves structure during serialization/deserialization, enabling proper round-trip support for mixed tool configurations.
- Refactored `_save_pipeline_snapshot` to consolidate try-except logic and added a `raise_on_failure` option to control whether save failures raise an exception or are logged. `_create_pipeline_snapshot` now wraps `_serialize_value_with_schema` in try-except blocks to prevent failures from non-serializable pipeline inputs.
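One natural way to handle such a mixed list of tools and toolsets is to flatten toolsets into their member tools before invocation. A minimal stdlib sketch of that normalization (illustrative only; these stand-in classes are not Haystack's actual `Tool`/`Toolset`):

```python
# Illustrative sketch of normalizing a mixed tools list
# (stand-in classes, not Haystack's actual implementation).
class Tool:
    def __init__(self, name):
        self.name = name

class Toolset:
    def __init__(self, tools):
        self.tools = tools

def flatten_tools(items):
    flat = []
    for item in items:
        if isinstance(item, Toolset):
            flat.extend(item.tools)  # unpack the logical group
        else:
            flat.append(item)  # standalone tool passes through
    return flat

math_toolset = Toolset([Tool("add"), Tool("multiply")])
standalone = Tool("calendar")
names = [tool.name for tool in flatten_tools([math_toolset, standalone])]
print(names)  # ['add', 'multiply', 'calendar']
```

The consuming component then sees one uniform list of tools, while serialization can keep the original grouping for round-tripping.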
🐛 Bug Fixes
- Fixed the Agent `run_async` method to correctly handle async streaming callbacks, which previously triggered errors due to a bug.
- Prevented duplication of the last assistant message in the chat history when initializing from an `AgentSnapshot`.
- `OpenAIChatGenerator` previously set `response_format` to `None` by default, which does not follow the API spec. The variable is now omitted if `response_format` is not passed by the user.
- Ensured that `OpenAIChatGenerator` is properly serialized when `response_format` in `generation_kwargs` is provided as a dictionary (for example, `{"type": "json_object"}`). Previously, this caused serialization errors.
- Fixed parameter schema generation in `ComponentTool` when using `inputs_from_state`. Previously, parameters were only removed from the schema if the state key and parameter name matched exactly. For example, `inputs_from_state={"text": "text"}` removed `text` as expected, but `inputs_from_state={"state_text": "text"}` did not. This is now resolved, and such cases work as intended.
- Refactored `SentenceTransformersEmbeddingBackend` to ensure unique embedding IDs by incorporating all relevant arguments.
- Fixed the Agent to correctly raise a `BreakpointException` when a `ToolBreakpoint` with a specific `tool_name` is provided in an assistant chat message containing multiple tool calls.
- The `OpenAIChatGenerator` implementation uses `ChatCompletionMessageCustomToolCall`, which is only available in OpenAI client `>=1.99.2`. We now require `openai>=1.99.2`.
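The `inputs_from_state` fix above comes down to pruning schema parameters by the mapped *parameter* name (the mapping value) rather than the state key. A minimal sketch with a hypothetical helper name (illustrative only, not Haystack's actual code):

```python
# Illustrative sketch of the inputs_from_state schema fix
# (hypothetical helper; not Haystack's actual implementation).
def prune_schema(properties, inputs_from_state):
    # Parameters injected from state must not be exposed to the LLM.
    # Prune by the mapping *values* (parameter names), not the keys.
    injected = set(inputs_from_state.values())
    return {name: spec for name, spec in properties.items() if name not in injected}

schema = {"text": {"type": "string"}, "top_k": {"type": "integer"}}

# State key "state_text" feeds the "text" parameter; "text" is pruned
# even though the state key and parameter name differ.
pruned = prune_schema(schema, {"state_text": "text"})
print(pruned)  # {'top_k': {'type': 'integer'}}
```

Pruning by the key instead of the value is exactly the bug described: `{"state_text": "text"}` would have left `text` in the schema.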
💙 Big thank you to everyone who contributed to this release!
@anakin87, @bilgeyucel, @davidsbatista, @dfokina, @Ryzhtus, @sjrl, @srini047, @tstadel, @vblagoje, @xoaryaa