github deepset-ai/haystack v2.23.0-rc1

Pre-release · 8 hours ago

Release Notes

v2.23.0-rc1

Upgrade Notes

  • Pipeline snapshot file saving is now disabled by default. You must explicitly enable it by setting the environment variable HAYSTACK_PIPELINE_SNAPSHOT_SAVE_ENABLED=true.
  • Remove backward-compatibility support for deserializing pipeline snapshots with the old pipeline_outputs format. Pipeline snapshots created before Haystack 2.22.0 that contain pipeline_outputs without the serialization_schema and serialized_data structure are no longer supported. Users should recreate their pipeline snapshots with the current Haystack version before upgrading to 2.23.0.
  • The return_empty_on_no_match parameter has been fully removed from the RegexTextExtractor component. In Haystack 2.22.0, this parameter was ignored. Starting with Haystack 2.23.0, passing this parameter during component initialization will raise an error. During pipeline deserialization, the parameter is ignored to avoid breaking existing pipelines.

New Features

  • Added support for human confirmation on Agent tool calls. You can now configure per-tool confirmation strategies when building an Agent, including always requiring confirmation, never requiring it, or requesting confirmation only on first use. The confirmation experience is customizable through different UI strategies, allowing fine-grained control over how and when users approve tool executions.

    from rich.console import Console

    from haystack.components.agents import Agent
    from haystack.components.generators.chat import OpenAIChatGenerator
    # Note: the import path of the confirmation classes below is indicative
    # and may differ in your Haystack version.
    from haystack.components.agents.human_in_the_loop import (
        AlwaysAskPolicy,
        AskOncePolicy,
        BlockingConfirmationStrategy,
        NeverAskPolicy,
        RichConsoleUI,
        SimpleConsoleUI,
    )

    cons = Console()

    agent = Agent(
        chat_generator=OpenAIChatGenerator(model="gpt-4.1"),
        tools=[balance_tool, addition_tool, phone_tool],
        system_prompt=(
            "You are a helpful financial assistant. "
            "Use the provided tool to get bank balances when needed."
        ),
        confirmation_strategies={
            balance_tool.name: BlockingConfirmationStrategy(
                confirmation_policy=AlwaysAskPolicy(),
                confirmation_ui=RichConsoleUI(console=cons),
            ),
            addition_tool.name: BlockingConfirmationStrategy(
                confirmation_policy=NeverAskPolicy(),
                confirmation_ui=SimpleConsoleUI(),
            ),
            phone_tool.name: BlockingConfirmationStrategy(
                confirmation_policy=AskOncePolicy(),
                confirmation_ui=SimpleConsoleUI(),
            ),
        },
    )
  • Added a snapshot_callback parameter to Pipeline.run() that allows users to customize how pipeline snapshots are handled. When a callback is provided, it is invoked instead of the default file-saving behavior whenever a snapshot is created (e.g., during breakpoints or error handling). This enables use cases like saving snapshots to a database, sending them to a remote service, or implementing custom logging. If no callback is provided, the default behavior of saving to a JSON file remains unchanged.
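
If you only need the snapshot handed to your own code, the callback can be as simple as an in-memory collector. The sketch below is illustrative; the exact shape of the snapshot object passed to the callback is an assumption here (it is treated as any JSON-serializable object):

```python
import json

# Hypothetical sketch: a snapshot_callback that keeps snapshots in memory
# instead of writing JSON files to disk.
collected_snapshots: list[str] = []

def snapshot_to_memory(snapshot) -> None:
    """Store each pipeline snapshot as a JSON string instead of a file."""
    collected_snapshots.append(json.dumps(snapshot))

# With a real pipeline this would be wired up roughly as:
#   pipeline.run(data, snapshot_callback=snapshot_to_memory)
# Here we simulate the pipeline invoking the callback at a breakpoint:
snapshot_to_memory({"component": "retriever", "state": {"top_k": 5}})
```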

  • component_from_dict() and component_to_dict() now work out of the box with custom components that have a ComponentDevice as an attribute. Users no longer need to define to_dict() and from_dict() on their custom components just to call ComponentDevice.from_dict() or device.to_dict(); component_from_dict() and component_to_dict() handle this automatically.

  • component_from_dict() and component_to_dict() now work out of the box with custom components whose init parameters are objects, as long as those objects define to_dict()/from_dict() methods. For example, a custom retriever that takes a DocumentStore as an init parameter no longer needs explicitly defined to_dict()/from_dict() methods; component_from_dict() and component_to_dict() handle such cases automatically.

  • component_from_dict() and component_to_dict() now work out of the box with custom components that have a Secret as an attribute. Users no longer need to define to_dict() and from_dict() just to call deserialize_secrets_inplace() or api_key.to_dict(); component_from_dict() and component_to_dict() handle this automatically.
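
The mechanism behind these three changes can be illustrated with a small sketch (this is not Haystack's actual implementation): a generic serializer can delegate to any attribute that defines its own to_dict, so custom components no longer have to do it by hand.

```python
# Illustrative sketch (not Haystack's actual implementation): a generic
# serializer delegates to any attribute that defines its own to_dict.
def generic_to_dict(obj):
    params = {}
    for name, value in vars(obj).items():
        if hasattr(value, "to_dict"):
            params[name] = value.to_dict()  # delegate to the nested object
        else:
            params[name] = value
    return {"type": type(obj).__name__, "init_parameters": params}

class FakeDocumentStore:
    """Stand-in for a DocumentStore that knows how to serialize itself."""
    def to_dict(self):
        return {"type": "FakeDocumentStore"}

class MyRetriever:
    """A custom component whose init parameter is an object with to_dict."""
    def __init__(self, document_store, top_k=10):
        self.document_store = document_store
        self.top_k = top_k

data = generic_to_dict(MyRetriever(FakeDocumentStore()))
```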

  • Added HAYSTACK_PIPELINE_SNAPSHOT_SAVE_ENABLED environment variable. When set to "true" or "1", pipeline snapshots are saved to disk. Disabled by default. Note: Custom snapshot_callback functions are still invoked regardless of this setting.
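
To opt back in to on-disk snapshots from Python, set the variable before running the pipeline; "true" and "1" both enable it:

```python
import os

# Opt back in to on-disk snapshot saving:
os.environ["HAYSTACK_PIPELINE_SNAPSHOT_SAVE_ENABLED"] = "true"

# This mirrors how such a boolean flag is typically read from the environment:
enabled = os.environ.get("HAYSTACK_PIPELINE_SNAPSHOT_SAVE_ENABLED", "").lower() in ("true", "1")
```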

  • Expanded the ToolCallResult.result field to accept not only strings but also lists of TextContent and ImageContent objects. This enables tools to return images for providers that support this capability. The feature is already available with OpenAIResponsesChatGenerator, and support for additional providers will be added soon. The Chat Completions API does not support this functionality, so it is not available with the classic OpenAIChatGenerator.

  • The outputs_to_string parameter of the Tool class now supports returning raw results without string conversion using the raw_result key. This is intended for tools that return images. ComponentTool and PipelineTool also support this feature.

    Here is an example of how to use it:

    from haystack.components.agents import Agent
    from haystack.components.generators.chat import OpenAIResponsesChatGenerator
    from haystack.dataclasses import ChatMessage, ImageContent, TextContent
    from haystack.tools import create_tool_from_function


    def retrieve_image():
        """Tool to retrieve an image"""
        return [
            TextContent("Here is the retrieved image."),
            ImageContent.from_file_path("test/test_files/images/apple.jpg"),
        ]
    
    
    image_retriever_tool = create_tool_from_function(
        function=retrieve_image, outputs_to_string={"raw_result": True}
    )
    
    agent = Agent(
        chat_generator=OpenAIResponsesChatGenerator(model="gpt-5-nano"),
        system_prompt="You are an Agent that can retrieve images and describe them.",
        tools=[image_retriever_tool],
    )
    
    user_message = ChatMessage.from_user("Retrieve the image and describe it in max 10 words.")
    result = agent.run(messages=[user_message])
    
    print(result["last_message"].text)
    # Red apple with stem resting on straw.

Enhancement Notes

  • Added haystack.component.fully_qualified_type field to component tracing output. This new field provides the full module path and class name (e.g., haystack.components.generators.chat.openai.OpenAIChatGenerator) alongside the existing haystack.component.type field that only contains the class name. This enables dynamic component loading and better tooling integration.
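
A fully qualified type string like this can be resolved back to a class with nothing but the standard library, which is what makes the new field useful for dynamic loading. A minimal sketch, demonstrated here with a stdlib class since the same mechanics apply:

```python
import importlib

def load_component_class(fully_qualified_type: str):
    """Resolve a class from a dotted path such as the one in the new tracing field."""
    module_path, _, class_name = fully_qualified_type.rpartition(".")
    return getattr(importlib.import_module(module_path), class_name)

# With Haystack installed, the same call works for e.g.
# "haystack.components.generators.chat.openai.OpenAIChatGenerator".
cls = load_component_class("collections.OrderedDict")
```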

  • In OpenAIChatGenerator, streaming now handles cases where a ChatCompletionChunk has a delta set to None in choices. This can occur with some OpenAI-compatible providers, and the component will now handle it gracefully.
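
The defensive pattern involved can be sketched with minimal stand-ins for the streaming types (only the fields relevant to this fix are modeled; the real ChatCompletionChunk has more):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Delta:  # stand-in for the streamed delta
    content: Optional[str] = None

@dataclass
class Choice:  # stand-in for a chunk's choice
    delta: Optional[Delta] = None

def chunk_text(choice: Choice) -> str:
    """Treat a None delta (seen with some OpenAI-compatible providers) as empty text."""
    if choice.delta is None:
        return ""
    return choice.delta.content or ""
```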

  • Components no longer handle the (de-)serialization of ComponentDevice explicitly. Instead, the components rely on the behavior implemented in default_to_dict/default_from_dict.

  • Components no longer handle the (de-)serialization of init parameter objects explicitly if the objects define to_dict/from_dict themselves. Instead, the components rely on the behavior implemented in default_to_dict/default_from_dict.

  • Components no longer handle the (de-)serialization of Secrets explicitly. Instead, the components rely on the behavior implemented in default_to_dict/default_from_dict.

  • Support for flattened generation_kwargs in OpenAIResponsesChatGenerator

    The OpenAIResponsesChatGenerator component now supports flattened generation keyword arguments, allowing users to specify reasoning parameters directly without nesting them. This enhancement simplifies the configuration process and improves usability.

    Example:

    from haystack.components.generators.chat import OpenAIResponsesChatGenerator

    generator = OpenAIResponsesChatGenerator(
        model="gpt-4",
        generation_kwargs={
            "reasoning_effort": "low",
            "reasoning_summary": "auto"
        }
    )
  • Support for flattened verbosity in generation_kwargs of OpenAIResponsesChatGenerator

    The OpenAIResponsesChatGenerator component now supports a flattened verbosity generation keyword argument, allowing users to specify verbosity directly without nesting it in text. This enhancement simplifies the configuration process and improves usability.

    Example:

    from haystack.components.generators.chat import OpenAIResponsesChatGenerator

    generator = OpenAIResponsesChatGenerator(
        model="gpt-4",
        generation_kwargs={
            "verbosity": "low",
        }
    )
  • Added the outputs_to_string parameter to create_tool_from_function and the @tool decorator to provide additional customization options for these convenience constructors.

Deprecation Notes

  • deserialize_document_store_in_init_params_inplace is deprecated and will be removed in Haystack 2.24. It is no longer used internally and should not be used in new code; the deserialization of DocumentStores is now handled automatically by default_from_dict.

Bug Fixes

  • Fix ComponentTool, create_tool_from_function, and the @tool decorator failing to create a tool schema when Callable type parameters are present (such as snapshot_callback). This enables using Agent as a ComponentTool without raising SchemaGenerationError.
  • Fixed a bug in OpenAIResponsesChatGenerator where empty reasoning items were discarded during streaming. This caused subsequent requests to the OpenAI Responses API to fail when the message history was sent back, as the API requires every tool call to be preceded by its associated reasoning item. These items are now correctly preserved in the ChatMessage history, even when the summary text is empty.
  • Fix SASEvaluator to work when using numpy>=2.4 by manually squeezing a PyTorch tensor to the correct dimension.
  • Fixed usage info extraction in streaming responses for OpenAI-compatible chat generators. Previously, usage was only extracted from the last chunk, which failed for providers like Qwen3 that return usage in a different chunk. Now all chunks are searched to find the usage information.
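
The corrected search can be sketched with a minimal stand-in for the chunk type (assumption: only the fields relevant to the fix are modeled):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Chunk:  # stand-in for a streamed ChatCompletionChunk
    content: str = ""
    usage: Optional[dict] = None

def find_usage(chunks):
    """Return usage info from whichever chunk carries it, not only the last one."""
    for chunk in chunks:
        if chunk.usage is not None:
            return chunk.usage
    return None

# Some providers (e.g. Qwen3) emit usage before the final chunk:
chunks = [Chunk("Hel"), Chunk("lo", usage={"total_tokens": 7}), Chunk("")]
usage = find_usage(chunks)
```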
  • Fix incorrect type resolution for super component inputs when mapped components use optional types. The compatibility check now keeps the optional part when combining inputs from multiple components, instead of dropping it and producing a non-optional type (for example, turning dict[str, Any] | None into dict[str, Any]). The None is now preserved in the final type of the super component input.
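
The corrected rule can be sketched with the typing module (a simplification: it assumes the non-None parts of both annotations are already compatible, which the real check verifies separately):

```python
from typing import Any, Optional, Union, get_args, get_origin

def merge_keep_optional(a, b):
    """Sketch of the corrected rule: if either side admits None, keep it in the merge."""
    def allows_none(t):
        return get_origin(t) is Union and type(None) in get_args(t)

    def strip_none(t):
        if not allows_none(t):
            return t
        args = tuple(x for x in get_args(t) if x is not type(None))
        return args[0] if len(args) == 1 else Union[args]

    base = strip_none(a)  # assumes a and b share a compatible base type
    return Optional[base] if (allows_none(a) or allows_none(b)) else base

merged = merge_keep_optional(Optional[dict[str, Any]], dict[str, Any])
```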
  • Use the create method instead of the parse method of the OpenAI Responses Python client when text is set but text_format is not. parse requires a text_format type in order to parse the response into that type; with text set but no text_format, a normal, unparsed response should be created.
  • Fixed a bug in Tool serialization. Previously, when outputs_to_string used the multiple-output format, handlers were not serialized correctly.

💙 Big thank you to everyone who contributed to this release!

@agnieszka-m, @anakin87, @bilgeyucel, @chenopis, @GunaPalanivel, @julian-risch, @majiayu000, @mpangrazzi, @sjrl, @tstadel
