github strands-agents/sdk-python v1.21.0

4 days ago

Major Features

Custom HTTP Client Support for OpenAI and Gemini - PR#1366

The OpenAI and Gemini model providers now accept a pre-configured client via the client parameter, enabling connection pooling, proxy configuration, custom timeouts, and centralized observability across all model requests. The client is reused for all requests and its lifecycle is managed by the caller, not the model provider.

from strands.models.openai import OpenAIModel
import httpx

# Create custom client with proxy and timeout configuration
custom_client = httpx.AsyncClient(
    proxy="http://proxy.example.com:8080",
    timeout=60.0
)

model = OpenAIModel(model_id="gpt-4o-mini", client=custom_client)

Gemini Built-in Tools (Google Search, Code Execution) - PR#1050

Gemini models now support native Google tools like GoogleSearch and CodeExecution through the gemini_tools parameter. These tools integrate directly with Gemini's API without requiring custom function implementations, enabling agents to search the web and execute code natively.

from strands.models.gemini import GeminiModel
from google.genai import types

model = GeminiModel(
    model_id="gemini-2.0-flash-exp",
    gemini_tools=[types.Tool(google_search=types.GoogleSearch())]
)

Hook-based Model Retry on Exceptions - PR#1405

Hooks can now retry model invocations by setting event.retry = True in the AfterModelCallEvent handler, enabling custom retry logic for transient errors, rate limits, or quality checks. This provides fine-grained control over model retry behavior beyond basic exception handling.

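import asyncio
import logging

from strands.hooks import AfterModelCallEvent, BeforeInvocationEvent, HookProvider

logger = logging.getLogger(__name__)

class RetryOnServiceUnavailable(HookProvider):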
    def __init__(self, max_retries=3):
        self.max_retries = max_retries
        self.retry_count = 0

    def register_hooks(self, registry, **kwargs):
        registry.add_callback(BeforeInvocationEvent, self.reset_counts)
        registry.add_callback(AfterModelCallEvent, self.handle_retry)

    def reset_counts(self, event=None):
        self.retry_count = 0

    async def handle_retry(self, event):
        if event.exception:
            if "ServiceUnavailable" in str(event.exception):
                logger.info("ServiceUnavailable encountered")
                count = self.retry_count
                if count < self.max_retries:
                    logger.info("Retrying model call")
                    self.retry_count = count + 1
                    event.retry = True
                    await asyncio.sleep(2 ** count)  # Exponential backoff
        else:
            # Reset the counter after a successful call
            self.reset_counts(None)
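
To make the backoff in the hook above concrete, the following sketch lists the delays it would sleep for with the default max_retries=3. backoff_delays is a hypothetical helper written for illustration only; it is not part of the SDK.

```python
def backoff_delays(max_retries: int = 3, base: float = 2.0) -> list[float]:
    """Return the sleep durations (in seconds) taken before each retry.

    Mirrors `2 ** count` from the hook, where count runs from 0 to max_retries - 1.
    """
    return [base ** count for count in range(max_retries)]


delays = backoff_delays()
print(delays)       # [1.0, 2.0, 4.0]
print(sum(delays))  # 7.0 seconds of added wait if every retry is used
```

Because the wait doubles on each attempt, raising max_retries increases the worst-case added latency quickly, which is worth keeping in mind when choosing the limit.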

Per-Turn Conversation Management - PR#1374

Conversation managers now support mid-execution management via the per_turn parameter, applying conversation window management before each model call rather than only at the end. This prevents context window overflow during multi-turn conversations with tools or long responses.

from strands import Agent
from strands.agent.conversation_manager import SlidingWindowConversationManager

# Enable management before every model call
manager = SlidingWindowConversationManager(
    per_turn=True,
    window_size=40
)

# Or manage every N turns
manager = SlidingWindowConversationManager(
    per_turn=3,  # Manage every 3 model calls
    window_size=40
)

agent = Agent(model=model, conversation_manager=manager)
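
One way to read the two per_turn forms is as a schedule over model calls. The sketch below is a toy model of that schedule, not the SDK's code: should_manage is a hypothetical helper, and it assumes that calls are counted from 1 and that an integer N triggers management on every Nth call.

```python
from typing import Union


def should_manage(per_turn: Union[bool, int], call_number: int) -> bool:
    """Decide whether window management runs before a given model call.

    Assumption: call_number starts at 1, and per_turn=N means every Nth call.
    """
    # bool is a subclass of int, so it has to be checked first
    if isinstance(per_turn, bool):
        return per_turn
    # Guard against N <= 0 before taking the modulo
    return per_turn > 0 and call_number % per_turn == 0


calls = range(1, 7)
print([n for n in calls if should_manage(True, n)])   # [1, 2, 3, 4, 5, 6]
print([n for n in calls if should_manage(3, n)])      # [3, 6]
print([n for n in calls if should_manage(False, n)])  # []
```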

Agent Invocation Metrics - PR#1387

Metrics now track per-invocation data through agent_invocations and latest_agent_invocation properties, providing granular insight into each agent call's performance, token usage, and execution time. This enables detailed performance analysis for multi-invocation workflows.

from strands import Agent

agent = Agent(model=model)
result = agent("Analyze this data")

# Access invocation-level metrics
latest = result.metrics.latest_agent_invocation
print(f"Cycles: {len(latest.cycles)}")
print(f"Tokens: {latest.usage}")

# Access all invocations
for invocation in result.metrics.agent_invocations:
    print(f"Invocation usage: {invocation.usage}")
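
For multi-invocation workflows, per-invocation usage can be rolled up into totals. This sketch uses plain dictionaries as stand-ins for invocation.usage; the key names (inputTokens, outputTokens, totalTokens) and the numbers are assumptions for illustration, so check the shape your SDK version actually returns.

```python
from collections import Counter


def total_usage(usages: list[dict[str, int]]) -> dict[str, int]:
    """Sum token counts key by key across a list of usage dictionaries."""
    totals = Counter()
    for usage in usages:
        totals.update(usage)  # Counter.update adds values rather than replacing them
    return dict(totals)


# Stand-ins for [inv.usage for inv in result.metrics.agent_invocations]
sample = [
    {"inputTokens": 120, "outputTokens": 30, "totalTokens": 150},
    {"inputTokens": 200, "outputTokens": 50, "totalTokens": 250},
]
print(total_usage(sample))
# {'inputTokens': 320, 'outputTokens': 80, 'totalTokens': 400}
```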

ToolRegistry Replace Method - PR#1182

The ToolRegistry now supports replacing existing tools via the replace() method, enabling dynamic tool updates without re-registering all tools. This is particularly useful for hot-reloading tool implementations or updating tools based on runtime conditions.

from strands.tools.registry import ToolRegistry
from strands import tool

registry = ToolRegistry()

@tool
def calculate(x: int, y: int) -> int:
    """Calculate sum."""
    return x + y

registry.register_tool(calculate)

# Later, replace with updated implementation
@tool
def calculate(x: int, y: int) -> int:
    """Calculate product (new implementation)."""
    return x * y

registry.replace(calculate)
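
The difference between registering and replacing can be illustrated with a toy registry. This is a sketch of the general pattern, not the SDK's implementation: ToyRegistry is hypothetical, and raising KeyError for an unknown name is an assumption about how a replace operation might reasonably behave.

```python
from typing import Callable, Dict


class ToyRegistry:
    """Minimal name-to-callable registry, used only to illustrate replace semantics."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., int]] = {}

    def register(self, name: str, fn: Callable[..., int]) -> None:
        self._tools[name] = fn

    def replace(self, name: str, fn: Callable[..., int]) -> None:
        # Assumption: replacing a name that was never registered is an error
        if name not in self._tools:
            raise KeyError(f"no tool named {name!r} to replace")
        self._tools[name] = fn

    def call(self, name: str, *args: int) -> int:
        return self._tools[name](*args)


toy = ToyRegistry()
toy.register("calculate", lambda x, y: x + y)
print(toy.call("calculate", 3, 4))  # 7

# Swap the implementation without touching any other registered tool
toy.replace("calculate", lambda x, y: x * y)
print(toy.call("calculate", 3, 4))  # 12
```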

Web and Search Result Citations - PR#1344

Citations now support web locations (WebLocation) and search result positions (SearchResultLocation), enabling agents to reference specific URLs and search result indices when grounding responses in retrieved information.

from strands.types.citations import WebLocation, SearchResultLocation

# Create web location citation
web_citation: WebLocation = {
    "url": "https://docs.example.com/api",
    "domain": "docs.example.com"
}

# Create search result citation
search_citation: SearchResultLocation = {
    "searchResultIndex": 0,
    "start": 150,
    "end": 320
}

A2A Server FastAPI/Starlette Customization - PR#1250

A2A servers now support passing additional kwargs to FastAPI and Starlette app constructors via app_kwargs, enabling customization of documentation URLs, debug settings, middleware, and other application-level configuration.

from strands.multiagent.a2a import A2AServer

server = A2AServer(agent)

# Customize FastAPI app
app = server.to_fastapi_app(
    app_kwargs={
        "title": "My Custom Agent API",
        "docs_url": None,  # Disable docs
        "redoc_url": None  # Disable redoc
    }
)

# Customize Starlette app
app = server.to_starlette_app(
    app_kwargs={"debug": True}
)

Major Bug Fixes

  • Citation Streaming and Union Type Fix - PR#1341
    Fixed CitationLocation to properly handle union types and correctly join citation chunks during streaming, resolving issues where citations were malformed or lost in streamed responses.

  • Usage Metrics Double Counting - PR#1327
    Fixed telemetry double counting of usage metrics, ensuring accurate token usage tracking and preventing inflated cost calculations.

  • Tools Returning Image Content - PR#1079
    Fixed OpenAI model provider to properly support tools that return image content, enabling vision-capable tools to work correctly.

  • Deprecation Warning Timing - PR#1380
    Fixed deprecation warnings to only emit when deprecated aliases are actually accessed, eliminating spurious warnings for users not using deprecated features.


What's Changed

  • Add issue-responder action agent by @afarntrog in #1319
  • feat(a2a): support passing additional keyword arguments to FastAPI and Starlette constructors by @snooyen in #1250
  • feat(tools): add replace method to ToolRegistry by @Ratish1 in #1182
  • feat: add meta field support to MCP tool results by @vamgan in #1237
  • fix: remove unnecessary None from dict.get() calls by @Ratish1 in #956
  • chore: Expose Status from .base for easier imports by @zastrowm in #1356
  • fix: CitationLocation is UnionType, and correctly joining citation chunks when streaming is being used by @ericfzhu in #1341
  • fix(telemetry): prevent double counting of usage metrics by @rajib76 in #1327
  • feat(citations): Add support for web and search result citations by @danilop in #1344
  • feat: add gemini_tools field to GeminiModel with validation and tests by @pshiko in #1050
  • Port PR guidelines from sdk-typescript by @zastrowm in #1373
  • feat: allow custom-client for OpenAIModel and GeminiModel by @poshinchen in #1366
  • fix: Pass CODECOV_TOKENS through for code-coverage stats by @zastrowm in #1385
  • ci: bump actions/checkout from 5 to 6 by @dependabot[bot] in #1222
  • ci: update pytest-asyncio requirement from <1.3.0,>=1.0.0 to >=1.0.0,<1.4.0 by @dependabot[bot] in #1166
  • ci: bump actions/upload-artifact from 4 to 6 by @dependabot[bot] in #1332
  • ci: bump actions/download-artifact from 5 to 7 by @dependabot[bot] in #1333
  • ci: update pre-commit requirement from <4.4.0,>=3.2.0 to >=3.2.0,<4.6.0 by @dependabot[bot] in #1242
  • feat: add api check to github workflow by @JackYPCOnline in #1348
  • ci: bump aws-actions/configure-aws-credentials from 4 to 5 by @dependabot[bot] in #1352
  • ci: update ruff requirement from <0.14.0,>=0.13.0 to >=0.13.0,<0.15.0 by @dependabot[bot] in #1004
  • feat: add per_turn parameter to SlidingWindowConversationManager by @zastrowm in #1374
  • fix: check api breaking change against main by @JackYPCOnline in #1397
  • ci: bump astral-sh/setup-uv from 6 to 7 by @dependabot[bot] in #1390
  • fix(openai): support tools returning image content by @Ratish1 in #1079
  • feat: added agent_invocations by @poshinchen in #1387
  • ci: bump actions/checkout from 5 to 6 by @dependabot[bot] in #1389
  • Port TypeScript agents into Python by @zastrowm in #1403
  • feat: allow hooks to retry model invocations on exceptions by @zastrowm in #1405
  • fix: emit deprecation warning only when deprecated aliases are accessed by @jsamuel1 in #1380

Full Changelog: v1.20.0...v1.21.0
