strands-agents/sdk-python v1.21.0 on GitHub

Major Features

Custom HTTP Client Support for OpenAI and Gemini - PR#1366

The OpenAI and Gemini model providers now accept a pre-configured client via the client parameter, enabling connection pooling, proxy configuration, custom timeouts, and centralized observability across all model requests. The client is reused for all requests and its lifecycle is managed by the caller, not the model provider.

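from openai import AsyncOpenAI
from strands.models.openai import OpenAIModel
import httpx

# Configure the HTTP transport with a proxy and a 60-second timeout
http_client = httpx.AsyncClient(
    proxy="http://proxy.example.com:8080",
    timeout=60.0
)

# Wrap the transport in an OpenAI client (API key is read from OPENAI_API_KEY).
# Assumption: the client parameter expects an OpenAI-compatible client,
# not a bare httpx client.
custom_client = AsyncOpenAI(http_client=http_client)

model = OpenAIModel(model_id="gpt-4o-mini", client=custom_client)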

Gemini Built-in Tools (Google Search, Code Execution) - PR#1050

Gemini models now support native Google tools like GoogleSearch and CodeExecution through the gemini_tools parameter. These tools integrate directly with Gemini's API without requiring custom function implementations, enabling agents to search the web and execute code natively.

from strands.models.gemini import GeminiModel
from google.genai import types

model = GeminiModel(
    model_id="gemini-2.0-flash-exp",
    gemini_tools=[types.Tool(google_search=types.GoogleSearch())]
)

Hook-based Model Retry on Exceptions - PR#1405

Hooks can now retry model invocations by setting event.retry = True in the AfterModelCallEvent handler, enabling custom retry logic for transient errors, rate limits, or quality checks. This provides fine-grained control over model retry behavior beyond basic exception handling.

import asyncio
import logging

from strands.hooks import AfterModelCallEvent, BeforeInvocationEvent, HookProvider

logger = logging.getLogger(__name__)

class RetryOnServiceUnavailable(HookProvider):
    """Retry model calls that fail with ServiceUnavailable, using exponential backoff."""

    def __init__(self, max_retries=3):
        self.max_retries = max_retries
        self.retry_count = 0

    def register_hooks(self, registry, **kwargs):
        # Reset the counter at the start of each invocation, then watch every model call
        registry.add_callback(BeforeInvocationEvent, self.reset_counts)
        registry.add_callback(AfterModelCallEvent, self.handle_retry)

    def reset_counts(self, event=None):
        self.retry_count = 0

    async def handle_retry(self, event):
        if event.exception:
            if "ServiceUnavailable" in str(event.exception):
                logger.info("ServiceUnavailable encountered")
                count = self.retry_count
                if count < self.max_retries:
                    logger.info("Retrying model call")
                    self.retry_count = count + 1
                    event.retry = True
                    await asyncio.sleep(2 ** count)  # Exponential backoff: 1s, 2s, 4s
        else:
            # Reset the count after a successful call
            self.reset_counts(None)
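
For reference, the delays produced by the `2 ** count` backoff in the hook above can be tabulated with a small standalone sketch. The helper name `backoff_delays` is hypothetical and not part of the SDK; it only mirrors the arithmetic in `handle_retry`.

```python
def backoff_delays(max_retries: int) -> list[int]:
    """Return the sleep (in seconds) before each retry, mirroring `2 ** count`."""
    # count runs 0, 1, ..., max_retries - 1, exactly as retry_count does in the hook
    return [2 ** count for count in range(max_retries)]


delays = backoff_delays(3)
print(delays)       # [1, 2, 4]
print(sum(delays))  # 7
```

With the default `max_retries=3`, a persistently failing call therefore adds at most 7 seconds of waiting before the exception is allowed to propagate.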

Per-Turn Conversation Management - PR#1374

Conversation managers now support mid-execution management via the per_turn parameter, applying conversation window management before each model call rather than only at the end. This prevents context window overflow during multi-turn conversations with tools or long responses.

from strands import Agent
from strands.agent.conversation_manager import SlidingWindowConversationManager

# Enable management before every model call
manager = SlidingWindowConversationManager(
    per_turn=True,
    window_size=40
)

# Or manage every N turns
manager = SlidingWindowConversationManager(
    per_turn=3,  # Manage every 3 model calls
    window_size=40
)

agent = Agent(model=model, conversation_manager=manager)
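
The two forms of `per_turn` can be read as a small decision rule. The sketch below illustrates the behaviour described above and is not the SDK's implementation: `should_manage` and the 1-based `call_number` are hypothetical, and the exact call on which every-N management fires is an assumption.

```python
from typing import Union


def should_manage(per_turn: Union[bool, int], call_number: int) -> bool:
    """Decide whether to trim the conversation before model call `call_number` (1-based)."""
    # bool is a subclass of int in Python, so handle True/False before integers
    if isinstance(per_turn, bool):
        return per_turn
    # Integer form: manage on every N-th model call (assumed to be calls N, 2N, ...)
    return per_turn > 0 and call_number % per_turn == 0


print([n for n in range(1, 8) if should_manage(True, n)])   # [1, 2, 3, 4, 5, 6, 7]
print([n for n in range(1, 8) if should_manage(3, n)])      # [3, 6]
print([n for n in range(1, 8) if should_manage(False, n)])  # []
```

Checking for `bool` first matters because `True == 1` in Python; without it, `per_turn=True` would be indistinguishable from `per_turn=1`.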

Agent Invocation Metrics - PR#1387

Metrics now track per-invocation data through agent_invocations and latest_agent_invocation properties, providing granular insight into each agent call's performance, token usage, and execution time. This enables detailed performance analysis for multi-invocation workflows.

from strands import Agent

agent = Agent(model=model)  # model: any configured model provider
result = agent("Analyze this data")

# Access invocation-level metrics
latest = result.metrics.latest_agent_invocation
print(f"Cycles: {len(latest.cycles)}")
print(f"Tokens: {latest.usage}")

# Access all invocations
for invocation in result.metrics.agent_invocations:
    print(f"Invocation usage: {invocation.usage}")
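
To total token usage across several invocations, the per-invocation `usage` values can be summed key by key. The sketch below uses plain dictionaries as stand-ins so it runs without the SDK; the `inputTokens`, `outputTokens`, and `totalTokens` keys are assumed to match the SDK's usage shape, and `total_usage` is a hypothetical helper.

```python
from collections import Counter
from typing import Iterable, Mapping


def total_usage(usages: Iterable[Mapping[str, int]]) -> dict[str, int]:
    """Sum token counts key by key across a sequence of usage mappings."""
    totals: Counter = Counter()
    for usage in usages:
        totals.update(usage)  # Counter.update adds counts rather than replacing them
    return dict(totals)


# Stand-ins for [invocation.usage for invocation in result.metrics.agent_invocations]
usages = [
    {"inputTokens": 120, "outputTokens": 30, "totalTokens": 150},
    {"inputTokens": 200, "outputTokens": 50, "totalTokens": 250},
]
print(total_usage(usages))
# {'inputTokens': 320, 'outputTokens': 80, 'totalTokens': 400}
```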

ToolRegistry Replace Method - PR#1182

The ToolRegistry now supports replacing existing tools via the replace() method, enabling dynamic tool updates without re-registering all tools. This is particularly useful for hot-reloading tool implementations or updating tools based on runtime conditions.

from strands.tools.registry import ToolRegistry
from strands import tool

registry = ToolRegistry()

@tool
def calculate(x: int, y: int) -> int:
    """Calculate sum."""
    return x + y

registry.register_tool(calculate)

# Later, replace with updated implementation
@tool
def calculate(x: int, y: int) -> int:
    """Calculate product (new implementation)."""
    return x * y

registry.replace(calculate)

Web and Search Result Citations - PR#1344

Citations now support web locations (WebLocation) and search result positions (SearchResultLocation), enabling agents to reference specific URLs and search result indices when grounding responses in retrieved information.

from strands.types.citations import WebLocation, SearchResultLocation

# Create web location citation
web_citation: WebLocation = {
    "url": "https://docs.example.com/api",
    "domain": "docs.example.com"
}

# Create search result citation
search_citation: SearchResultLocation = {
    "searchResultIndex": 0,
    "start": 150,
    "end": 320
}

A2A Server FastAPI/Starlette Customization - PR#1250

A2A servers now support passing additional kwargs to FastAPI and Starlette app constructors via app_kwargs, enabling customization of documentation URLs, debug settings, middleware, and other application-level configuration.

from strands.multiagent.a2a import A2AServer

server = A2AServer(agent)  # agent: an existing strands Agent instance

# Customize FastAPI app
app = server.to_fastapi_app(
    app_kwargs={
        "title": "My Custom Agent API",
        "docs_url": None,  # Disable docs
        "redoc_url": None  # Disable redoc
    }
)

# Customize Starlette app
app = server.to_starlette_app(
    app_kwargs={"debug": True}
)

Major Bug Fixes

Citation Streaming and Union Type Fix - PR#1341

Fixed CitationLocation to properly handle union types and correctly join citation chunks during streaming, resolving issues where citations were malformed or lost in streamed responses.

Usage Metrics Double Counting - PR#1327

Fixed telemetry double counting of usage metrics, ensuring accurate token usage tracking and preventing inflated cost calculations.

Tools Returning Image Content - PR#1079

Fixed OpenAI model provider to properly support tools that return image content, enabling vision-capable tools to work correctly.

Deprecation Warning Timing - PR#1380

Fixed deprecation warnings to only emit when deprecated aliases are actually accessed, eliminating spurious warnings for users not using deprecated features.
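
As an illustration of the warn-on-access behaviour described in the deprecation fix, one common Python pattern is a module-level `__getattr__` (PEP 562). Whether the SDK uses this exact mechanism is not stated in these notes; the module `legacy` and the names `OldName` and `NewName` below are hypothetical.

```python
import types
import warnings

# Build a throwaway module so the sketch is self-contained; in a real package
# the __getattr__ function would simply be defined at module level (PEP 562).
legacy = types.ModuleType("legacy")
legacy.NewName = "current implementation"

_DEPRECATED_ALIASES = {"OldName": "NewName"}


def _module_getattr(name: str):
    """Resolve deprecated aliases lazily, warning only when one is accessed."""
    if name in _DEPRECATED_ALIASES:
        target = _DEPRECATED_ALIASES[name]
        warnings.warn(
            f"{name} is deprecated; use {target} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(legacy, target)
    raise AttributeError(f"module 'legacy' has no attribute {name!r}")


legacy.__getattr__ = _module_getattr

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    _ = legacy.NewName   # normal access: no warning
    before = len(caught)
    _ = legacy.OldName   # deprecated alias: one warning
    after = len(caught)

print(before, after)  # 0 1
```

Because the alias is resolved only inside `__getattr__`, importing the module or using the current name never triggers the warning.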

What's Changed

Add issue-responder action agent by @afarntrog in #1319
feat(a2a): support passing additional keyword arguments to FastAPI and Starlette constructors by @snooyen in #1250
feat(tools): add replace method to ToolRegistry by @Ratish1 in #1182
feat: add meta field support to MCP tool results by @vamgan in #1237
fix: remove unnecessary None from dict.get() calls by @Ratish1 in #956
chore: Expose Status from .base for easier imports by @zastrowm in #1356
fix: CitationLocation is UnionType, and correctly joining citation chunks when streaming is being used by @ericfzhu in #1341
fix(telemetry): prevent double counting of usage metrics by @rajib76 in #1327
feat(citations): Add support for web and search result citations by @danilop in #1344
feat: add gemini_tools field to GeminiModel with validation and tests by @pshiko in #1050
Port PR guidelines from sdk-typescript by @zastrowm in #1373
feat: allow custom-client for OpenAIModel and GeminiModel by @poshinchen in #1366
fix: Pass CODECOV_TOKENS through for code-coverage stats by @zastrowm in #1385
ci: bump actions/checkout from 5 to 6 by @dependabot[bot] in #1222
ci: update pytest-asyncio requirement from <1.3.0,>=1.0.0 to >=1.0.0,<1.4.0 by @dependabot[bot] in #1166
ci: bump actions/upload-artifact from 4 to 6 by @dependabot[bot] in #1332
ci: bump actions/download-artifact from 5 to 7 by @dependabot[bot] in #1333
ci: update pre-commit requirement from <4.4.0,>=3.2.0 to >=3.2.0,<4.6.0 by @dependabot[bot] in #1242
feat: add api check to github workflow by @JackYPCOnline in #1348
ci: bump aws-actions/configure-aws-credentials from 4 to 5 by @dependabot[bot] in #1352
ci: update ruff requirement from <0.14.0,>=0.13.0 to >=0.13.0,<0.15.0 by @dependabot[bot] in #1004
feat: add per_turn parameter to SlidingWindowConversationManager by @zastrowm in #1374
fix: check api breaking change against main by @JackYPCOnline in #1397
ci: bump astral-sh/setup-uv from 6 to 7 by @dependabot[bot] in #1390
fix(openai): support tools returning image content by @Ratish1 in #1079
feat: added agent_invocations by @poshinchen in #1387
ci: bump actions/checkout from 5 to 6 by @dependabot[bot] in #1389
Port TypeScript agents into Python by @zastrowm in #1403
feat: allow hooks to retry model invocations on exceptions by @zastrowm in #1405
fix: emit deprecation warning only when deprecated aliases are accessed by @jsamuel1 in #1380

New Contributors

@snooyen made their first contribution in #1250
@ericfzhu made their first contribution in #1341
@rajib76 made their first contribution in #1327
@danilop made their first contribution in #1344
@pshiko made their first contribution in #1050
@jsamuel1 made their first contribution in #1380

Full Changelog: v1.20.0...v1.21.0