What's Changed
Sandbox Agents
This release adds Sandbox Agents, a beta SDK surface for running agents with a persistent, isolated workspace. Sandbox agents keep the normal Agent and Runner flow, but add workspace manifests, sandbox-native capabilities, sandbox clients, snapshots, and resume support so agents can work over real files, run commands, edit repositories, generate artifacts, and continue work across runs.
Key pieces:
- SandboxAgent: an Agent with sandbox defaults such as default_manifest, sandbox instructions, capabilities, and run_as.
- Manifest: a fresh-workspace contract for files, directories, local files, local directories, Git repos, environment, users, groups, and mounts.
- SandboxRunConfig: per-run sandbox wiring for client creation, live session injection, serialized session resume, manifest overrides, snapshots, and materialization concurrency limits.
- Built-in capabilities for shell access, filesystem editing and image inspection, skills, memory, and compaction.
- Workspace snapshots and serialized sandbox session state for reconnecting to existing work or seeding a fresh sandbox from saved contents.
Sandbox clients and hosted providers
Sandbox agents now support local, containerized, and hosted execution backends:
- UnixLocalSandboxClient for fast local development.
- DockerSandboxClient for container isolation and image parity.
- Hosted sandbox clients for Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel through optional extras.
The release also adds provider-specific examples and mount strategies for common storage backends, including S3, Cloudflare R2, Google Cloud Storage, Azure Blob Storage, and S3 Files where supported by the selected backend.
Sandbox memory
Adds a sandbox memory capability that lets future sandbox-agent runs learn from prior runs. Memory stores extracted lessons in the sandbox workspace, injects a concise summary into later runs, and uses progressive disclosure so agents can search deeper rollout summaries only when useful.
Memory supports:
- Read-only or generate-only modes.
- Live updates when the agent discovers stale memory.
- Multi-turn grouping through conversation_id, SDK Session, RunConfig.group_id, or generated run IDs.
- Separate memory layouts for isolating memory across agents or workflows.
- S3-backed examples for persisted memory across runs.
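
The summary-then-search flow described in this section can be pictured with a small stand-in. This is a minimal sketch, not the SDK's memory implementation: MemoryStore, record, summary, and search are hypothetical names, and the grouping keys are plain strings standing in for whichever conversation or run ID applies.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Hypothetical stand-in for sandbox memory; not the SDK's implementation."""

    # Maps a grouping key (for example a conversation ID) to its lessons.
    lessons: dict[str, list[str]] = field(default_factory=dict)

    def record(self, group: str, lesson: str) -> None:
        """Store one extracted lesson under its grouping key."""
        self.lessons.setdefault(group, []).append(lesson)

    def summary(self, max_items: int = 2) -> str:
        """Concise text that would be injected at the start of a later run."""
        latest = [items[-1] for items in self.lessons.values() if items]
        return "\n".join(f"- {text}" for text in latest[-max_items:])

    def search(self, query: str) -> list[str]:
        """Deeper lookup the agent calls only when the summary is not enough."""
        needle = query.lower()
        return [
            text
            for items in self.lessons.values()
            for text in items
            if needle in text.lower()
        ]


store = MemoryStore()
store.record("run-1", "Tests need the DEBUG flag set.")
store.record("run-1", "Use make lint before committing.")
store.record("run-2", "The staging bucket is read-only.")

print(store.summary())       # short text for the next run's prompt
print(store.search("lint"))  # on-demand detail
```

The point of the split is cost: the summary stays small on every run, while the full set of lessons is only read when the agent decides it needs more detail.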
Workspace mounts, snapshots, and resume
This release adds a full workspace entry and mount model for sandbox sessions:
- Local files and directories.
- Synthetic files and directories.
- Git repository entries.
- Remote storage mounts for S3, R2, GCS, Azure Blob Storage, and S3 Files.
- Provider-specific mount strategies across Docker, Modal, Cloudflare, Blaxel, Daytona, E2B, and Runloop.
- Portable snapshots with path normalization, symlink preservation, mount-safe snapshotting, and remote snapshot support.
- Resume paths through runner-managed RunState, explicit SandboxSessionState, or saved snapshots.
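
As a rough illustration of what a portable snapshot involves, the sketch below archives a workspace using root-relative POSIX paths and keeps symlinks as links. It relies only on Python's standard tarfile module. snapshot_workspace is a hypothetical helper, not the SDK's snapshot format or API, and the demo assumes a POSIX host where symlinks can be created.

```python
import os
import tarfile
import tempfile
from pathlib import Path


def snapshot_workspace(root: Path, archive: Path) -> list[str]:
    """Archive root with relative POSIX member names, keeping symlinks as links."""
    names = []
    # dereference=False stores symlinks as links instead of copying their targets.
    with tarfile.open(archive, "w:gz", dereference=False) as tar:
        for path in sorted(root.rglob("*")):
            # Normalize to a root-relative, forward-slash name so the archive
            # can be restored under a different workspace location.
            name = path.relative_to(root).as_posix()
            tar.add(path, arcname=name, recursive=False)
            names.append(name)
    return names


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "workspace"
    (root / "src").mkdir(parents=True)
    (root / "src" / "main.py").write_text("print('hi')\n")
    os.symlink("src/main.py", root / "entry.py")  # relative link stays portable

    archive = Path(tmp) / "snapshot.tar.gz"
    members = snapshot_workspace(root, archive)

    with tarfile.open(archive) as tar:
        link_is_preserved = tar.getmember("entry.py").issym()

print(members)
print(link_is_preserved)
```

Storing relative names is what lets the same archive seed a fresh sandbox whose workspace lives at a different absolute path.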
Examples and tutorials
Adds a large examples/sandbox/ suite covering:
- Local Unix and Docker sandbox runners.
- Docker mount smoke tests for S3, GCS, Azure Blob Storage, and S3 Files.
- Sandbox coding tasks with skills.
- Sandbox agents as tools and handoff patterns.
- Memory examples, including multi-agent/multi-turn memory and S3-backed memory.
- Tax-prep and healthcare-support workflows.
- Dataroom QA and metric extraction tutorials.
- Repository code review tutorial.
- Vision website clone tutorial.
- Provider examples for Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, Temporal, and Vercel.
Runtime, tracing, and model plumbing
The release includes the runtime plumbing needed to make sandbox agents work naturally inside the existing SDK:
- Runner-managed sandbox preparation, capability binding, session lifecycle, state serialization, and resume behavior.
- Sandbox-aware RunState serialization.
- Unified sandbox tracing with SDK spans.
- Token usage on tracing spans.
- Runner-managed prompt cache key defaults.
- OpenAI agent registration and harness ID configuration.
- Safer redaction of sensitive MCP tool outputs when sensitive tracing is disabled.
- Additional OpenAI client/model utilities and Chat Completions coverage.
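
The redaction item in the list above comes down to one rule: when sensitive tracing is off, the span records that a tool ran but not what it returned. The sketch below shows that rule in isolation. span_payload, the REDACTED placeholder, and the include_sensitive_data flag are hypothetical names for illustration, not the SDK's internals.

```python
from typing import Any

REDACTED = "<redacted>"  # hypothetical placeholder text


def span_payload(tool_name: str, output: Any, include_sensitive_data: bool) -> dict[str, Any]:
    """Build the data recorded on a tracing span for one tool call."""
    return {
        "tool": tool_name,
        # Keep the tool name for debugging, but drop the output unless
        # sensitive data is explicitly allowed in traces.
        "output": output if include_sensitive_data else REDACTED,
    }


safe = span_payload("lookup_patient", {"ssn": "000-00-0000"}, include_sensitive_data=False)
full = span_payload("lookup_patient", {"ssn": "000-00-0000"}, include_sensitive_data=True)
print(safe)
print(full)
```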
Documentation & Other Changes
- docs: add Asqav to external tracing processors list.
- docs: update translated document pages.