github kubernetes-sigs/agent-sandbox v0.5.0

🚀 Announcing Agent Sandbox v0.5.0!

We're excited to announce the release of Agent Sandbox v0.5.0! This release graduates the core and extension APIs to v1beta1 and brings improved stability, security hardening, and new features across the platform, client SDKs, and examples.

⚠️ Breaking Changes / Action Required

  • API Group Upgrade and Deprecation (v1alpha1 to v1beta1):
    • The core and extension APIs (agents.x-k8s.io and extensions.agents.x-k8s.io) have been officially graduated from v1alpha1 to v1beta1.
    • v1alpha1 APIs are now deprecated. Multi-version CRD support with a conversion webhook provides v1alpha1 compatibility during migration, but users are strongly encouraged to migrate their v1alpha1 resources to v1beta1.
    • Action Required: Update your manifests and API interactions to use apiVersion: agents.x-k8s.io/v1beta1 and apiVersion: extensions.agents.x-k8s.io/v1beta1. Refer to the API Migration Guide for detailed steps.
  • Sandbox spec.replicas Removed, spec.operatingMode Introduced:
    • The spec.replicas field has been removed from the Sandbox API and replaced with spec.operatingMode (with values Running and Suspended).
    • This is a breaking change for any automation or tools that relied on spec.replicas for scaling (e.g., kubectl scale, HorizontalPodAutoscalers, PodDisruptionBudgets).
    • Action Required: Update your Sandbox manifests and any scaling logic to use spec.operatingMode for managing Sandbox lifecycle.
  • SandboxClaim spec.templateRef Replaced by spec.warmpoolRef:
    • The SandboxClaim API no longer uses spec.templateRef or the warmpool policy field. Instead, claims must explicitly point to a SandboxWarmPool using spec.warmpoolRef.
    • To achieve a cold start without pre-warming, cluster administrators should create a SandboxWarmPool with replicas: 0 for users to reference.
    • Action Required: Update SandboxClaim manifests to reference spec.warmpoolRef pointing to an existing SandboxWarmPool resource.
  • NetworkPolicy Namespace Restriction for sandbox-router:
    • The default NetworkPolicy generated by the SandboxTemplate controller now strictly scopes ingress rules to the agent-sandbox-system namespace for the sandbox-router.
    • Action Required: If you run the sandbox-router in a namespace other than agent-sandbox-system, you must redeploy it inside agent-sandbox-system before, or at the same time as, upgrading the controller to avoid service interruption.
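
The first two manifest changes above are mechanical, so they can be scripted. The following is a minimal sketch, not the project's migration tooling: the function names are hypothetical, and the mapping from the old replicas value to an operatingMode (0 becomes Suspended, anything else becomes Running) is an assumption that should be checked against the API Migration Guide. The templateRef change is only flagged, because choosing the right SandboxWarmPool needs a human decision.

```python
import copy

# API groups and version strings as named in these release notes.
MIGRATED_GROUPS = ("agents.x-k8s.io", "extensions.agents.x-k8s.io")


def migrate_manifest(manifest: dict) -> dict:
    """Return a copy of a manifest (parsed into a dict) rewritten for v1beta1."""
    out = copy.deepcopy(manifest)

    # Bump apiVersion only for the two groups that graduated.
    group, _, version = out.get("apiVersion", "").partition("/")
    if group in MIGRATED_GROUPS and version == "v1alpha1":
        out["apiVersion"] = f"{group}/v1beta1"

    spec = out.get("spec")
    if out.get("kind") == "Sandbox" and isinstance(spec, dict) and "replicas" in spec:
        # ASSUMPTION: replicas == 0 expressed "suspended", any other value "running".
        replicas = spec.pop("replicas")
        spec.setdefault("operatingMode", "Suspended" if replicas == 0 else "Running")

    return out


def pending_manual_steps(manifest: dict) -> list:
    """List the changes this sketch deliberately does not automate."""
    steps = []
    spec = manifest.get("spec")
    if manifest.get("kind") == "SandboxClaim" and isinstance(spec, dict):
        if "templateRef" in spec:
            steps.append("replace spec.templateRef with spec.warmpoolRef")
    return steps
```

Running migrate_manifest over each parsed document and then printing pending_manual_steps gives a quick list of the claims that still need a warm pool reference.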

Key Highlights

Core API & Platform Stability

  • API Graduation & Multi-Version Support: Official graduation of core and extension APIs to v1beta1, including multi-version CRD support with conversion webhooks for v1alpha1 compatibility during migration (#817, #993).
  • Sandbox Lifecycle Management: Replaced spec.replicas with spec.operatingMode for more explicit control over Sandbox suspension and resume behavior (#801).
  • SandboxClaim Enhancements: SandboxClaim now uses spec.warmpoolRef for clearer warm pool association and gained printer columns for improved kubectl get visibility (#899, #984).
  • Optimized Warm Pool Operations: Enabled parallel creation and deletion of sandboxes within the SandboxWarmPool controller, significantly speeding up scale operations (#798).
  • Improved Warm Pool Selection Strategy: Implemented a warm pool selection strategy that prioritizes ready sandboxes, spreads workloads across nodes, and performs selection in memory to reduce API overhead (#878, #939).
  • Resource Adoption & Persistence: Fixed orphan adoption for Sandbox child resources and introduced explicit authorization for unowned resources to prevent hijacking (#944, #784).
  • Performance Improvements: Switched SandboxClaim status updates to patching (.Patch()) to reduce conflicts at scale, improving overall system performance (#508).
  • Helm Chart Enhancements: Added support for podSecurityContext, containerSecurityContext, podAnnotations, and podLabels in the controller Helm chart for better Kubernetes policy compliance and custom metadata injection (#753, #750).
  • Storage Configuration via SandboxClaim: Introduced support for volume claim templates within SandboxClaims, enabling customized persistent volumes with policy-driven merging (#960).
  • Warmpool Label Propagation: Enhanced warmpool label propagation from sandbox to pod, ensuring consistent identification across resources (#927).
  • Preserve Zero Replica Counts: Fixed an issue where zero replica counts in warmpool status were not preserved during server-side apply operations (#807).
  • Assigned Sandbox Name Storage: Switched to storing assigned Sandbox names in annotations instead of labels to bypass Kubernetes length constraints (#771).
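
To make the warm pool selection idea above concrete, here is a small illustrative sketch. It is not the controller's implementation (the controller is not written in Python, and its real ordering rules may differ); the Candidate type, the pick_sandbox function, and the tie-breaking by name are all hypothetical. It only shows how "prefer ready sandboxes, then spread across nodes" can be expressed as a single sort key over an in-memory list, with no extra API calls.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical in-memory view of a warm sandbox."""
    name: str
    node: str
    ready: bool


def pick_sandbox(candidates: list, assigned_per_node: Counter) -> Optional[Candidate]:
    """Pick a sandbox: ready ones first, then the least-loaded node, then by name."""
    if not candidates:
        return None
    return min(
        candidates,
        # False sorts before True, so ready sandboxes win; Counter yields 0 for unseen nodes.
        key=lambda c: (not c.ready, assigned_per_node[c.node], c.name),
    )
```

For example, with two ready sandboxes on different nodes, the one whose node currently hosts fewer assigned sandboxes is chosen; a sandbox that is not ready is only returned when no ready one exists.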

Security & Hardening

  • SSRF Protections: Disabled automatic HTTP redirects in both Go and Python SDKs to prevent Server-Side Request Forgery (SSRF) vulnerabilities from untrusted sandbox workloads (#874, #816).
  • Router Security: Addressed an unauthenticated internal proxy vulnerability in the sandbox router with strict input validation and optional bearer token authentication (#755).
  • Network Policy Enhancements: Default NetworkPolicy now blocks IPv6 link-local traffic and strictly scopes ingress to the agent-sandbox-system namespace for the sandbox-router for enhanced isolation (#827, #881).
  • Build-time Injection Prevention: Sanitized git-derived version strings to prevent build-time command injection vulnerabilities (#946).
  • ReDoS Fix: Replaced a vulnerable regex matching function with an iterative dynamic programming approach, resolving a regular expression denial of service (ReDoS) vulnerability (#935).
  • Pod Metadata Protection: Protected system-reserved Pod labels and annotations from tenant override to prevent traffic hijacking or tracking label forging (#894).
  • Warm Pool Poisoning Prevention: The isAdoptable function now explicitly rejects unowned sandboxes to prevent warm pool poisoning (#875).
  • OpenTelemetry Trace Sanitization: Sanitized the sandbox.command attribute in OpenTelemetry traces to prevent sensitive data exposure (#895).
  • CLI Tool Hardening: Fixed concurrency race conditions and stale PID cleanup issues in the resourcectl CLI utility, preventing data loss and arbitrary process termination (#934, #902).
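
As background on the ReDoS item above: backtracking matchers can take exponential time on crafted input, while an iterative dynamic programming matcher has a fixed polynomial bound. The sketch below illustrates that technique only. The release notes do not say which pattern syntax the fixed function handles, so glob-style patterns ('*' for any run of characters, '?' for one character) are an assumption here, and glob_match is a hypothetical name.

```python
def glob_match(pattern: str, text: str) -> bool:
    """Match text against a glob pattern in O(len(pattern) * len(text)) time."""
    n = len(text)
    # prev[j] is True when the pattern processed so far matches text[:j].
    prev = [True] + [False] * n
    for p in pattern:
        cur = [False] * (n + 1)
        if p == "*":
            # '*' matches the empty string (prev[j]) or absorbs one more character (cur[j - 1]).
            cur[0] = prev[0]
            for j in range(1, n + 1):
                cur[j] = prev[j] or cur[j - 1]
        else:
            # A literal or '?' must consume exactly one character of text.
            for j in range(1, n + 1):
                cur[j] = prev[j - 1] and (p == "?" or p == text[j - 1])
        prev = cur
    return prev[n]
```

Because every pattern character does one pass over the text, an adversarial pattern such as many repeated "*a" groups costs no more than any other pattern of the same length.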

Client SDK & Developer Experience

  • Dynamic Timeout Propagation: SDKs now support dynamic timeout propagation to the sandbox router, ensuring long-running operations are not prematurely terminated (#857).
  • Python Async Client Cleanup: Added cleanup=True support to AsyncSandboxClient for automatic resource cleanup on program termination (#859).
  • Python additionalPodMetadata Exposure: Exposed additionalPodMetadata in the Python client for direct control over Sandbox Pod labels and annotations (#979).
  • Go Client PodIP Routing: Enabled PodIP routing in the Go client to resolve connection issues when Kubernetes DNS is unavailable for sandbox services (#910).
  • Sandbox Client Improvements: Hardened filesystem path sanitization, improved label selectors, and enabled template-verified reattachment in the Python SDK (#695).
  • PSS SDK Enhancements: Enabled restoration from dedicated snapshots and filtering by creation timestamp for the Python Snapshot SDK (#799, #732).
  • CI/CD & Tooling: Optimized CI staging builds, increased promotion timeouts, and updated pyyaml dependency for CRD sorting during release publish (#1021). Improved AI code review configuration and guidelines for Copilot and CodeRabbit (#938, #936, #947, #866).
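
The filesystem path sanitization mentioned above guards against paths that climb out of the intended directory. The following is a generic sketch of that kind of check and not the SDK's code: resolve_in_sandbox and the /workspace root are hypothetical, and the check is purely lexical, so it does not resolve symlinks inside the sandbox.

```python
import posixpath


def resolve_in_sandbox(root: str, user_path: str) -> str:
    """Join user_path onto root and reject results that fall outside root."""
    if "\x00" in user_path:
        raise ValueError("path contains a NUL byte")
    root = posixpath.normpath(root)
    # An absolute user_path would make join() discard root, so strip leading slashes.
    candidate = posixpath.normpath(posixpath.join(root, user_path.lstrip("/")))
    # Compare against "root/" so that a sibling like /workspace-evil is not accepted.
    if candidate != root and not candidate.startswith(root.rstrip("/") + "/"):
        raise ValueError(f"path escapes sandbox root: {user_path!r}")
    return candidate
```

With a root of /workspace, the input a/../b.txt resolves to /workspace/b.txt, while ../etc/passwd raises ValueError.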

Examples & Documentation

  • RL & Evals Example: Introduced agent-sandbox-rl, a complete Python package for multi-cluster warm-pool orchestration of RL and Evals workloads (#1000).
  • Anthropic Agents Example: Added an example for running Anthropic Managed Agents self-hosted sandboxes on GKE Agent Sandbox (#950).
  • Sandboxed Tools Enhancements: Improved sandboxed-tools examples to persist sessions and filesystem state across multiple tool calls and refactored tools into their own package (#888, #887, #877, #886).
  • MCP Server Example: Provided an example for running an MCP server inside a sandbox with persistent storage (#937).
  • AKS Kata Container Example: Added an AKS example demonstrating Kata Containers with sandbox warm pools (#839).
  • Ray Integration: Documented an example on how to run a RayJob with Agent Sandbox via direct PodIP (#868, #742).
  • Comprehensive Troubleshooting Guide: Added a detailed troubleshooting guide for debugging SDK, custom image, and cluster-level issues (#660).
  • API & NetworkPolicy Documentation: Updated documentation to reflect v1beta1 API changes, clarified NodeLocal DNS walkthrough, and expanded NetworkPolicy guidance (#867, #823, #815).
  • Issue Templates: Added structured GitHub issue templates for bug reports, feature requests, and epics, and improved their ordering (#880, #891).

Installation

Core & Extensions

# To install only the core components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.5.0/manifest.yaml

# To install the extensions components:
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.5.0/extensions.yaml

To upgrade from v0.4.6 to v0.5.0, follow the detailed steps in the API Migration Guide.

Python SDK

pip install k8s-agent-sandbox==0.5.0
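
After installing, it can be useful to confirm which SDK version the current environment actually resolves. This is a small sketch using only the Python standard library; installed_version is a hypothetical helper name, and the distribution name is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "k8s-agent-sandbox") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None
```

A deployment script could compare installed_version() with "0.5.0" and stop early when they differ.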

Contributors

We extend our sincere thanks to all contributors to this release:
@AlexBulankou, @ArthurKamalov, @HasonoCell, @RidPra, @SHRUTI6991, @aditya-shantanu, @aleks-stefanovic, @alimx07, @app/dependabot, @armistcxy, @arpitjain099, @chw120, @hrsh1209, @ianchakeres, @inardini, @janetkuo, @justinsb, @kannon92, @lauragalbraith, @mesutoezdil, @moficodes, @mvanhorn, @patcrombie, @rainwoodman, @rmalani-nv, @ryanzhang-oss, @sairajp-rewind, @shaikenov, @shelwinnn, @shrutiyam-glitch, @sohanpatil, @tom1299, @tomergee, @vicentefb, @volatilemolotov, @zhzhuang-zju

Full Changelog: v0.4.6...v0.5.0