github TencentCloud/CubeSandbox v0.3.0

2026.06.02 Release v0.3.0

CubeSandbox 0.3.0 introduces CubeCoW, a Copy-on-Write snapshot engine that brings hundred-millisecond snapshot, clone, and rollback capabilities to AI Agent sandboxes. This release also adds the AgentHub digital assistant console (Preview), a Web UI for visual management, and the Go SDK. With 82 commits from 22 contributors, v0.3.0 is the largest release since open-sourcing.

🎯 Major Features

CubeCoW: Snapshot, Clone & Rollback

  • CubeCoW Copy-on-Write snapshot engine (#360): A full-lifecycle CoW snapshot engine using reflink-based volume snapshots, providing efficient block-level snapshot operations for sandbox volume management. Create lightweight, space-efficient checkpoints at any moment.
  • Soft-dirty incremental memory snapshots (#389): Per-cycle incremental memory snapshots via soft-dirty page tracking, dramatically reducing snapshot time and storage for repeated snapshot cycles. Only dirty pages are captured after the first full snapshot.
  • Snapshot restore & vsock handling (#388): VSOCK connections are properly reset on snapshot restore, ensuring clean connection state after rollback.
  • Snapshot I/O optimization (#400): Removed unnecessary sync_all() calls from all snapshot write paths, significantly reducing snapshot write latency without compromising data integrity.
  • Demo suite & developer guide (#374): A complete demo suite and step-by-step guide covering snapshot, rollback, and clone workflows, with runnable examples.
  • Host-mount pause snapshot restore fix: Fixed snapshot restore for sandboxes with paused host-mount filesystems.
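
The incremental scheme described for #389 can be illustrated with a toy model. On Linux, the soft-dirty flag is reported in bit 55 of each `/proc/<pid>/pagemap` entry and is reset by writing `4` to `/proc/<pid>/clear_refs`. The sketch below does not touch those files; it simulates dirty tracking with an in-memory set. `ToyGuestMemory`, `restore`, and the page count are hypothetical names chosen for illustration, not CubeSandbox APIs.

```python
PAGE_SIZE = 4096
SOFT_DIRTY_BIT = 1 << 55  # bit 55 of a /proc/<pid>/pagemap entry on Linux


def is_soft_dirty(pagemap_entry: int) -> bool:
    """Return True if a raw 64-bit pagemap entry has the soft-dirty bit set."""
    return bool(pagemap_entry & SOFT_DIRTY_BIT)


class ToyGuestMemory:
    """Hypothetical stand-in for guest memory with per-page dirty tracking."""

    def __init__(self, num_pages: int) -> None:
        self.pages = [bytes(PAGE_SIZE) for _ in range(num_pages)]
        # Before the first snapshot every page counts as dirty.
        self.dirty = set(range(num_pages))

    def write(self, page_no: int, data: bytes) -> None:
        """Overwrite one page (zero-padded) and mark it dirty."""
        self.pages[page_no] = data.ljust(PAGE_SIZE, b"\0")[:PAGE_SIZE]
        self.dirty.add(page_no)

    def snapshot(self) -> dict[int, bytes]:
        """Capture only dirty pages, then reset tracking (like clear_refs)."""
        layer = {n: self.pages[n] for n in sorted(self.dirty)}
        self.dirty.clear()
        return layer


def restore(layers: list[dict[int, bytes]], num_pages: int) -> list[bytes]:
    """Rebuild memory by replaying layers oldest-first; later layers win."""
    pages = [bytes(PAGE_SIZE)] * num_pages
    for layer in layers:
        for page_no, data in layer.items():
            pages[page_no] = data
    return pages


mem = ToyGuestMemory(num_pages=8)
full = mem.snapshot()          # first cycle: all 8 pages are captured
mem.write(3, b"agent state")
incremental = mem.snapshot()   # second cycle: only page 3 is captured
restored = restore([full, incremental], num_pages=8)
```

In this model the second snapshot stores one page instead of eight, which is the effect the release note attributes to capturing only dirty pages after the first full snapshot.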

AgentHub Digital Assistant Console (Preview)

  • AgentHub API & UI (#420): A complete digital assistant console built on top of CubeSandbox. Includes:
    • AgentHub persistence layer and assistant lifecycle management
    • OpenClaw setup integration for AI agent orchestration
    • Snapshot timeline with visual checkpoint creation and rollback
    • Clone sandboxes into parallel exploration environments
    • Template actions for reusable assistant configurations
    • Model settings and WeCom notification configuration
    • Full i18n support (English & Chinese)

Web UI

  • Management dashboard (#299): A browser-accessible Web UI for managing sandboxes, templates, and cluster nodes. Includes a template store for browsing and deploying pre-built sandbox images. No CLI needed for common operations.

🛠️ SDK

Python SDK (v0.2.1)

  • Template creation API (#365): Create sandbox templates programmatically from Python, enabling end-to-end automation without shell commands.
  • envd process API migration (#1676a0fc): Commands now run through the envd process API, improving reliability and consistency.
  • Process exit edge-case handling (#a210dfc2): Fixed edge cases in envd process lifecycle management, preventing hung commands.
  • Stderr coverage & file fallback hardening (#9e2c64ab): Improved error output capture and hardened file operation fallback paths.
  • envd defaults & network policy alignment (#418): Aligned envd service defaults and network policy configuration with the server-side defaults.

Go SDK (New)

  • Initial Go SDK release (#5de861ac, #3b5caf29): A complete Go SDK providing typed API bindings for sandbox lifecycle management, enabling Go applications to create, manage, and destroy sandboxes natively.

✨ Enhancements

Deployment

  • Systemd-based one-click deployment (#331): The one-click installer now manages all services through systemd, providing proper service supervision, automatic restart on failure, and systemctl integration.
  • Docker Compose container lifecycle (#386): Container lifecycle management migrated to Docker Compose, simplifying multi-container orchestration and improving restart behavior.
  • Early pre-download checks (#288): The online installer now validates network connectivity and disk space before downloading, preventing mid-installation failures.
  • Health check & diagnostic scripts (#305): New check.sh and collect-logs.sh scripts for one-click deployment health verification and log collection.
  • Cgroup v2 CPU controller preflight check (#367): The installer detects missing cgroup v2 CPU controller support and provides actionable guidance before proceeding.
  • Network-agent readiness wait (#304): The installer now waits for network-agent to be fully ready before proceeding, eliminating race conditions during initial setup.
  • Docker bind-mount directory prevention (#417): Docker is now prevented from auto-creating directories at paths that are meant to be bind-mounted files, so file mounts behave correctly.
  • Guest image optimization (#347): Ext4 images are now shrunk after creation and the Dockerfile is optimized, reducing image size and pull time.
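
The preflight checks in #288 and #367 can be approximated with a short sketch. It assumes a cgroup v2 unified hierarchy mounted at `/sys/fs/cgroup`, where the root `cgroup.controllers` file lists available controllers separated by spaces. The function names and the 10 GiB threshold are assumptions for illustration; the installer's actual checks and limits are not described in these notes.

```python
import shutil
from pathlib import Path

# Root controller listing of the cgroup v2 unified hierarchy.
CGROUP_CONTROLLERS = Path("/sys/fs/cgroup/cgroup.controllers")


def has_controller(controllers_text: str, name: str = "cpu") -> bool:
    """Check a space-separated controller listing for an exact name."""
    # Split into tokens so that "cpuset" is not mistaken for "cpu".
    return name in controllers_text.split()


def enough_disk(path: str, required_bytes: int) -> bool:
    """Return True if the filesystem holding `path` has enough free space."""
    return shutil.disk_usage(path).free >= required_bytes


def preflight(install_dir: str = "/", required_bytes: int = 10 * 1024**3) -> list[str]:
    """Collect readable problems instead of stopping at the first one.

    The 10 GiB default is a placeholder, not the installer's real limit.
    """
    problems = []
    if not CGROUP_CONTROLLERS.exists():
        problems.append("cgroup v2 unified hierarchy not found at /sys/fs/cgroup")
    elif not has_controller(CGROUP_CONTROLLERS.read_text()):
        problems.append("cgroup v2 cpu controller is not enabled")
    if not enough_disk(install_dir, required_bytes):
        problems.append(f"less than {required_bytes} bytes free under {install_dir}")
    return problems
```

Returning a list of problems lets a caller print all actionable guidance at once before any download starts.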

Infrastructure

  • Centralized schema migration (#385): CubeMaster now uses goose for database schema migrations, enabling versioned, automated schema management across upgrades.
  • Node resource reporting (#382): Cubelet now reports allocated node resources (CPU, memory, disk) to CubeMaster, enabling cluster-wide resource awareness.
  • Path-based sandbox routing (#334): CubeProxy supports path-based sandbox routing and shared backend resolution, improving routing flexibility.
  • Scheduler metrics (#326, #301): Cubelet exposes scheduler metrics as Prometheus gauges on /v1/metrics, enabling real-time monitoring of sandbox scheduling and resource utilization.
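
A consumer of the `/v1/metrics` endpoint from #326 and #301 would receive the Prometheus text exposition format. The sketch below parses gauges out of such text. The metric names in `SAMPLE` are hypothetical (the notes do not list the real ones), and the parser is deliberately simplified: it does not handle label values that contain `}` or trailing timestamps.

```python
import re

# Hypothetical sample; the metric names Cubelet actually exposes may differ.
SAMPLE = """\
# HELP cubelet_scheduler_running_sandboxes Sandboxes scheduled on this node.
# TYPE cubelet_scheduler_running_sandboxes gauge
cubelet_scheduler_running_sandboxes 12
# TYPE cubelet_scheduler_allocated_cpu_millicores gauge
cubelet_scheduler_allocated_cpu_millicores{node="node-a"} 3500
"""

# metric name, optional {labels} block, then the sample value
SAMPLE_LINE = re.compile(r"^([A-Za-z_:][A-Za-z0-9_:]*)(\{[^}]*\})?\s+(\S+)")


def parse_gauges(text: str) -> dict[str, float]:
    """Return {series: value} for every sample whose metric is typed as a gauge."""
    gauge_names = set()
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# TYPE"):
            parts = line.split()
            if len(parts) == 4 and parts[3] == "gauge":
                gauge_names.add(parts[2])
            continue
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/comment lines
        match = SAMPLE_LINE.match(line)
        if match and match.group(1) in gauge_names:
            series = match.group(1) + (match.group(2) or "")
            values[series] = float(match.group(3))
    return values


gauges = parse_gauges(SAMPLE)
```

In practice a Prometheus server would scrape the endpoint directly; a hand-rolled parser like this is only useful for quick checks or tests.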

🐛 Bug Fixes

  • Pause/resume state convergence (#404): Fixed pause/resume state drift on ttrpc errors and shim events, ensuring consistent sandbox lifecycle state.
  • Shim readiness handshake (#398): Fixed the shim readiness handshake by not redirecting stdout (fd 1), preventing silent initialization failures.
  • Network resource leak (#314): Resolved a network resource leak during sandbox creation that could exhaust available network interfaces over time.
  • Host-mount cleanup (#333): Host-mount directories are now properly cleaned up after sandbox destruction, preventing disk space accumulation.
  • Cloud Hypervisor disk API (#337): Fixed incorrect Cloud Hypervisor disk API endpoint usage that could cause disk operation failures.
  • Template commit idempotency (#336): Enforced requestID uniqueness and added idempotent commit reuse, eliminating duplicate template commits from retried requests.
  • Config parsing (#396): Fixed NodeStatusUpdateFrequency to use tomlext.Duration for correct TOML duration parsing.
  • Input validation (#344): Added input validation at command-execution call sites to catch invalid parameters early.
  • Concurrent DNS handling (#363): Concurrent creation of the DNS dummy link is now tolerated, preventing race-condition failures during parallel sandbox creation.
  • PMEM boundary alignment (#351): Shrunk guest images are now aligned to pmem boundary, fixing boot failures on certain configurations.
  • Quickcheck readiness (#349): Deployment now waits for quickcheck containers to be ready before proceeding, eliminating spurious health check failures.
  • Service binding security (#269): MySQL/Redis now bind to localhost by default, and CubeProxy uses host networking for improved network security.
  • Service startup ordering (#346): cube-proxy.service is now ordered after cube-sandbox-dns.service, preventing DNS resolution failures at startup.
  • Image digest handling (#303): Stripped canonical prefix from image digests in the template center, fixing image reference mismatches.
  • Paused state reporting (#270): Fixed paused sandbox state in list responses, ensuring accurate sandbox status display.
  • Build version injection (#327): Build version info is now properly injected via ldflags for cubelet and cubemaster binaries.
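
The idempotent commit reuse described for #336 follows a common pattern: key each commit by its request ID and return the stored result when the same request is retried. The sketch below is a minimal in-memory illustration. `ToyCommitStore` and `RequestConflict` are hypothetical names, and rejecting a reused ID with a different payload is an assumption about how uniqueness could be enforced, not a description of the actual CubeSandbox behavior.

```python
import threading
import uuid


class RequestConflict(Exception):
    """Raised when a request ID is reused with a different payload."""


class ToyCommitStore:
    """Hypothetical in-memory store keyed by request ID."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._by_request = {}  # request_id -> (payload, commit_id)

    def commit(self, request_id: str, payload: dict) -> str:
        """Create a commit once per request ID; retries get the same commit."""
        with self._lock:  # check-and-insert must be atomic under concurrency
            existing = self._by_request.get(request_id)
            if existing is not None:
                previous_payload, commit_id = existing
                if previous_payload != payload:
                    raise RequestConflict(request_id)
                return commit_id  # retried request: reuse, do not duplicate
            commit_id = uuid.uuid4().hex
            self._by_request[request_id] = (dict(payload), commit_id)
            return commit_id


store = ToyCommitStore()
first = store.commit("req-1", {"template": "python-3.12"})
retry = store.commit("req-1", {"template": "python-3.12"})  # same ID, same payload
other = store.commit("req-2", {"template": "python-3.12"})  # new ID, new commit
```

Here `first` and `retry` refer to the same commit, so a client that retries after a timeout does not create a duplicate template.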

🔒 Security

  • Prometheus upgrade (#328): Upgraded prometheus client to 0.14.0, dropping the vulnerable protobuf 2.28.0 dependency.
  • reqwest upgrade (#323): Upgraded reqwest to 0.12 in CubeAPI, fixing the rustls-webpki CVE.
  • libseccomp upgrade (#321): Upgraded libseccomp to 0.3.0, fixing GHSA-2r23-gqr7-wr4h.
  • go-jose bump (#320): Bumped go-jose/v4 to the latest secure version.
  • gRPC dependency bump (#316): Updated gRPC dependency in CubeMaster.

📚 Documentation

  • Changelog restructure (#412, #416): Changelogs are now organized into per-version files with an index page for easier navigation. Fixed broken changelog links in README.
  • Performance benchmark blog (#419): Published a detailed performance benchmark post with reproducible bench scripts, covering startup latency and resource overhead metrics.
  • Blog system (#306, #340): Added a blog system with local search and maintainer guide. Published community posts including "From Serverless to Agent" and PVM deployment walkthroughs.
  • Brand identity (#329): Added official logo and favicon to the documentation site.
  • Troubleshooting guides (#313): New bilingual troubleshooting subpages for deployment and template creation issues.
  • Docs cross-reference fixes (#372): Added missing .md extensions to cross-file documentation references.
  • Example & tutorial fixes (#406, #407, #377): Fixed probe path in create-from-image tutorial, standardized placeholder API keys with e2b_ prefix, and corrected clone state documentation.

⚙️ Engineering Improvements

  • Kernel source migration (#395): Migrated kernel source from Gitee to CNB with enhanced extraction logic.
  • CI/CD hardening (#330, #335, #338, #393): Added a docs build check workflow, fixed the CR workflow, enabled auto-review for external PRs, added a default shell configuration, and optimized artifact retrieval.
  • Deprecated API removal (#339): Removed deprecated rand.Seed calls across the codebase.
  • Rust dependency refresh (#9f8df42f): Bumped crossbeam-channel from 0.5.13 to 0.5.15 in the hypervisor crate.

Full Changelog: v0.2.2...v0.3.0
