Releases · volcano-sh/agentcube

Summary

AgentCube v0.1.0 is the first official release of AgentCube, a Volcano subproject that extends Kubernetes with native support for AI agent and code interpreter workloads. This release establishes the foundational architecture: a lightweight HTTP reverse proxy (Router) routes agent invocations to per-session microVM sandboxes, while a Workload Manager controls sandbox lifecycle, warm pools, and garbage collection. A minimal runtime daemon (PicoD) replaces SSH inside sandboxes, providing code execution, file operations, and JWT-based authentication over plain HTTP. Session state is stored in Redis/Valkey, so the Router can scale horizontally. Two new Kubernetes CRDs, AgentRuntime and CodeInterpreter, model agent workloads as first-class Kubernetes resources. A Python SDK, LangChain integration, and Dify plugin make AgentCube usable from popular AI frameworks.

What's New

Key Features Overview

Session-Based MicroVM Agent Routing: Stateful request routing with session affinity, backed by isolated microVM sandboxes per session
AgentRuntime and CodeInterpreter CRDs: Kubernetes-native abstractions for conversational agent and secure code interpreter workloads
Warm Pool for Fast Cold Starts: Pre-warmed sandbox pool support for CodeInterpreter, reducing invocation latency via SandboxClaim adoption
PicoD Runtime Daemon: Lightweight HTTP daemon replacing SSH inside sandboxes — code execution, file I/O, JWT authentication
JWT Security Chain (Router → PicoD): RSA-2048 key pair generated at startup; public key distributed via Kubernetes Secret and injected into sandbox pods
Dual GC Policy (Idle TTL + Max Duration): Background garbage collector enforces both idle timeout and hard maximum session duration
Python SDK and AI Framework Integrations: Out-of-the-box SDK with LangChain and Dify plugin support

Session-Based MicroVM Agent Routing

AI agent workloads are fundamentally stateful and interactive. A single agent session may span many invocations — tool calls, environment inspections, multi-step reasoning — all requiring the same isolated execution environment. Kubernetes has no native concept of persistent, identity-bound agent sessions. AgentCube fills this gap by mapping session IDs to dedicated microVM sandbox pods.

The Router is the data plane entry point. It reads the x-agentcube-session-id request header to look up an existing session in the store, or allocates a new sandbox via the Workload Manager when no session exists. Every response carries the x-agentcube-session-id header, so a client only needs to echo that value on later requests to stay on the same sandbox.

Key Capabilities:

Session affinity via header: x-agentcube-session-id header maps requests to existing sandbox pods
Transparent sandbox allocation: new sessions trigger automatic sandbox creation with no client-side configuration
Reverse proxy with path-prefix matching: path-based routing to multiple exposed sandbox ports
HTTP/2 (h2c) support: low-latency connections to sandbox endpoints
Configurable concurrency limit: MaxConcurrentRequests prevents overload
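
The lookup-or-allocate flow described above can be sketched as follows. This is a minimal illustration, not the Router's actual code: the InMemoryStore class, the allocate_sandbox function, and the endpoint format are hypothetical stand-ins for the Redis/Valkey store and the Workload Manager call, and treating an unknown session ID like a missing one is an assumption.

```python
import uuid

SESSION_HEADER = "x-agentcube-session-id"


class InMemoryStore:
    """Stand-in for the Redis/Valkey session store (hypothetical interface)."""

    def __init__(self):
        self._sessions = {}

    def get(self, session_id):
        return self._sessions.get(session_id)

    def put(self, session_id, endpoint):
        self._sessions[session_id] = endpoint


def allocate_sandbox(session_id):
    """Placeholder for the Workload Manager call; returns a made-up endpoint."""
    return f"http://sandbox-{session_id[:8]}.example.internal:8080"


def route(headers, store):
    """Return (sandbox_endpoint, response_headers) for one request.

    `headers` is assumed to be a dict with lower-cased header names.
    """
    session_id = headers.get(SESSION_HEADER)
    endpoint = store.get(session_id) if session_id else None
    if endpoint is None:
        # No known session: create one and bind it to a new sandbox.
        session_id = str(uuid.uuid4())
        endpoint = allocate_sandbox(session_id)
        store.put(session_id, endpoint)
    # The response echoes the session ID so the client can reuse it.
    return endpoint, {SESSION_HEADER: session_id}
```

A client that sends back the header value from its first response lands on the same endpoint; a request without the header gets a fresh session.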

Router Endpoints:

POST /v1/namespaces/{namespace}/agent-runtimes/{name}/invocations/*path
GET  /v1/namespaces/{namespace}/agent-runtimes/{name}/invocations/*path
POST /v1/namespaces/{namespace}/code-interpreters/{name}/invocations/*path
GET  /v1/namespaces/{namespace}/code-interpreters/{name}/invocations/*path
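
As an illustration of the endpoint layout above, the helper below assembles an invocation URL from its parts. The function name, the base URL, and the example resource names are hypothetical; only the path template comes from the endpoint list.

```python
from urllib.parse import quote


def invocation_url(base_url, kind, namespace, name, path=""):
    """Build a Router invocation URL for an AgentRuntime or CodeInterpreter."""
    if kind not in ("agent-runtimes", "code-interpreters"):
        raise ValueError(f"unknown workload kind: {kind}")
    # Percent-encode each fixed segment; keep "/" inside the trailing path.
    prefix = "/v1/namespaces/{}/{}/{}/invocations".format(
        quote(namespace, safe=""), kind, quote(name, safe="")
    )
    suffix = "/" + quote(path.lstrip("/"), safe="/")
    return base_url.rstrip("/") + prefix + suffix


print(invocation_url(
    "http://router.example:8080",   # assumed Router address
    "code-interpreters",
    "default",
    "py-sandbox",                   # assumed resource name
    "api/execute",
))
```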

Agent as First-Class Citizen in Kubernetes

Two distinct workload profiles emerge in the AI agent space: conversational/tool-using agents that need access to credentials, volumes, and custom networking; and short-lived code interpreters that require strict isolation and resource caps. Modeling both as first-class Kubernetes CRDs enables declarative configuration, RBAC integration, and GitOps-friendly workflows.

AgentRuntime (agentruntimes.runtime.agentcube.volcano.sh):

Designed for general-purpose AI agents. Accepts a full Kubernetes PodSpec template, allowing volume mounts, credential injection, sidecar containers, and custom resource requests.

spec.podTemplate — full PodSpec for sandbox pod
spec.targetPort — list of exposed ports with path prefix, port, and protocol
spec.sessionTimeout — idle session expiry (default: 15m)
spec.maxSessionDuration — hard maximum session lifetime (default: 8h)
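
To make the field list concrete, here is a sketch of an AgentRuntime manifest built as a Python dict. Several details are assumptions rather than facts from this release: the API version (v1alpha1), the nesting under podTemplate, the key names inside each targetPort entry, and the image and resource names.

```python
import json

# Hypothetical AgentRuntime manifest. Only the top-level spec field names
# (podTemplate, targetPort, sessionTimeout, maxSessionDuration) and the
# API group come from the release notes; everything else is assumed.
agent_runtime = {
    "apiVersion": "runtime.agentcube.volcano.sh/v1alpha1",  # version assumed
    "kind": "AgentRuntime",
    "metadata": {"name": "example-agent", "namespace": "default"},
    "spec": {
        "podTemplate": {
            "spec": {  # nesting assumed
                "containers": [
                    {"name": "agent", "image": "example.com/agent:latest"}
                ]
            }
        },
        "targetPort": [
            # Entry keys are assumed names for path prefix, port, protocol.
            {"pathPrefix": "/", "port": 8080, "protocol": "HTTP"}
        ],
        "sessionTimeout": "15m",       # documented default
        "maxSessionDuration": "8h",    # documented default
    },
}

print(json.dumps(agent_runtime, indent=2))
```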

CodeInterpreter (codeinterpreters.runtime.agentcube.volcano.sh):

Designed for secure, multi-tenant code execution. More locked-down than AgentRuntime, with a constrained sandbox template that restricts image, resources, and runtime class.

spec.template — CodeInterpreterSandboxTemplate (image, imagePullPolicy, resources, runtimeClassName)
spec.ports — list of exposed ports with path prefix
spec.sessionTimeout / spec.maxSessionDuration — session lifecycle bounds
spec.warmPoolSize — optional pre-warmed sandbox pool size
spec.authMode — picod (default, RSA/JWT) or none (delegate auth to sandbox)
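
The CodeInterpreter fields above could be assembled as in the sketch below. The helper name, the API version, the port entry keys, and the example image are assumptions; the two accepted authMode values and the optional warmPoolSize field follow the list above.

```python
def code_interpreter_manifest(name, image, warm_pool_size=0, auth_mode="picod"):
    """Build a CodeInterpreter manifest as a plain dict (illustrative only)."""
    if auth_mode not in ("picod", "none"):
        raise ValueError("authMode must be 'picod' or 'none'")
    if warm_pool_size < 0:
        raise ValueError("warmPoolSize must not be negative")
    spec = {
        "template": {"image": image, "imagePullPolicy": "IfNotPresent"},
        "ports": [{"pathPrefix": "/", "port": 8080}],  # entry keys assumed
        "sessionTimeout": "15m",
        "maxSessionDuration": "8h",
        "authMode": auth_mode,
    }
    if warm_pool_size:
        spec["warmPoolSize"] = warm_pool_size  # optional field, omitted when 0
    return {
        "apiVersion": "runtime.agentcube.volcano.sh/v1alpha1",  # version assumed
        "kind": "CodeInterpreter",
        "metadata": {"name": name},
        "spec": spec,
    }


manifest = code_interpreter_manifest(
    "py-sandbox", "example.com/picod-python:latest", warm_pool_size=3
)
```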

Alpha Feature Notice: APIs are under active development. Spec fields and default values may change in future releases.

Warm Pool for Fast Cold Starts

Creating a microVM sandbox from scratch on every session request incurs a cold-start penalty that is unacceptable for interactive workloads. AgentCube introduces a warm pool mechanism: the Workload Manager pre-creates a configurable number of idle Sandbox pods and keeps them ready. When an invocation arrives, the Router claims a pre-warmed pod via a SandboxClaim CR instead of waiting for a new pod to start. The pool is automatically replenished after each claim.

Key Capabilities:

spec.warmPoolSize on CodeInterpreter controls pool depth
SandboxTemplate + SandboxClaim pattern delegates pod adoption to the upstream agent-sandbox controller
Pool refills asynchronously after each claim, keeping steady-state latency low
Cold-start path remains available when pool is exhausted
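
The claim, fallback, and refill behavior can be modeled with a small in-memory simulation. This is a toy model under stated assumptions, not the SandboxClaim mechanism itself: the WarmPool class and its sandbox names are hypothetical, and refill is called explicitly here rather than running asynchronously.

```python
import itertools
from collections import deque


class WarmPool:
    """Toy model of a warm pool with a cold-start fallback."""

    def __init__(self, size):
        self.size = size
        self._ids = itertools.count(1)
        self._idle = deque(self._create() for _ in range(size))

    def _create(self):
        # Stands in for starting a sandbox pod.
        return f"sandbox-{next(self._ids)}"

    def claim(self):
        """Return (sandbox, was_warm)."""
        if self._idle:
            return self._idle.popleft(), True
        # Pool exhausted: fall back to creating a sandbox on demand.
        return self._create(), False

    def refill(self):
        """Top the pool back up to its configured depth."""
        while len(self._idle) < self.size:
            self._idle.append(self._create())

    def idle_count(self):
        return len(self._idle)
```

With a pool of size 2, the first two claims are served warm, a third claim before any refill takes the cold-start path, and a refill restores two idle sandboxes.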

PicoD — Lightweight Sandbox Runtime Daemon

Traditional code sandbox implementations use SSH to execute commands remotely. SSH carries significant overhead: key management, multiplexing negotiation, and a heavyweight protocol for what are essentially single-request RPCs. PicoD replaces SSH with a minimal HTTP/1.1 daemon that runs inside each sandbox pod, providing code execution, file I/O, and authentication via a small, auditable binary.

Key Capabilities:

Code execution (POST /api/execute): runs arbitrary commands with configurable timeout, working directory, and environment variables; returns stdout, stderr, exit code, and wall-clock duration
File upload / write (POST /api/files): supports multipart form-data and JSON/base64 content for workspace-scoped file creation and updates
File download / read (GET /api/files/*path): streams files from the sandbox workspace using path-addressed operations
Health check (GET /health): exposes an unauthenticated liveness endpoint
JWT authentication: validates RS256 tokens from the Router; rejects unauthenticated requests
Path sanitization: all paths are jailed to the configured workspace root, preventing directory traversal
32 MB request body limit with configurable workspace root via --workspace flag
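
Path jailing of the kind described above is commonly implemented by resolving the requested path against the workspace root and rejecting anything that lands outside it. The sketch below shows that general technique in Python; it is not PicoD's implementation, and the function name and the choice to treat absolute paths as workspace-relative are assumptions.

```python
from pathlib import Path


def resolve_in_workspace(workspace_root, requested_path):
    """Resolve a client-supplied path, refusing anything outside the root."""
    root = Path(workspace_root).resolve()
    # Treat the request as relative to the workspace, even if it starts with "/".
    candidate = (root / requested_path.lstrip("/")).resolve()
    # resolve() collapses ".." segments, so an escape shows up as a path
    # that is neither the root itself nor a descendant of it.
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"path escapes workspace: {requested_path}")
    return candidate
```

A request for `a/b.txt` resolves inside the workspace, while `../etc/passwd` raises PermissionError.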

JWT Security Chain (Router → PicoD)

Sandbox pods are ephemeral and may be replaced at any time, so embedding a shared secret in cluster config is fragile and hard to rotate. AgentCube instead establishes an RSA-based trust chain. The Router generates an RSA-2048 key pair at startup and stores the public key in a Kubernetes Secret (picod-router-identity). For CodeInterpreter sandboxes with authentication enabled, the Workload Manager injects that key as the PICOD_AUTH_PUBLIC_KEY environment variable; spec.authMode defaults to picod, and none disables injection. The Router signs a short-lived (5-minute) RS256 JWT for every proxied request. PicoD verifies these tokens entirely in-process, with no network round-trip and no shared database.

Key Capabilities:

RSA-2048 key pair auto-generated at Router startup
Public key distributed via picod-router-identity Kubernetes Secret
Workload Manager injects public key into CodeInterpreter sandbox env when spec.authMode is not none
5-minute token expiry limits blast radius of token leakage
PicoD rejects any request without a valid Router-issued JWT
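
The 5-minute lifetime can be illustrated with the standard JWT iat and exp claims. The sketch below covers only the expiry arithmetic; RS256 signing and verification are deliberately omitted and would normally be handled by a JWT library. Which claims the Router actually sets is not stated in these notes, so the claim set here is an assumption.

```python
import time

TOKEN_TTL_SECONDS = 5 * 60  # 5-minute lifetime from the release notes


def new_claims(now=None):
    """Build the time-related claims for a freshly issued token."""
    issued_at = int(time.time() if now is None else now)
    return {"iat": issued_at, "exp": issued_at + TOKEN_TTL_SECONDS}


def is_expired(claims, now=None):
    """A token is rejected once the current time reaches its exp claim."""
    current = time.time() if now is None else now
    return current >= claims["exp"]
```

A token issued at t=1000 carries exp=1300, is still accepted at t=1299, and is rejected from t=1300 onward, so a leaked token is usable for at most five minutes.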

Sandbox Lifecycle Management and GC

Agent sessions that complete their work or are abandoned by clients must be automatically reclaimed to avoid resource exhaustion. AgentCube implements a dual garbage collection policy enforced by a background loop in the Workload Manager:

Idle timeout: sandboxes inactive beyond spec.sessionTimeout (default 15m) are deleted
Hard max TTL: sandboxes older than spec.maxSessionDuration (default 8h) are deleted regardless of activity

Key Capabilities:

Configurable GC interval in the Workload Manager
UpdateSessionLastActivity store operation to reset idle timer on each invocation
ListExpiredSandboxes and ListInactiveSandboxes store queries feed the GC loop
Workload Manager deletes Sandbox / SandboxClaim CRs and removes store records atomically
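
The dual policy reduces to two independent age checks per sandbox, sketched below. The SandboxRecord type and the gc_reason function are hypothetical names for illustration; the defaults mirror the documented 15m idle timeout and 8h maximum duration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SandboxRecord:
    session_id: str
    created_at: datetime
    last_activity: datetime


def gc_reason(record, now,
              session_timeout=timedelta(minutes=15),
              max_session_duration=timedelta(hours=8)):
    """Return why a sandbox should be collected, or None to keep it."""
    # Hard cap: applies even if the session is still active.
    if now - record.created_at > max_session_duration:
        return "max-duration"
    # Idle cap: measured from the most recent invocation.
    if now - record.last_activity > session_timeout:
        return "idle"
    return None
```

A sandbox created nine hours ago is collected even if it was used a minute ago, while a one-hour-old sandbox is collected only after more than 15 minutes without activity.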

Other Notable Changes

Features and Enhancements

Python SDK (agentcube-sdk): CodeInterpreterClient with execute_command(), run_code(language, code), upload_file(), download_file(), write_file(); session lifecycle managed automatically
LangChain integration: CodeInterpreterClient can be wrapped as a @tool and wired into LangGraph ReAct agents — see [devguide](https://github.com/volcano-sh/agentcube/blob/main/docs/devguide/code-interpreter-using-...
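
As a rough illustration of the SDK surface listed above, the helper below calls run_code(language, code) on any client-like object. The constructor arguments and return type of the real CodeInterpreterClient are not shown in these notes, so the example uses a hypothetical FakeClient stand-in, and the "python" language string is an assumed value.

```python
def run_snippet(client, code):
    """Run Python source through any object exposing run_code(language, code)."""
    return client.run_code("python", code)


class FakeClient:
    """Test double standing in for the SDK client (hypothetical)."""

    def __init__(self):
        self.calls = []

    def run_code(self, language, code):
        # Record the call instead of contacting a sandbox.
        self.calls.append((language, code))
        return f"[{language}] would run: {code}"


client = FakeClient()
result = run_snippet(client, "print(1 + 1)")
```

Writing the helper against the method signature rather than a concrete class keeps it testable without a running cluster.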

AgentCube v0.1.0

v0.1.0-rc.0

v0.1.0-alpha.0
