Cloudflare Sandbox Workspace

The CloudflareSandboxWorkspace is the full-featured Cloudflare provider — a real Linux container (Workers Container, Firecracker microVM) backing a complete fs + shell + code + snapshot workspace surface. Use this when your agent needs to execute untrusted code, run shell commands, or snapshot state.

When to use

  • Agents that need to execute code (Python, JavaScript) the LLM produces.
  • Agents that need a real shell (grep, find, git, npm install).
  • Agents that need to snapshot state for branch/restore patterns.
  • Any production workload where untrusted input might end up in shell commands or code.

If you only need durable file storage (no shell, no code), use the lighter Cloudflare Filestore instead.

Capabilities supported

| Capability | Supported |
| --- | --- |
| fs | ✅ |
| shell | ✅ (with real-time stdout/stderr streaming) |
| code | ✅ (Python, JavaScript; optional persistent contexts) |
| snapshot | ✅ (R2-backed; restore creates a NEW sandbox) |

Optional peer dependency

@cloudflare/sandbox is heavy (Cloudflare Containers + sandbox runtime). It's an OPTIONAL peer dep on @helix-agents/runtime-cloudflare — users who only want the filestore workspace pay zero install cost.

```bash
npm install @cloudflare/sandbox@0.8.11
```

The version is pinned EXACTLY (no caret) because the package is currently experimental and the API has been moving in 0.x. Bump consciously when you need a new version.
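If you want to enforce the exact pin mechanically, for example in a CI script, a small check on the version specifier is enough. The sketch below is illustrative only: `isExactPin` is a hypothetical helper, not part of any package, and it recognizes only plain `major.minor.patch` versions with an optional prerelease suffix.

```typescript
// Hypothetical helper: true only for an exact version such as "0.8.11".
// Range specifiers ("^0.8.11", "~0.8.11", ">=0.8", "0.8.x", "*") are rejected.
function isExactPin(versionSpec: string): boolean {
  return /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/.test(versionSpec.trim());
}

// Example: the dependency entry as it would appear in a parsed package.json.
const dependencies: Record<string, string> = {
  '@cloudflare/sandbox': '0.8.11',
};

const sandboxSpec = dependencies['@cloudflare/sandbox'] ?? '';
console.log(isExactPin(sandboxSpec)); // true
console.log(isExactPin('^0.8.11')); // false
```

If you want your own package.json to record the version without a range prefix, npm's `--save-exact` flag does that at install time.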

Wrangler setup

The sandbox lives in its OWN Durable Object, separate from the agent DO. Declare the agent DO, the Sandbox DO, the container binding, and (optionally) an R2 bucket for backups.

```toml
# Agent DO (your AgentServer subclass)
[[durable_objects.bindings]]
name = "AGENTS"
class_name = "MyAgentServer"

# Sandbox DO (re-exported from @cloudflare/sandbox)
[[durable_objects.bindings]]
name = "SANDBOX"
class_name = "Sandbox"

# Container binding (Workers Container)
[[containers]]
class_name = "Sandbox"
image = "./Dockerfile"        # User-supplied; copy from @cloudflare/sandbox repo
max_instances = 5

[[migrations]]
tag = "v1"
new_sqlite_classes = ["MyAgentServer", "Sandbox"]

# Optional: R2 bucket for snapshot/restore
[[r2_buckets]]
binding = "BACKUPS"
bucket_name = "my-sandbox-backups"
```

The Dockerfile must be supplied by you — copy the reference from the @cloudflare/sandbox repo. It supports several variants (default, python, opencode, desktop).
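The three places where the `Sandbox` class name appears in the Wrangler config have to agree. As a rough illustration, the sketch below checks a parsed config for that agreement. The `WranglerConfig` shape covers only the fields shown in this section, and `findWiringProblems` is a hypothetical helper; it mirrors the example config and is not an official validation rule.

```typescript
// Minimal shape of the wrangler.toml fields used in this section, after parsing.
interface WranglerConfig {
  durable_objects: { bindings: { name: string; class_name: string }[] };
  containers: { class_name: string; image: string; max_instances?: number }[];
  migrations: { tag: string; new_sqlite_classes?: string[] }[];
}

// Hypothetical check: each container class should have a Durable Object
// binding and be listed in some migration's new_sqlite_classes.
function findWiringProblems(config: WranglerConfig): string[] {
  const bound = new Set(config.durable_objects.bindings.map((b) => b.class_name));
  const migrated = new Set<string>();
  for (const migration of config.migrations) {
    for (const cls of migration.new_sqlite_classes ?? []) migrated.add(cls);
  }
  const problems: string[] = [];
  for (const container of config.containers) {
    if (!bound.has(container.class_name)) {
      problems.push(`${container.class_name}: no durable_objects binding`);
    }
    if (!migrated.has(container.class_name)) {
      problems.push(`${container.class_name}: missing from new_sqlite_classes`);
    }
  }
  return problems;
}

// The same values as the TOML example in this section.
const wranglerConfig: WranglerConfig = {
  durable_objects: {
    bindings: [
      { name: 'AGENTS', class_name: 'MyAgentServer' },
      { name: 'SANDBOX', class_name: 'Sandbox' },
    ],
  },
  containers: [{ class_name: 'Sandbox', image: './Dockerfile', max_instances: 5 }],
  migrations: [{ tag: 'v1', new_sqlite_classes: ['MyAgentServer', 'Sandbox'] }],
};

console.log(findWiringProblems(wranglerConfig).length); // 0
```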

Worker re-export

Wrangler needs the Sandbox class to be exported from your Worker entry:

```typescript
// worker.ts
export { Sandbox } from '@cloudflare/sandbox';
export { MyAgentServer } from './my-agent-server.js';
```

If you want preview URLs from inside the sandbox, also call proxyToSandbox in your fetch handler — see the Cloudflare Sandbox SDK docs for details.
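One way to slot the preview-URL proxy in front of your normal routing is a small wrapper. This is a sketch under assumptions: it takes the proxy function as a parameter so it stands alone, and it assumes `proxyToSandbox(request, env)` resolves to a `Response` for preview-URL requests and to `null` otherwise. Check the SDK docs for the version you have pinned. `withSandboxProxy` and `routeToAgent` are hypothetical names.

```typescript
// Assumed shape of proxyToSandbox from @cloudflare/sandbox.
type ProxyFn<Env> = (request: Request, env: Env) => Promise<Response | null>;
type Handler<Env> = (request: Request, env: Env) => Promise<Response>;

// Try the sandbox proxy first; fall through to the normal handler otherwise.
function withSandboxProxy<Env>(proxy: ProxyFn<Env>, next: Handler<Env>): Handler<Env> {
  return async (request, env) => {
    const proxied = await proxy(request, env);
    return proxied ?? next(request, env);
  };
}

// In worker.ts this could be wired roughly as:
//   import { proxyToSandbox } from '@cloudflare/sandbox';
//   export default { fetch: withSandboxProxy(proxyToSandbox, routeToAgent) };
// where routeToAgent is your existing request handler.
```

Injecting the proxy function keeps the wrapper easy to unit test with a stub, without starting a container.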

Provider config

```typescript
interface CloudflareSandboxWorkspaceConfig {
  kind: 'cloudflare-sandbox';
  /** Override the sandbox ID. Defaults to the session ID. */
  id?: string;
  /** R2 binding name for backups. Required if capabilities.snapshot is true. */
  backupR2Binding?: string;
  /** Hostname for preview URLs (reserved; not surfaced in v1 modules). */
  hostname?: string;
  /** When true, close() calls sandbox.destroy(). Default: false (relies on sleepAfter). */
  destroyOnClose?: boolean;
  /** Idle timeout. Forwarded to @cloudflare/sandbox's sleepAfter. Default: '10m'. */
  sleepAfter?: string | number;
  /** Working directory inside the container. Default: '/workspace'. */
  workspaceDir?: string;
  /** Directory snapshot() archives. Defaults to workspaceDir. */
  snapshotDir?: string;
  /** Env vars forwarded into the container. */
  envVars?: Record<string, string>;
  /** Languages exposed by the code interpreter. Default: ['python', 'javascript']. */
  codeLanguages?: readonly string[];
  /** Whether the code interpreter supports persistent contexts. Default: false. */
  codeStateful?: boolean;
}
```
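To make the documented defaults concrete, the sketch below fills them in for a partial config. It repeats a subset of the interface so it stands alone. `withDefaults` is a hypothetical helper and not necessarily how the provider resolves defaults internally.

```typescript
// Subset of CloudflareSandboxWorkspaceConfig, repeated so the sketch stands alone.
interface SandboxConfigSubset {
  kind: 'cloudflare-sandbox';
  destroyOnClose?: boolean;
  sleepAfter?: string | number;
  workspaceDir?: string;
  snapshotDir?: string;
  codeLanguages?: readonly string[];
  codeStateful?: boolean;
}

// Fill in the defaults listed in the doc comments.
// Note that snapshotDir falls back to the resolved workspaceDir.
function withDefaults(config: SandboxConfigSubset) {
  const workspaceDir = config.workspaceDir ?? '/workspace';
  return {
    ...config,
    destroyOnClose: config.destroyOnClose ?? false,
    sleepAfter: config.sleepAfter ?? '10m',
    workspaceDir,
    snapshotDir: config.snapshotDir ?? workspaceDir,
    codeLanguages: config.codeLanguages ?? ['python', 'javascript'],
    codeStateful: config.codeStateful ?? false,
  };
}

const resolvedConfig = withDefaults({ kind: 'cloudflare-sandbox', workspaceDir: '/srv/app' });
console.log(resolvedConfig.snapshotDir); // "/srv/app"
console.log(resolvedConfig.sleepAfter); // "10m"
```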

Provider wiring

```typescript
import {
  AgentRegistry,
  createAgentServer,
  CloudflareSandboxWorkspaceProvider,
} from '@helix-agents/runtime-cloudflare';
import type { WorkspaceProvider } from '@helix-agents/core';
import type { Sandbox } from '@cloudflare/sandbox';
export { Sandbox } from '@cloudflare/sandbox';

interface Env {
  AGENTS: DurableObjectNamespace;
  SANDBOX: DurableObjectNamespace<Sandbox>;
  BACKUPS?: R2Bucket;
  OPENAI_API_KEY: string;
}

// `registry` is your AgentRegistry instance, built elsewhere in this module.
export const MyAgentServer = createAgentServer<Env>({
  llmAdapter: (env) => /* ... */,
  agents: registry,
  workspaceProviders: (env) =>
    new Map<string, WorkspaceProvider>([
      [
        'cloudflare-sandbox',
        new CloudflareSandboxWorkspaceProvider({
          namespace: env.SANDBOX,
          // Keys here are the names that config.backupR2Binding refers to.
          backupBuckets: env.BACKUPS ? { BACKUPS: env.BACKUPS } : undefined,
        }),
      ],
    ]),
});
```

The provider takes { namespace, backupBuckets? } — the namespace points at the Sandbox DO, and backupBuckets is a name → bucket map used to resolve config.backupR2Binding at open() time.
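The name-to-bucket lookup can be pictured as follows. This is a sketch, not the provider's code: `BucketLike` stands in for `R2Bucket`, `resolveBackupBucket` is a hypothetical name, and throwing on an unknown binding name is an assumption about reasonable behavior.

```typescript
// Stand-in for the R2Bucket type so the sketch compiles on its own.
interface BucketLike {
  readonly label: string;
}

// Hypothetical lookup: map config.backupR2Binding to an entry of backupBuckets.
function resolveBackupBucket(
  backupR2Binding: string | undefined,
  backupBuckets: Record<string, BucketLike> | undefined,
): BucketLike | undefined {
  if (backupR2Binding === undefined) return undefined; // snapshots not configured
  const bucket = backupBuckets?.[backupR2Binding];
  if (!bucket) {
    throw new Error(`No bucket registered under binding name "${backupR2Binding}"`);
  }
  return bucket;
}

const exampleBuckets = { BACKUPS: { label: 'my-sandbox-backups' } };
console.log(resolveBackupBucket('BACKUPS', exampleBuckets)?.label); // "my-sandbox-backups"
console.log(resolveBackupBucket(undefined, exampleBuckets)); // undefined
```

The practical consequence is that the string in `backupR2Binding` must match a key you passed in `backupBuckets`, which in the wiring example is `'BACKUPS'`.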

Lifecycle

  • open() — calls getSandbox(namespace, sandboxId) to obtain a stub. Configures sleepAfter and envVars if specified. Constructs all four module adapters (fs, shell, code, snapshot) regardless of capabilities — declared capabilities drive tool injection, not module construction.
  • resolve() — re-attaches via getSandbox(namespace, sandboxId). Sandbox DO is INDEPENDENT of the agent DO, so sessions survive agent-DO hibernation cleanly: agent wakes, calls resolve(), gets a stub to the same persistent sandbox.
  • close() — by default a no-op (the sandbox idle-shuts-down via sleepAfter). With destroyOnClose: true, calls sandbox.destroy() to permanently tear the container down (one-shot agent run pattern).
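The lifecycle above can be sketched with a fake sandbox stub. Everything here is illustrative: `SandboxStub`, `attachWorkspace`, and `fakeGetSandbox` are hypothetical stand-ins, and the real provider also configures `sleepAfter`, `envVars`, and the module adapters, which this sketch omits.

```typescript
// Minimal stand-in for the stub returned by getSandbox(namespace, sandboxId).
interface SandboxStub {
  readonly id: string;
  destroy(): Promise<void>;
}
type GetSandbox = (sandboxId: string) => SandboxStub;

interface WorkspaceHandle {
  readonly sandbox: SandboxStub;
  close(): Promise<void>;
}

// open() and resolve() both come down to "get a stub for this ID".
function attachWorkspace(
  getSandbox: GetSandbox,
  sandboxId: string,
  destroyOnClose = false,
): WorkspaceHandle {
  const sandbox = getSandbox(sandboxId);
  return {
    sandbox,
    async close() {
      // Default: no-op; the container sleeps on its own after sleepAfter.
      if (destroyOnClose) await sandbox.destroy();
    },
  };
}

// Fake lookup that records which sandbox IDs were destroyed.
const destroyedIds = new Set<string>();
const fakeGetSandbox: GetSandbox = (id) => ({
  id,
  destroy: async () => {
    destroyedIds.add(id);
  },
});

async function demo(): Promise<void> {
  const first = attachWorkspace(fakeGetSandbox, 'session-1');
  await first.close(); // no-op, sandbox stays addressable
  const again = attachWorkspace(fakeGetSandbox, 'session-1'); // like resolve() after hibernation
  console.log(again.sandbox.id === first.sandbox.id); // true

  const oneShot = attachWorkspace(fakeGetSandbox, 'session-2', true);
  await oneShot.close();
  console.log(destroyedIds.has('session-2')); // true
}

void demo();
```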

Cost notes

  • Container cold start is ~2–3 seconds on first request (Firecracker microVM boot).
  • sleepAfter controls when the container suspends after idle. Default '10m'. Lower values save money; higher values reduce cold-start latency.
  • destroyOnClose: true kills the container at session end. Use for one-shot workloads (agent runs once, never resumed). Default false (preserve container for fast resume).
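A rough way to reason about the `sleepAfter` trade-off is to count how many times a session would hit a cold start for a given request pattern. The model below is a deliberate simplification: it assumes the container sleeps as soon as an idle gap exceeds `sleepAfter` and ignores billing granularity. `countColdStarts` is a hypothetical helper.

```typescript
// Count container starts for one session: the first request always starts the
// container, and so does any request that arrives after an idle gap longer
// than sleepAfter.
function countColdStarts(
  idleGapsSeconds: readonly number[],
  sleepAfterSeconds: number,
): number {
  let starts = 1;
  for (const gap of idleGapsSeconds) {
    if (gap > sleepAfterSeconds) starts += 1;
  }
  return starts;
}

const idleGaps = [30, 900, 45, 1500]; // seconds between consecutive requests
console.log(countColdStarts(idleGaps, 600)); // 3 (the default '10m')
console.log(countColdStarts(idleGaps, 1800)); // 1 (30 minutes keeps it warm throughout)
console.log(countColdStarts(idleGaps, 20)); // 5 (every request pays the cold start)
```

At roughly 2–3 seconds per cold start, the count gives a quick estimate of added latency per session; the container time you pay for moves in the opposite direction.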

Snapshot semantics

snapshot() calls sandbox.createBackup({ dir: snapshotDir }) which archives the directory to R2.

restore(ref) and branch(ref) create a NEW sandbox ID ({originId}-restored-{shortId} or -branch-), obtain a stub to that new sandbox, and call restoreBackup on it. Both return a fresh WorkspaceRef pointing at the new sandbox — Pattern 3e treats snapshots as forks rather than mutations.

The original sandbox is unchanged after a restore/branch. See the Snapshotter module for the full semantics.

backupR2Binding is required when declaring capabilities.snapshot: true. Without it, snapshot() throws at call time.
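The fork behavior can be modeled in memory. This sketch uses plain maps in place of sandboxes and R2, and reads the ID scheme above as `{originId}-branch-{shortId}` for branches; how the provider generates `shortId` is not specified here, so it is passed in. All names are hypothetical.

```typescript
type Files = Record<string, string>;
type ForkMode = 'restored' | 'branch';

const sandboxFiles = new Map<string, Files>(); // sandboxId -> files
const backupStore = new Map<string, { originId: string; files: Files }>(); // stands in for R2

// snapshot(): archive a copy of the sandbox's files under backupId.
function snapshotSketch(sandboxId: string, backupId: string): void {
  const files = sandboxFiles.get(sandboxId) ?? {};
  backupStore.set(backupId, { originId: sandboxId, files: { ...files } });
}

// restore()/branch(): never touch the origin; create a NEW sandbox from the backup.
function forkSketch(backupId: string, mode: ForkMode, shortId: string): string {
  const backup = backupStore.get(backupId);
  if (!backup) throw new Error(`Unknown backup "${backupId}"`);
  const newId = `${backup.originId}-${mode}-${shortId}`;
  sandboxFiles.set(newId, { ...backup.files });
  return newId; // the real provider returns a WorkspaceRef pointing at this ID
}

sandboxFiles.set('session-1', { 'notes.txt': 'v1' });
snapshotSketch('session-1', 'backup-a');
const branchId = forkSketch('backup-a', 'branch', 'k3f9');
sandboxFiles.get(branchId)!['notes.txt'] = 'v2 on branch';

console.log(branchId); // "session-1-branch-k3f9"
console.log(sandboxFiles.get('session-1')?.['notes.txt']); // "v1"
```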

Code interpreter

Two modes:

  • Stateless (codeStateful: false, default): each runCode call is independent. The LLM sees workspace__<name>__run_code(language, code).
  • Stateful (codeStateful: true): persistent Jupyter-style contexts. The LLM sees create_code_context, run_in_code_context, delete_code_context tools too — variables persist across run_in_code_context calls within a context.

codeLanguages declares which languages the LLM may request. Default: ['python', 'javascript']. The container image must support whatever languages you declare.
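The effect of the two settings on what the LLM sees can be sketched as a pure function. The `workspace__<name>__run_code` name comes from the text above; applying the same prefix to the three context tools is an assumption, and both helpers are hypothetical.

```typescript
// Tool names exposed for the code module, given the workspace name and mode.
function codeToolNames(workspaceName: string, codeStateful: boolean): string[] {
  const prefix = `workspace__${workspaceName}__`;
  const tools = [`${prefix}run_code`];
  if (codeStateful) {
    // Assumed to share the prefix of run_code.
    tools.push(
      `${prefix}create_code_context`,
      `${prefix}run_in_code_context`,
      `${prefix}delete_code_context`,
    );
  }
  return tools;
}

// Whether a language requested by the LLM is among the declared ones.
function isLanguageAllowed(
  language: string,
  codeLanguages: readonly string[] = ['python', 'javascript'],
): boolean {
  return codeLanguages.includes(language);
}

console.log(codeToolNames('sandbox', false)); // [ 'workspace__sandbox__run_code' ]
console.log(codeToolNames('sandbox', true).length); // 4
console.log(isLanguageAllowed('ruby')); // false
```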

Auto-injected tools

All four module surfaces:

Limitations

  • Workflows runtime not supported. Workspaces require the DO runtime (createAgentServer).
  • Cross-DO sharing not supported. Sandboxes are session-scoped; one session = one sandbox.
  • Reserved modules absent. Desktop, Git, Net are reserved in core types but not implemented in v1. The library supports them; integration is deferred.

Source

Released under the MIT License.