Cloudflare Sandbox Workspace

The CloudflareSandboxWorkspace is the full-featured Cloudflare provider — a real Linux container (Workers Container, Firecracker microVM) backing a complete fs + shell + code + snapshot workspace surface. Use this when your agent needs to execute untrusted code, run shell commands, or snapshot state.

When to use

  • Agents that need to execute code (Python, JavaScript) the LLM produces.
  • Agents that need a real shell (grep, find, git, npm install).
  • Agents that need to snapshot state for branch/restore patterns.
  • Any production workload where untrusted input might end up in shell commands or code.

If you only need durable file storage (no shell, no code), use the lighter Cloudflare Filestore instead.

Capabilities supported

| Capability | Supported |
| --- | --- |
| fs | ✅ |
| shell | ✅ (with real-time stdout/stderr streaming) |
| code | ✅ (Python, JavaScript; optional persistent contexts) |
| snapshot | ✅ (R2-backed; restore creates NEW sandbox) |
| script | ✅ (opt-in; requires the loader option; see Script tier) |

All four core capabilities are supported, so capability-mismatch errors are unlikely on this provider — but the same WorkspaceFailedError at session start applies if your config declares a capability the provider hasn't been configured for (e.g. snapshot without backupR2Binding, or script without loader). See the error-model table on the workspaces overview.

A fifth capability, script (the lightweight Worker-Loader isolate runner), is available as an opt-in dual-tier alongside code — see Script tier (dual-tier) below.

Optional peer dependency

@cloudflare/sandbox is heavy (Cloudflare Containers + sandbox runtime). It's an OPTIONAL peer dep on @helix-agents/runtime-cloudflare — users who only want the filestore workspace pay zero install cost.

bash
npm install @cloudflare/sandbox@0.10.3

The version is pinned EXACTLY (no caret) because, although Cloudflare Containers + Sandboxes reached GA as of 0.10.x, the API is still moving across 0.x releases. Bump consciously when you need a new version.

Subpath import (4.0.0+)

Sandbox classes (CloudflareSandboxWorkspaceProvider, etc.) are imported from the /sandbox subpath rather than the main package barrel:

typescript
import { CloudflareSandboxWorkspaceProvider } from '@helix-agents/runtime-cloudflare/sandbox';

This isolates @cloudflare/sandbox types from filestore-only consumers' type resolution. See Upgrading → migrating to 4.0.0.

Wrangler setup

The sandbox lives in its OWN Durable Object (separate from the agent DO). You declare the Sandbox DO + the container binding + the agent DO + (optionally) an R2 bucket for backups.

toml
# Agent DO (your AgentServer subclass)
[[durable_objects.bindings]]
name = "AGENTS"
class_name = "MyAgentServer"

# Sandbox DO (re-exported from @cloudflare/sandbox)
[[durable_objects.bindings]]
name = "SANDBOX"
class_name = "Sandbox"

# Container binding (Workers Container)
[[containers]]
class_name = "Sandbox"
image = "./Dockerfile"        # User-supplied; copy from @cloudflare/sandbox repo
max_instances = 5

[[migrations]]
tag = "v1"
new_sqlite_classes = ["MyAgentServer", "Sandbox"]

# Optional: R2 bucket for snapshot/restore
[[r2_buckets]]
binding = "BACKUPS"
bucket_name = "my-sandbox-backups"

The Dockerfile must be supplied by you: copy the reference from the @cloudflare/sandbox repo, which provides several variants (default, python, opencode, desktop).

Worker re-export

Wrangler needs the Sandbox class to be exported from your Worker entry:

typescript
// worker.ts
export { Sandbox } from '@cloudflare/sandbox';
export { MyAgentServer } from './my-agent-server.js';

If you want preview URLs from inside the sandbox, also call proxyToSandbox in your fetch handler — see the Cloudflare Sandbox SDK docs for details.

Provider config

typescript
interface CloudflareSandboxWorkspaceConfig {
  kind: 'cloudflare-sandbox';
  /**
   * Override the sandbox ID. Defaults to the session ID.
   *
   * Setting `id` to a non-default value causes EVERY session opening this
   * workspace to attach to the SAME container — silent cross-session
   * container sharing (session A's writes are visible to session B's reads).
   * To opt in, also set `shareAcrossSessions: true`; otherwise `open()`
   * throws. See "Cross-session sharing" below.
   */
  id?: string;
  /**
   * Opt-in acknowledgement that an explicit `id` is intentionally shared
   * across sessions. Default `false`. When `id` is set to a non-default
   * value and this flag is not `true`, `open()` throws.
   */
  shareAcrossSessions?: boolean;
  /** R2 binding name for backups. Required if capabilities.snapshot is true. */
  backupR2Binding?: string;
  /** Hostname for preview URLs (reserved; not surfaced in v1 modules). */
  hostname?: string;
  /** When true, close() calls sandbox.destroy(). Default: false (relies on sleepAfter). */
  destroyOnClose?: boolean;
  /**
   * Idle timeout. When set, forwarded to @cloudflare/sandbox's sleepAfter.
   * When unset, the provider does NOT pass the option through and the bundled
   * @cloudflare/sandbox SDK applies its own default (currently 10 minutes for
   * the bundled version — check the sandbox SDK release notes if the precise
   * default matters to you).
   */
  sleepAfter?: string | number;
  /** Working directory inside the container. Default: '/workspace'. */
  workspaceDir?: string;
  /** Directory snapshot() archives. Defaults to workspaceDir. */
  snapshotDir?: string;
  /** Env vars forwarded into the container. */
  envVars?: Record<string, string>;
  /** Languages exposed by the code interpreter. Default: ['python', 'javascript']. */
  codeLanguages?: readonly string[];
  /** Whether the code interpreter supports persistent contexts. Default: false. */
  codeStateful?: boolean;
}

Bucket mounts

Mount R2 or any S3-compatible bucket as a live directory inside the sandbox via the bucketMounts config. Mounts are applied at open() and persist until the sandbox is destroyed; they are not persisted on the workspace ref (keeping S3 credentials out of the state store).

typescript
// in the cloudflare-sandbox workspace config:
bucketMounts: [
  // R2 binding mount (binding declared in wrangler.jsonc; credential-less):
  { bucket: 'MY_BUCKET', mountPath: '/data' },
  // S3-compatible remote mount, read-only, mounting only a subdirectory of
  // the bucket via `prefix`:
  {
    bucket: 'my-bucket',
    mountPath: '/archive',
    prefix: '/2024', // mount only the bucket's /2024 subdirectory at /archive
    endpoint: 'https://s3.us-west-2.amazonaws.com',
    credentials: { accessKeyId: env.AWS_ACCESS_KEY_ID, secretAccessKey: env.AWS_SECRET_ACCESS_KEY },
    readOnly: true,
  },
];

Each BucketMountConfig accepts:

  • bucket (required) — the R2 binding name (production) or S3-compatible bucket name (with endpoint).
  • mountPath (required) — the absolute path inside the sandbox to mount at; must start with / and be unique across mounts.
  • prefix (optional) — a subdirectory within the bucket to mount (so only that subtree appears at mountPath); must start with /.
  • endpoint (optional) — S3-compatible endpoint URL; omit for an R2-binding mount.
  • credentials (optional) — { accessKeyId, secretAccessKey } for a remote (S3) endpoint; omit for R2-binding mounts.
  • readOnly (optional) — mount read-only. Default false.

Validation at open(): mountPath must be absolute and unique across mounts; prefix (a subdirectory within the bucket) must start with /; bucket must be non-empty; and credentials require an endpoint (an R2 binding mount uses neither). A mount whose bucket/endpoint is misconfigured fails fast at open().
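
The validation rules above can be sketched as a standalone check. This is a minimal illustration, not the provider's actual code: the `validateBucketMounts` helper, its error strings, and the local `BucketMountConfig` interface are assumptions made for the example.

```typescript
// Local stand-in for the BucketMountConfig shape described above.
interface BucketMountConfig {
  bucket: string;
  mountPath: string;
  prefix?: string;
  endpoint?: string;
  credentials?: { accessKeyId: string; secretAccessKey: string };
  readOnly?: boolean;
}

// Hypothetical helper: collects every problem instead of throwing on the
// first one, so a caller can report all misconfigured mounts at once.
function validateBucketMounts(mounts: readonly BucketMountConfig[]): string[] {
  const problems: string[] = [];
  const seen = new Set<string>();
  mounts.forEach((m, i) => {
    if (m.bucket.trim() === '') {
      problems.push(`mount ${i}: bucket must be non-empty`);
    }
    if (!m.mountPath.startsWith('/')) {
      problems.push(`mount ${i}: mountPath must be absolute`);
    }
    if (seen.has(m.mountPath)) {
      problems.push(`mount ${i}: duplicate mountPath ${m.mountPath}`);
    }
    seen.add(m.mountPath);
    if (m.prefix !== undefined && !m.prefix.startsWith('/')) {
      problems.push(`mount ${i}: prefix must start with /`);
    }
    if (m.credentials !== undefined && m.endpoint === undefined) {
      problems.push(`mount ${i}: credentials require an endpoint`);
    }
  });
  return problems;
}
```

Running a check like this in your own config-building code surfaces mistakes before a session ever reaches open().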

endpoint is operator/developer-trusted config: do NOT derive it from agent or end-user input. It points s3fs at an arbitrary host, so an attacker-controlled value is an SSRF surface. Note also that local-dev R2-binding sync (wrangler dev / localBucket) is not surfaced by bucketMounts; an R2-binding mount (no endpoint) always uses the production credential-less egress path.

Operational notes:

  • An R2-binding mount requires the bucket to be declared as an r2_buckets binding in your Wrangler config ([[r2_buckets]] in wrangler.toml, "r2_buckets" in wrangler.jsonc); it is resolved DO-side by name. This is SEPARATE from the provider's backupBuckets map (which is only for snapshot backups): bucketMounts bindings are NOT passed through backupBuckets.
  • Mounts are applied at open() and persist with the container until it is destroyed. They are NOT re-applied on resolve(). If the container is recreated (idle GC past sleepAfter, eviction, or destroyOnClose) and a session resumes via resolve(), the mounts are gone and not re-established until a fresh open() — reads under the mount path will fail. (Same lifecycle as envVars.)
  • Mounts are NOT carried onto a restored/branched sandbox (snapshot restore()/branch() create a new container); re-declare bucketMounts if the restored workspace needs them.
  • Mounts apply sequentially; a mid-list failure leaves earlier mounts applied, so a retried open() may encounter already-mounted paths.
  • Keep snapshotDir disjoint from any mount path: if a bucket is mounted inside snapshotDir, createBackup will archive the entire mounted bucket into the snapshot (large/slow/wrong).
  • A mount whose mountPath is outside workspaceDir is reachable only via ws.shell, not ws.fs (which scope-checks to workspaceDir).
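
Two of the notes above (keeping snapshotDir disjoint from mounts, and mounts outside workspaceDir being shell-only) come down to a path-containment test. The sketch below shows one way to pre-check a config; the helper names are hypothetical, and the comparison is purely lexical on absolute POSIX paths (no symlink resolution).

```typescript
// Lexical containment check on absolute POSIX paths.
function isInside(child: string, parent: string): boolean {
  const strip = (p: string): string => (p.length > 1 ? p.replace(/\/+$/, '') : p);
  const c = strip(child);
  const p = strip(parent);
  return c === p || p === '/' || c.startsWith(p + '/');
}

// Hypothetical pre-flight: mounts that overlap snapshotDir in either
// direction and would therefore pull bucket contents into a snapshot.
function mountsOverlappingSnapshotDir(
  snapshotDir: string,
  mountPaths: readonly string[],
): string[] {
  return mountPaths.filter(
    (m) => isInside(m, snapshotDir) || isInside(snapshotDir, m),
  );
}

// Mounts that ws.fs cannot reach because they sit outside workspaceDir.
function mountsOutsideWorkspace(
  workspaceDir: string,
  mountPaths: readonly string[],
): string[] {
  return mountPaths.filter((m) => !isInside(m, workspaceDir));
}
```

With workspaceDir and snapshotDir both at /workspace, a mount at /workspace/cache is flagged by the first check and a mount at /data by the second.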

Provider wiring

typescript
import {
  AgentRegistry,
  createAgentServer,
} from '@helix-agents/runtime-cloudflare';
import { CloudflareSandboxWorkspaceProvider } from '@helix-agents/runtime-cloudflare/sandbox';
import type { WorkspaceProvider } from '@helix-agents/core';
import type { Sandbox } from '@cloudflare/sandbox';
export { Sandbox } from '@cloudflare/sandbox';

interface Env {
  AGENTS: DurableObjectNamespace;
  SANDBOX: DurableObjectNamespace<Sandbox>;
  BACKUPS?: R2Bucket;
  OPENAI_API_KEY: string;
}

export const MyAgentServer = createAgentServer<Env>({
  llmAdapter: (env) => /* ... */,
  agents: registry,
  workspaceProviders: (env) =>
    new Map<string, WorkspaceProvider>([
      [
        'cloudflare-sandbox',
        new CloudflareSandboxWorkspaceProvider({
          namespace: env.SANDBOX,
          backupBuckets: env.BACKUPS ? { BACKUPS: env.BACKUPS } : undefined,
        }),
      ],
    ]),
});

The provider takes { namespace, backupBuckets?, shellConstraints?, maxGlobalConcurrentOpens?, loader? }:

  • namespace points at the Sandbox DO.
  • backupBuckets is a name → bucket map used to resolve config.backupR2Binding at open() time.
  • shellConstraints carries the SAME allowlist + maxDuration policy as the agent's capabilities.shell config (round-4 A2). This is defense-in-depth: direct ws.shell.run() calls from custom user tools honor the same policy as the auto-injected workspace_run tool.
  • maxGlobalConcurrentOpens (round-5 B2) bounds concurrent open() and resolve() calls into THIS provider across ALL sessions sharing it.
  • loader is an optional worker_loaders binding that enables the lightweight script tier alongside the container (see Script tier (dual-tier)).

Tuning concurrency (round-5 B2)

Two layered semaphores bound concurrent opens. The per-process knob is the one that matters when many sessions share a provider instance.

  • Per-session: WorkspaceRegistryDeps.maxConcurrentOpens (or JSAgentExecutor / DurableObjectAgentConfig.workspaceMaxConcurrentOpens). Bounds opens for ONE session. With a single workspace per agent this is effectively a single-open guard, retained for parity with the prior interface.
  • Per-process: CloudflareSandboxWorkspaceProviderOptions.maxGlobalConcurrentOpens. Bounds opens ACROSS all sessions sharing the provider instance. Primarily relevant when many sessions in one orchestrator share a single Sandbox DO binding; set it to match the binding's max_instances (5 in the Wrangler example above) to prevent cascading failures from quota overruns. When each session has its own Sandbox binding (the common pattern), this knob is typically unnecessary.

Leave the per-session knob at the default Infinity. Set the per-process knob only when many sessions in one orchestrator share a CF Sandbox binding.

Cross-session sharing (round-4 A6)

By default each session opens its own sandbox (sandbox ID = sessionId). Setting config.id to a fixed value means EVERY session opening this workspace attaches to the SAME container — fine for an admin tool that wants persistent state across users, but silent cross-session data sharing in any deployment with multiple sessions.

The shareAcrossSessions: true flag is required to opt in:

typescript
// REJECTED at open() — shared id without the explicit flag.
{ kind: 'cloudflare-sandbox', id: 'shared-sandbox' }

// ACCEPTED — explicit acknowledgement that this is intentional.
{ kind: 'cloudflare-sandbox', id: 'shared-sandbox', shareAcrossSessions: true }

When the flag is set, open() ALSO emits a logger.warn('cloudflare-sandbox: shared workspace id detected', ...) for the audit trail. Before this check existed, cross-session sharing was silent: code review couldn't catch the misconfiguration because it was indistinguishable from the default. With the explicit opt-in, a shared id without the flag fails at open(), so an accidental tenancy bleed surfaces as an error instead of going unnoticed.

The same pattern applies to CloudflareFileStoreWorkspaceConfig.namespace (see cloudflare-filestore.md).
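
The opt-in rule for shared sandbox IDs can be sketched as a small resolver. This is an illustration under assumptions: `resolveSandboxId`, the plain `Error`, and its message text are hypothetical, and only the warn message is taken from the text above.

```typescript
interface SandboxIdConfig {
  id?: string;
  shareAcrossSessions?: boolean;
}

// Hypothetical resolver: returns the sandbox ID to attach to, or throws
// when an explicit id is set without the acknowledgement flag.
function resolveSandboxId(
  config: SandboxIdConfig,
  sessionId: string,
  warn: (message: string) => void,
): string {
  const id = config.id ?? sessionId;
  if (id === sessionId) {
    return id; // default: one sandbox per session
  }
  if (config.shareAcrossSessions !== true) {
    throw new Error(
      `cloudflare-sandbox: explicit id '${id}' requires shareAcrossSessions: true`,
    );
  }
  warn('cloudflare-sandbox: shared workspace id detected');
  return id;
}
```

The point of the design is that the throwing branch is the default: sharing only happens when the config carries both the id and the flag.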

Lifecycle

  • open() — calls getSandbox(namespace, sandboxId) to obtain a stub. Configures sleepAfter and envVars if specified. Constructs module adapters conditionally on the agent's declared capabilities: it builds only the modules the agent asked for (fs only when capabilities.fs is truthy, and likewise for shell / code / snapshot). When open() is called with NO declared capabilities (permitted by the WorkspaceProvider interface), it falls back to constructing all of them. A declared snapshot capability without a backupR2Binding still lands a snapshot module that throws a clear error at snapshot()-time. The optional script tier (Worker-Loader isolate runner) is added only when the provider was constructed with a loader binding AND capabilities.script is declared — an explicit script declaration with no loader throws at open().
  • resolve() — re-attaches via getSandbox(namespace, sandboxId). Sandbox DO is INDEPENDENT of the agent DO, so sessions survive agent-DO hibernation cleanly: agent wakes, calls resolve(), gets a stub to the same persistent sandbox.
  • close() — by default a no-op (the sandbox idle-shuts-down via sleepAfter). With destroyOnClose: true, calls sandbox.destroy() to permanently tear the container down (one-shot agent run pattern).
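
The capability-driven module selection in open() can be sketched as follows. This is a simplified reading of the rules above, not the provider's implementation: `selectModules` is a hypothetical name, and including the script tier in the construct-all fallback when a loader is present is an assumption based on the fail-fast note later on this page.

```typescript
type Capability = 'fs' | 'shell' | 'code' | 'snapshot' | 'script';
type Declared = Partial<Record<Capability, unknown>>;

// Hypothetical selector: which module adapters open() would construct.
function selectModules(
  declared: Declared | undefined,
  hasLoader: boolean,
): Capability[] {
  const core: Capability[] = ['fs', 'shell', 'code', 'snapshot'];
  if (declared === undefined) {
    // Construct-all fallback. Assumption: script joins only when a
    // loader is wired, and is silently omitted otherwise.
    return hasLoader ? [...core, 'script'] : core;
  }
  const d: Declared = declared;
  if (d.script && !hasLoader) {
    // An explicit script declaration without a loader fails fast.
    throw new Error(
      "capability 'script' declared but no Worker Loader binding configured",
    );
  }
  const all: Capability[] = [...core, 'script'];
  return all.filter((c) => Boolean(d[c]));
}
```

Note that snapshot is selected whenever it is declared; the missing-backupR2Binding case is deliberately not checked here because, per the text above, that error is deferred to snapshot() time.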

Cost notes

  • Container cold start is ~2–3 seconds on first request (Firecracker microVM boot).
  • sleepAfter controls when the container suspends after idle. The framework does NOT set a default — when sleepAfter is unset, the bundled @cloudflare/sandbox SDK applies its own default (currently ~10 minutes for the bundled version; check the sandbox SDK release notes if the precise value matters). Lower values save money; higher values reduce cold-start latency.
  • destroyOnClose: true kills the container at session end. Use for one-shot workloads (agent runs once, never resumed). Default false (preserve container for fast resume).
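
As a rough illustration of the two cost postures described above, here are two config fragments. The `SandboxCostConfig` interface is a local subset of CloudflareSandboxWorkspaceConfig written for this example, and the '5m' duration string is an assumed format; check what your pinned @cloudflare/sandbox version accepts for sleepAfter.

```typescript
// Local subset of CloudflareSandboxWorkspaceConfig, for illustration only.
interface SandboxCostConfig {
  kind: 'cloudflare-sandbox';
  sleepAfter?: string | number;
  destroyOnClose?: boolean;
}

// One-shot run: tear the container down when the session closes.
const oneShot: SandboxCostConfig = {
  kind: 'cloudflare-sandbox',
  destroyOnClose: true,
};

// Resumable session: keep the container for fast resume, but suspend it
// sooner than the SDK default. '5m' is an assumed duration format.
const resumable: SandboxCostConfig = {
  kind: 'cloudflare-sandbox',
  sleepAfter: '5m',
};
```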

Concurrency: max_instances interaction (round-4 cluster C)

The max_instances knob on the [[containers]] Wrangler binding caps how many container instances a single Sandbox DO can hold concurrently. With max_instances = 5, an agent that declares 100 workspaces and runs workspaceOpenStrategy: 'eager' will hit cascading failures — the registry's openAll() fires 100 concurrent opens via Promise.all, but only 5 can land at once.

The fix: set workspaceMaxConcurrentOpens on the executor to match the binding's max_instances:

typescript
export const MyAgentServer = createAgentServer<Env>({
  workspaceProviders: (env, ctx) =>
    new Map([
      [
        /* ... */
      ],
    ]),
  workspaceMaxConcurrentOpens: 5, // match max_instances
});

The registry then funnels opens through a semaphore — at most N in flight at any moment. This keeps many sessions sharing one Sandbox binding from cascade-failing the binding's quota.
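
To make the "at most N in flight" behavior concrete, here is a generic counting semaphore. It is a sketch of the technique only; the registry's real implementation is not shown on this page, and `Semaphore` and `openAllBounded` are hypothetical names.

```typescript
// Minimal counting semaphore: at most `limit` tasks run at once.
class Semaphore {
  private active = 0;
  private readonly waiters: Array<() => void> = [];
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.limit) {
      // Wait until a finishing task hands its slot over.
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await task();
    } finally {
      const next = this.waiters.shift();
      if (next) {
        next(); // slot passes directly to the next waiter
      } else {
        this.active--;
      }
    }
  }
}

// Opens every id, but never more than `limit` concurrently.
async function openAllBounded(
  ids: readonly string[],
  limit: number,
  open: (id: string) => Promise<void>,
): Promise<void> {
  const semaphore = new Semaphore(limit);
  await Promise.all(ids.map((id) => semaphore.run(() => open(id))));
}
```

Compared with an unbounded Promise.all, the only change is the semaphore.run wrapper: callers still await one promise, but the opens reach the binding in waves of at most `limit`.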

Snapshot semantics

snapshot() calls sandbox.createBackup({ dir: snapshotDir }) which archives the directory to R2.

restore(ref) and branch(ref) create a NEW sandbox ID ({originId}-restored-{shortId} or -branch-), obtain a stub to that new sandbox, and call restoreBackup on it. Both return a fresh WorkspaceRef pointing at the new sandbox — the Snapshotter module treats snapshots as forks rather than mutations.

The original sandbox is unchanged after a restore/branch. See the Snapshotter module for the full semantics.
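
The fork-style naming can be illustrated with a small ID builder. This is a sketch: `forkSandboxId` is a hypothetical helper, and because this page does not say how the provider generates the shortId part, the example takes it as an argument rather than inventing a scheme.

```typescript
type ForkKind = 'restored' | 'branch';

interface WorkspaceRefSketch {
  sandboxId: string;
}

// Hypothetical ID builder following the {originId}-restored-{shortId}
// and {originId}-branch-{shortId} patterns described above.
function forkSandboxId(originId: string, kind: ForkKind, shortId: string): string {
  return `${originId}-${kind}-${shortId}`;
}

// A restore or branch yields a NEW ref; the origin ref is left untouched.
function forkRef(
  origin: WorkspaceRefSketch,
  kind: ForkKind,
  shortId: string,
): WorkspaceRefSketch {
  return { sandboxId: forkSandboxId(origin.sandboxId, kind, shortId) };
}
```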

backupR2Binding is required when declaring capabilities.snapshot: true. Without it, snapshot() throws at call time.

Default TTL: createBackup applies a default TTL (~3 days), after which the SDK garbage-collects the backup. Snapshots therefore expire silently: don't rely on an old snapshot for long-term recovery, and set or track the TTL accordingly.

Round-7 — list() and delete()

list() enumerates snapshots owned by the current session (in-process tracking) by default; pass allowCrossSession: true to prefix-scan the entire R2 backup namespace. The auto-injected workspace_list_snapshots tool exposes this to the LLM.

delete() removes both R2 keys (backups/<id>/data.sqsh and backups/<id>/meta.json) for the referenced backup. Idempotent — re-deleting an already-removed ref resolves successfully. The auto-injected workspace_delete_snapshot tool exposes this to the LLM.

@cloudflare/sandbox@0.10.3 does NOT expose native list/delete on ISandbox, so the snapshotter goes directly to the configured backupR2Binding using the SDK's documented backup-key layout. If the SDK ships native methods we'll switch to them and drop the R2 binding coupling.
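
The key layout and idempotent delete described above can be sketched like this. The `backupKeys` and `deleteBackup` helpers and the `DeletableBucket` interface are assumptions for the example; the interface stands in for the one R2 bucket method the sketch needs.

```typescript
// The two R2 objects that make up one backup, per the layout above.
function backupKeys(backupId: string): [string, string] {
  return [`backups/${backupId}/data.sqsh`, `backups/${backupId}/meta.json`];
}

// Minimal stand-in for the part of an R2 bucket this sketch uses.
interface DeletableBucket {
  delete(key: string): Promise<void>;
}

// Removes both objects. Deleting a key that is already gone is treated
// as success, which is what makes a repeated delete idempotent.
async function deleteBackup(bucket: DeletableBucket, backupId: string): Promise<void> {
  await Promise.all(backupKeys(backupId).map((key) => bucket.delete(key)));
}
```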

Why these matter for cost: pre-round-7 there was no framework-side way to prune accumulated snapshots — operators relied on out-of-band R2 lifecycle rules. With list_snapshots + delete_snapshot available to the LLM, agents that snapshot once per checkpoint can self-prune; operators with stricter retention rules can complement R2 lifecycle policies with framework-driven pruning. See the Snapshotter module — Pruning section for patterns and caveats.

Code interpreter

Two modes:

  • Stateless (codeStateful: false, default): each runCode call is independent. The LLM sees workspace_run_code(language, code).
  • Stateful (codeStateful: true): persistent Jupyter-style contexts. The LLM sees workspace_create_code_context, workspace_run_in_code_context, workspace_delete_code_context tools too — variables persist across workspace_run_in_code_context calls within a context.

codeLanguages declares which languages the LLM may request. Default: ['python', 'javascript']. The container image must support whatever languages you declare.

Script tier (dual-tier)

The sandbox provider can ALSO expose the lightweight, ephemeral JavaScript isolate runner (script) alongside the full container code interpreter — a dual-tier setup. Declaring both capabilities.code and capabilities.script gives the LLM BOTH tools and it picks per task:

  • workspace_run_code — the full container: Python/JS, real toolchain, optional persistent contexts, durable fs.
  • workspace_script — a fast Worker-Loader isolate: JS-only, stateless, no durable fs, ~100x cheaper than booting the container.

The script tier is the SAME runner as the standalone cloudflare-dynamic-worker provider — here it composes with the container tiers instead of being the only capability.

Enabling the script tier

Pass a Worker Loader binding (env.LOADER) as the provider's loader option:

typescript
import { CloudflareSandboxWorkspaceProvider } from '@helix-agents/runtime-cloudflare/sandbox';

new CloudflareSandboxWorkspaceProvider({
  namespace: env.SANDBOX,
  backupBuckets: env.BACKUPS ? { BACKUPS: env.BACKUPS } : undefined,
  loader: env.LOADER, // worker_loaders binding — enables the `script` tier
});

Then declare script on the agent's workspace alongside code:

typescript
workspace: {
  provider: { kind: 'cloudflare-sandbox' },
  capabilities: {
    code: { languages: ['python', 'javascript'] }, // → workspace_run_code (container)
    script: { network: 'off', maxDurationMs: 5000 }, // → workspace_script (isolate)
  },
},

The worker_loaders binding is a separate wrangler declaration from the Sandbox DO + container:

jsonc
// wrangler.jsonc — in addition to the durable_objects / containers bindings above
{
  "worker_loaders": [{ "binding": "LOADER" }],
}

Fail-fast at open() without a loader

If an agent declares capabilities.script but the provider was constructed WITHOUT a loader, open() throws a WorkspaceFailedError immediately:

CloudflareSandboxWorkspaceProvider: capability 'script' declared but no Worker
Loader binding configured (set the provider's 'loader' option).

This is a deliberate fail-fast: unlike snapshot (whose module is constructed and only throws at snapshot()-time), a loader-less script runner can never run, so the provider rejects the misconfiguration up front. (The construct-all path — when declaredCapabilities is undefined — silently omits the script tier when no loader is wired; only an EXPLICIT script declaration without a loader throws.)

Round-trip across resolve()

The script config (network / maxDurationMs) is persisted on the workspace ref as the object-form capabilities.script, so it survives resolve() after a DO hibernation. compatibilityDate is NOT persisted — it is reconstructed from the shared DEFAULT_SCRIPT_COMPAT_DATE default. The script tier on resume also requires the provider's loader binding; without it the tier is silently omitted (the resumed workspace simply lacks ws.script).
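
The round-trip can be sketched as a persist/rehydrate pair. Everything here is illustrative: the function names and types are hypothetical, and the value given to DEFAULT_SCRIPT_COMPAT_DATE is a placeholder for this sketch, not the framework's actual default.

```typescript
// Placeholder value for this sketch only; the real default lives in the framework.
const DEFAULT_SCRIPT_COMPAT_DATE = '2025-01-01';

interface ScriptConfig {
  network: 'off' | 'allow';
  maxDurationMs: number;
  compatibilityDate: string;
}

// Only network and maxDurationMs are written to the workspace ref.
type PersistedScript = Pick<ScriptConfig, 'network' | 'maxDurationMs'>;

function persistScript(config: ScriptConfig): PersistedScript {
  return { network: config.network, maxDurationMs: config.maxDurationMs };
}

// On resolve(): rebuild the config, or omit the tier when no loader is wired.
function rehydrateScript(
  persisted: PersistedScript | undefined,
  hasLoader: boolean,
): ScriptConfig | undefined {
  if (persisted === undefined || !hasLoader) {
    return undefined; // the resumed workspace simply lacks ws.script
  }
  return { ...persisted, compatibilityDate: DEFAULT_SCRIPT_COMPAT_DATE };
}
```

One consequence worth noticing: a custom compatibilityDate supplied at open() would not survive resume in this model, because only the default is available on the resolve() side.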

v1 limits (same as the standalone provider)

JS-only (WebAssembly available in-isolate), stateless (fresh isolate per call, no state carry), ephemeral /tmp scratch (no durable fs), and network off by default (opt-in via script: { network: 'allow' }). For the full capability reference see the Cloudflare Dynamic Worker page.

Auto-injected tools

All four module surfaces, plus the opt-in script tier, are auto-injected as flat workspace_<op> tools:

  • fs: workspace_read_file, workspace_write_file, workspace_edit_file, workspace_ls, workspace_glob, workspace_grep, workspace_stat, workspace_mkdir, workspace_rm — see FileSystem module.
  • shell: workspace_run — see Shell module.
  • code: workspace_run_code, plus workspace_create_code_context / workspace_run_in_code_context / workspace_delete_code_context when codeStateful: true — see CodeInterpreter module.
  • snapshot: workspace_snapshot, workspace_restore, workspace_branch, workspace_list_snapshots, workspace_delete_snapshot — see Snapshotter module.
  • script (opt-in, requires loader): workspace_script — the fast Worker-Loader isolate runner, see Script tier (dual-tier) and the Cloudflare Dynamic Worker page.
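
For a quick view of which tools the LLM ends up seeing, the list above can be expressed as a lookup. The tool names come from this page; the `injectedTools` helper and its input shape are hypothetical.

```typescript
interface ToolInputs {
  fs?: boolean;
  shell?: boolean;
  code?: boolean;
  snapshot?: boolean;
  script?: boolean; // assumes the provider was constructed with a loader
  codeStateful?: boolean;
}

// Hypothetical helper: tool names exposed for a given capability set.
function injectedTools(c: ToolInputs): string[] {
  const tools: string[] = [];
  if (c.fs) {
    tools.push(
      'workspace_read_file', 'workspace_write_file', 'workspace_edit_file',
      'workspace_ls', 'workspace_glob', 'workspace_grep',
      'workspace_stat', 'workspace_mkdir', 'workspace_rm',
    );
  }
  if (c.shell) tools.push('workspace_run');
  if (c.code) {
    tools.push('workspace_run_code');
    if (c.codeStateful) {
      tools.push(
        'workspace_create_code_context',
        'workspace_run_in_code_context',
        'workspace_delete_code_context',
      );
    }
  }
  if (c.snapshot) {
    tools.push(
      'workspace_snapshot', 'workspace_restore', 'workspace_branch',
      'workspace_list_snapshots', 'workspace_delete_snapshot',
    );
  }
  if (c.script) tools.push('workspace_script');
  return tools;
}
```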

Observability

The provider accepts an optional Logger from @helix-agents/core so workspace-side events surface in your logging pipeline:

typescript
import { consoleLogger } from '@helix-agents/core';

new CloudflareSandboxWorkspaceProvider({
  namespace: env.SANDBOX,
  backupBuckets: env.BACKUPS ? { BACKUPS: env.BACKUPS } : undefined,
  logger: consoleLogger, // pino, winston, or any { info, warn, error } shape
});

Defaults to silent (noopLogger). The provider currently emits info/warn entries during sandbox lifecycle transitions and is wired so future security-boundary additions surface without an API change.

Using the workspace from a custom tool

typescript
import { defineTool } from '@helix-agents/core';
import { z } from 'zod';

const countLines = defineTool({
  name: 'count_lines',
  parameters: z.object({ path: z.string() }),
  execute: async (input, ctx) => {
    const ws = await ctx.workspaces!.get();
    if (!ws.code) throw new Error('workspace requires code capability');
    const result = await ws.code.runCode(
      'python',
      `print(sum(1 for _ in open(${JSON.stringify(input.path)})))`
    );
    return { exitCode: result.exitCode, outputs: result.outputs };
  },
});

See the shared pattern on the overview page: await on get() is required, and the ! non-null assertion on ctx.workspaces is appropriate when the agent declares a workspace.

File watching

The cloudflare-sandbox filesystem capability implements watch() (backed by the SDK's native inotify watcher, surfaced over the SDK's SSE event stream). It is a programmatic module method for custom tools — there is no auto-injected workspace_watch LLM tool, because a long-lived event stream does not fit the request/response tool model.

typescript
const unsubscribe = await ws.fs.watch('/workspace', async (event) => {
  // event.type is 'created' | 'modified' | 'deleted' | 'renamed'
  console.log(event.type, event.path);
});
// later: stop watching (tears down the underlying SDK stream)
unsubscribe();

The path is scoped to the workspace root like every other fs op. SDK move events (move_from/move_to) surface as renamed (without oldPath); the attrib event type is skipped.
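
The event translation described above might look like the mapping below. Only move_from, move_to, and attrib are named on this page; the 'create', 'modify', and 'delete' SDK-side names are assumptions for the sketch, as is the `mapSdkEvent` helper itself.

```typescript
type WatchEventType = 'created' | 'modified' | 'deleted' | 'renamed';

// Hypothetical mapper from SDK event names to workspace event types.
// Returns undefined for events that are skipped.
function mapSdkEvent(sdkType: string): WatchEventType | undefined {
  switch (sdkType) {
    case 'create': // assumed SDK name
      return 'created';
    case 'modify': // assumed SDK name
      return 'modified';
    case 'delete': // assumed SDK name
      return 'deleted';
    case 'move_from':
    case 'move_to':
      // Both halves of a move surface as 'renamed', without oldPath.
      return 'renamed';
    default:
      // 'attrib' (and anything unrecognized) is skipped.
      return undefined;
  }
}
```

Because both move halves map to the same type, a single rename can surface as two renamed events; a callback that cares about that should deduplicate on its side.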

Operational notes:

  • watch() uses the SDK defaults (recursive, with default excludes like .git/node_modules/.DS_Store); the watch(path, cb) signature exposes no options to change recursion/excludes.
  • Watchers are NOT auto-closed on workspace close() or session end. The caller MUST call the returned unsubscribe(), or the server-side inotify watcher + SSE stream leak until the container sleeps (inotify has kernel max_user_watches limits).

Inspecting a workspace

The container's filesystem is queryable via the shell, so the simplest path is a custom debug tool:

typescript
const inspect = defineTool({
  name: 'inspect_workspace',
  parameters: z.object({ path: z.string().default('/workspace') }),
  execute: async (input, ctx) => {
    const ws = await ctx.workspaces!.get();
    const result = await ws.shell!.run(`ls -la ${input.path}`);
    return { listing: new TextDecoder().decode(result.stdout) };
  },
});

There is no out-of-band path to peek at a running container's filesystem — the sandbox lives inside the Sandbox DO; ws.shell.run('ls /workspace') (or any other shell command) is the supported inspection surface.
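
Because the path in a debug tool like this is interpolated into a shell command, it is worth quoting it when the value can come from the LLM or a user. The sketch below uses standard POSIX single-quote escaping; `shellQuote` and `lsCommand` are hypothetical helpers, not part of the framework.

```typescript
// POSIX single-quote escaping: wrap the argument in '...' and replace
// each embedded ' with '\'' (close quote, escaped quote, reopen quote).
function shellQuote(arg: string): string {
  const escaped = arg.replace(/'/g, "'\\''");
  return "'" + escaped + "'";
}

// Builds the listing command; `--` stops ls from treating a path that
// starts with '-' as an option.
function lsCommand(path: string): string {
  return `ls -la -- ${shellQuote(path)}`;
}
```

Passing the result of lsCommand(input.path) to ws.shell.run keeps a value such as `/workspace; rm -rf /` as a single literal argument instead of two commands.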

Mid-run inspection (active sessions)

Inspecting an ACTIVE session needs care: the agent may be writing while you read.

  • Recommended for active sessions. The custom debug-tool path above (in-agent) — the read happens inside the same step the agent owns, so no race.
  • From operator code with binding access. If you have direct access to the Sandbox DO namespace (operator console, admin endpoint), you can call the underlying Sandbox DO directly:
    typescript
    const sandbox = getSandbox(env.SANDBOX, sandboxId);
    // exec() is the SDK's command-execution call; confirm it against the
    // @cloudflare/sandbox version you have pinned before relying on it.
    const result = await sandbox.exec('ls -la /workspace');
    Use a read-only command (ls, cat, grep) for safety — write operations would race the agent.
  • Container-state safety. Even read-only ops cause the container to wake from sleepAfter hibernation, briefly changing its state visible to the agent. For most workloads this is fine; for cost-sensitive flows where every wake matters, prefer the in-agent debug tool.
  • For after-completion inspection. Either approach is safe; the agent has stopped writing.

Capacity & performance

These are approximate ranges; benchmark for your workload.

| Dimension | Approximate range | Notes |
| --- | --- | --- |
| Container cold start | ~2–3 s | Firecracker microVM boot. First request after wake. |
| Warm fs/shell op latency | ~50 ms (tens of milliseconds) | Round-trip into the Sandbox DO + container. |
| Code interpreter runCode | Varies | Dominated by language runtime startup unless codeStateful: true keeps a context warm. |
| Concurrent containers per Sandbox DO | max_instances (Wrangler binding cap; commonly 5) | The hard upper bound. |
| Concurrent sessions per Sandbox DO | max_instances (one workspace per agent/session) | The constraint is max_instances, not the agent layer. |
| Snapshot size | R2-limited | createBackup archives snapshotDir; archive size depends on workload. |

max_instances is the real bound. When many sessions share one Sandbox DO binding, the binding's max_instances caps how many containers can co-exist. Coordinate with workspaceMaxConcurrentOpens (round-4 cluster C); see Concurrency: max_instances interaction above.

Path scoping

All workspace fs operations are scoped to workspaceDir (default /workspace). Round-4 cluster A enforces this scoping in the FileSystem adapter:

  • All FS methods require paths inside workspaceDir.
  • Out-of-scope paths throw WorkspaceFailedError("path X is outside workspace root Y").
  • Path normalization: .. segments and symlinks are resolved before the scope check.
  • Custom tools using ws.fs!.readFile() (etc.) get the same scoping — the adapter is the same instance.

The shell capability does NOT enforce path scoping (a shell command is the user's escape hatch). Combining shell: true with untrusted input means the sandbox boundary is your security boundary; the workspaceDir scope is the FS boundary, not the container boundary.
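
The fs scope check can be approximated with lexical path normalization, as sketched below. This is not the adapter's code: `normalizePosix` and `resolveInWorkspace` are hypothetical, a plain Error stands in for WorkspaceFailedError, and symlink resolution (which the adapter also performs) is out of reach for a purely lexical check.

```typescript
// Lexically normalize an absolute POSIX path: collapse '.', '..',
// and repeated slashes. Does NOT resolve symlinks.
function normalizePosix(path: string): string {
  const out: string[] = [];
  for (const segment of path.split('/')) {
    if (segment === '' || segment === '.') continue;
    if (segment === '..') out.pop();
    else out.push(segment);
  }
  return '/' + out.join('/');
}

// Resolves `requested` against the workspace root and rejects anything
// that lands outside it after normalization.
function resolveInWorkspace(workspaceDir: string, requested: string): string {
  const root = normalizePosix(workspaceDir);
  const joined = requested.startsWith('/') ? requested : `${root}/${requested}`;
  const resolved = normalizePosix(joined);
  const prefix = root === '/' ? '/' : root + '/';
  if (resolved !== root && !resolved.startsWith(prefix)) {
    throw new Error(`path ${resolved} is outside workspace root ${root}`);
  }
  return resolved;
}
```

Comparing against root plus a trailing slash matters: a bare startsWith('/workspace') would wrongly accept /workspacefoo.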

Restart behavior

When an agent DO restarts (deployment rollout, eviction, code reload), the workspace is re-attached lazily on the first agent operation that triggers provider.resolve() for each persisted ref. For an agent DO with N persisted sandbox refs, the first operation post-restart issues up to N parallel getSandbox(NS, sandboxId) RPCs.

Thundering-herd risk during rollout. A platform-wide deployment rollout simultaneously restarts many agent DOs; each DO's first operation initiates its own resolve burst. Multiplied across many DOs, this is a classic thundering-herd against the Sandbox DO namespace.

The per-resolve cost is meaningful here (potentially a Sandbox DO wake), so the herd amplitude matters.

Recommended mitigation.

  • Set workspaceMaxConcurrentOpens to the binding's max_instances (e.g. 5). The registry funnels resolves through a semaphore — at most N concurrent per DO. This caps the per-DO burst and gives the upstream Sandbox DO time to absorb each resolve.
  • Combine with WorkspaceMetrics to alert on resolve-latency spikes during rollout windows.

Filed as follow-up: registry-side jitter on the first lazy resolve after recovery to spread the per-DO burst across a few hundred ms — would smooth the rollout further without operator action.

Operator visibility into hibernated containers (round-5 D12)

The framework does NOT enumerate hibernated containers. registry.describe() only surfaces the workspace declared by an ACTIVE session — once a session ends and the registry closes, the underlying Sandbox DO may continue to hold a hibernated container (subject to sleepAfter) but the framework has no view of it.

Where to look. Use Cloudflare-side observability for hibernated container counts:

  • Cloudflare dashboard. The Workers Container view shows DO instance counts and per-instance state. Hibernated containers count toward your max_instances budget until they're destroyed.
  • DO RPC (advanced). If you have direct access to the Sandbox DO namespace, calling getSandbox(NS, sandboxId).status() (or whatever read-only RPC the SDK version exposes) returns the container's current state. Loop the namespace's known IDs to enumerate.
  • destroyOnClose: true is the only framework-side knob that ensures containers don't persist past session end — set it for one-shot agent runs.

Filed as known follow-up: a registry-level listHibernatedContainers() helper that surfaces orphaned containers. Until it lands, operator visibility into hibernated containers lives in the CF dashboard, not the framework.

Limitations

  • Workflows runtime not supported. Workspaces require the DO runtime (createAgentServer); the Workflows runtime now fails fast at agent registration when workspaces are declared. See the Workflows runtime page.
  • Cross-DO sharing not supported. Sandboxes are session-scoped; one session = one sandbox.
  • Reserved modules absent. Desktop, Git, Net are reserved in core types but not implemented in v1. The library supports them; integration is deferred.
  • Hibernated container enumeration not surfaced. See Operator visibility into hibernated containers above.

Released under the MIT License.