Building a Provider

This page is for developers writing their own WorkspaceProvider. If you're using one of the seven built-in providers, you don't need to read it.

When you'd build your own

The provider you need isn't in the built-in set (e.g., E2B, Modal, Daytona, your own Firecracker host).
You have a proprietary backing store and want to plug it into the workspace abstraction.
You're benchmarking a new platform.

The `WorkspaceProvider<TConfig>` contract

typescript

interface WorkspaceProvider<TConfig = unknown> {
  readonly providerId: string;
  // `declaredCapabilities` is optional: providers MAY ignore it and construct
  // all their modules, or read it to skip allocating modules nobody will use.
  open(
    config: TConfig,
    session: SessionRef,
    declaredCapabilities?: WorkspaceCapabilityFlags
  ): Promise<OpenedWorkspace>;
  resolve(ref: WorkspaceRef): Promise<Workspace>;
}

interface OpenedWorkspace {
  readonly ws: Workspace;
  readonly ref: WorkspaceRef;
}

interface WorkspaceRef {
  readonly providerId: string;
  readonly ref: unknown; // your serializable payload
  readonly capabilities: WorkspaceCapabilityFlags;
  readonly schemaVersion?: number; // version of the `ref` payload; optional so legacy refs still type-check (see migration contract below)
}

Three required pieces:

providerId — a string the registry uses to find your provider. The discriminator on WorkspaceConfig.provider.kind matches this.
open(config, session) — called the first time the agent uses a workspace tool. Construct the live Workspace object + a serializable WorkspaceRef for crash recovery. Both are returned.
resolve(ref) — called after a runtime boundary (DO hibernation, Temporal replay, executor restart) to reconstruct the workspace from the persisted ref.

The `WorkspaceConfig` discriminator

Your config type MUST have a kind field (string literal type) — the registry uses it to find your provider:

typescript

interface MyProviderConfig {
  readonly kind: 'my-provider';
  // ... your other config fields
}

The user declares it in defineAgent:

typescript

workspace: {
  provider: { kind: 'my-provider', /* ... your fields ... */ },
  capabilities: { fs: true },
},
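The pairing between a config and its provider can be pictured as a map lookup keyed on `kind`. The sketch below is an illustration only: `ProviderLike`, `KindedConfig`, and `findProvider` are hypothetical names, not framework exports, and the real registry may do more than this.

```typescript
// Hypothetical stand-ins for illustration; not the framework's types.
interface ProviderLike {
  readonly providerId: string;
}

interface KindedConfig {
  readonly kind: string;
}

// Look up the provider whose providerId equals the config's `kind`.
function findProvider(
  providers: ReadonlyMap<string, ProviderLike>,
  config: KindedConfig
): ProviderLike {
  const provider = providers.get(config.kind);
  if (!provider) {
    const known = Array.from(providers.keys()).join(', ');
    throw new Error(`No provider registered for kind '${config.kind}' (known: ${known})`);
  }
  return provider;
}

const registered = new Map<string, ProviderLike>([
  ['my-provider', { providerId: 'my-provider' }],
]);
const found = findProvider(registered, { kind: 'my-provider' });
```

A mismatch between `kind` and `providerId` is the most likely reason a custom provider is never selected, so keeping both as the same string literal is the safe default.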

The lifecycle

1. User declares the workspace in defineAgent({...}).
2. Framework calls executor.execute(agent, ...).
3. Agent's first tool call hits the workspace registry.
4. Registry sees no live workspace; calls provider.open(config, session).
5. You return { ws, ref }. Live ws goes in the registry; ref is persisted.
6. Subsequent tool calls reuse the cached live ws.

[runtime boundary: DO hibernation, replay, restart]

7. Framework calls executor.resume(...) on a fresh runtime.
8. Registry sees no live workspace; calls provider.resolve(ref).
9. You reconstruct the live Workspace from the ref payload + return it.
10. Tool calls resume normally.

[session end]

11. Framework calls ws.close().

Your job: implement steps 5 and 9 (and step 11 if your workspace needs cleanup).
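The numbered steps can be modelled as a small open-or-resolve cache. This is a toy sketch under stated assumptions: `ToyRegistry`, `Ws`, `Ref`, and `ProviderLike` are hypothetical stand-ins, and the real registry also tracks failed and evicted states that are left out here.

```typescript
// Simplified stand-ins for this sketch; not the framework's types.
interface Ws {
  readonly id: string;
}
interface Ref {
  readonly providerId: string;
  readonly ref: unknown;
}
interface ProviderLike {
  open(config: unknown, sessionId: string): Promise<{ ws: Ws; ref: Ref }>;
  resolve(ref: Ref): Promise<Ws>;
}

class ToyRegistry {
  // Live workspaces are in-memory only, so they vanish at a runtime boundary.
  private readonly live = new Map<string, Ws>();
  private readonly provider: ProviderLike;
  // Persisted refs stand in for durable storage that survives the boundary.
  private readonly persistedRefs: Map<string, Ref>;

  constructor(provider: ProviderLike, persistedRefs: Map<string, Ref>) {
    this.provider = provider;
    this.persistedRefs = persistedRefs;
  }

  async acquire(sessionId: string, config: unknown): Promise<Ws> {
    const cached = this.live.get(sessionId);
    if (cached) return cached; // step 6: reuse the live workspace

    const ref = this.persistedRefs.get(sessionId);
    let ws: Ws;
    if (ref) {
      ws = await this.provider.resolve(ref); // steps 8-9: rebuild from the ref
    } else {
      const opened = await this.provider.open(config, sessionId); // steps 4-5
      this.persistedRefs.set(sessionId, opened.ref);
      ws = opened.ws;
    }
    this.live.set(sessionId, ws);
    return ws;
  }
}
```

Creating a second `ToyRegistry` over the same `persistedRefs` map simulates a fresh runtime: the live cache is empty, so the next `acquire` goes through `resolve` rather than `open`.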

What goes in the ref payload

Everything resolve() needs to reconstruct the live workspace WITHOUT having the original config or session available. Typical contents:

The workspace's identity (id, namespace, etc.).
Names of bindings to look up at resolve-time (e.g., R2 bucket binding name; do NOT serialize the bucket object itself).
Provider-specific options that affect how the workspace was constructed (workspaceDir, sleepAfter, etc.).

What NOT to put in the ref:

Live objects (sandbox stubs, file handles, sockets). They don't survive serialization.
Secrets. Refs may be persisted to durable storage you don't fully control.
Anything you can re-derive from the runtime context.
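A cheap guard against live objects slipping into the payload is a structural check in your own tests. The sketch below assumes refs are persisted as JSON, which this page does not state; `MyRefPayload` and `isJsonSafe` are hypothetical names.

```typescript
// Hypothetical payload for a tmpdir + bucket-backed provider.
interface MyRefPayload {
  readonly workspaceId: string;
  readonly bucketBindingName: string; // the binding's NAME, never the bucket object
  readonly workspaceDir: string;
}

// True when the value consists only of JSON primitives, arrays, and plain objects.
function isJsonSafe(value: unknown): boolean {
  if (value === null) return true;
  switch (typeof value) {
    case 'string':
    case 'boolean':
      return true;
    case 'number':
      return Number.isFinite(value);
    case 'object': {
      if (Array.isArray(value)) return value.every(isJsonSafe);
      // Class instances (Map, Date, Uint8Array, SDK handles) are rejected.
      const proto = Object.getPrototypeOf(value);
      if (proto !== Object.prototype && proto !== null) return false;
      const obj = value as Record<string, unknown>;
      return Object.keys(obj).every((key) => isJsonSafe(obj[key]));
    }
    default:
      return false; // undefined, functions, symbols, bigint
  }
}

const examplePayload: MyRefPayload = {
  workspaceId: 'ws-123',
  bucketBindingName: 'WORKSPACE_BUCKET',
  workspaceDir: '/tmp/ws-123',
};
const payloadIsSafe = isJsonSafe(examplePayload);
```

The check does not detect secrets; it only catches values that cannot survive serialization.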

Per-session state contract — providers MUST be stateless across sessions

⚠️ Read this section before writing your first provider. Failure to honor it is the single biggest source of subtle bugs in custom providers.

A WorkspaceProvider is constructed ONCE at executor / DO boot and reused across MANY sessions over its lifetime. The same provider instance services every session that lands on that process — there is no per-session provider instance.

This means: anything you store on this in your provider class will leak across sessions. A session-A write becomes a session-B read, with no isolation.

What goes where

| State kind | Where to put it |
| --- | --- |
| Per-session tmpdir paths, sandbox container IDs, R2 namespace prefixes | Inside the `Workspace` returned by `open()` (closure-captured) |
| Per-session file caches, in-memory state maps | Inside the `Workspace` |
| Per-session cleanup state needed by `close()` | Inside the `Workspace` |
| Shared infrastructure handles (DO bindings, R2 bindings) | On the provider instance (these are process-wide) |
| Shared config (logger, providerId, region) | On the provider instance |

Bad pattern (cross-session leak)

typescript

class BadProvider implements WorkspaceProvider {
  readonly providerId = 'bad';
  // BAD: instance state retained across sessions
  private files = new Map<string, Uint8Array>();

  async open(_config, _session): Promise<OpenedWorkspace> {
    return {
      ws: {
        id: 'ws-bad',
        fs: {
          readFile: async (p) => this.files.get(p)!, // sees other sessions' data!
          writeFile: async (p, d) => {
            this.files.set(p, d);
          }, // visible to other sessions!
          // ...
        },
        close: async () => {},
      },
      ref: { providerId: 'bad', ref: {}, capabilities: { fs: true }, schemaVersion: 2 },
    };
  }
  async resolve() {
    /* ... */
  }
}

Good pattern (per-session closure)

typescript

class GoodProvider implements WorkspaceProvider {
  readonly providerId = 'good';
  // OK: process-wide shared handles only
  constructor(private readonly logger: Logger) {}

  async open(_config, session): Promise<OpenedWorkspace> {
    // Per-session state captured in the Workspace closure, NOT on `this`.
    const files = new Map<string, Uint8Array>();
    return {
      ws: {
        id: `ws-${session.sessionId}`,
        fs: {
          readFile: async (p) => files.get(p)!,
          writeFile: async (p, d) => {
            files.set(p, d);
          },
          // ...
        },
        close: async () => {
          files.clear();
        },
      },
      ref: {
        providerId: 'good',
        ref: { sessionId: session.sessionId },
        capabilities: { fs: true },
        schemaVersion: 2,
      },
    };
  }
  async resolve() {
    /* ... */
  }
}

The provider conformance suite includes a regression test for this contract — see packages/core/src/workspace/__tests__/provider.test.ts (the C10 round-4 test case).
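A check of this kind can be small. The following is a sketch of what such a regression test might assert, not the actual test in the conformance suite; `TinyFs`, `TinyProvider`, and `leaksAcrossSessions` are hypothetical, reduced stand-ins for the real contract types.

```typescript
// Minimal stand-ins for the contract types; assumptions for this sketch only.
interface TinyFs {
  readFile(path: string): Promise<string | undefined>;
  writeFile(path: string, data: string): Promise<void>;
}
interface TinyProvider {
  open(sessionId: string): Promise<{ fs: TinyFs }>;
}

// Per-session state lives in the closure created by each open() call.
const isolatedProvider: TinyProvider = {
  async open(_sessionId) {
    const files = new Map<string, string>();
    return {
      fs: {
        readFile: async (p) => files.get(p),
        writeFile: async (p, d) => {
          files.set(p, d);
        },
      },
    };
  },
};

// State shared by every open() call: the same mistake as storing it on `this`.
const sharedFiles = new Map<string, string>();
const leakyProvider: TinyProvider = {
  async open(_sessionId) {
    return {
      fs: {
        readFile: async (p) => sharedFiles.get(p),
        writeFile: async (p, d) => {
          sharedFiles.set(p, d);
        },
      },
    };
  },
};

// Resolves to true when a write in session A is readable from session B.
async function leaksAcrossSessions(provider: TinyProvider): Promise<boolean> {
  const a = await provider.open('session-a');
  const b = await provider.open('session-b');
  await a.fs.writeFile('/secret.txt', 'from A');
  return (await b.fs.readFile('/secret.txt')) !== undefined;
}
```

Running the same probe against your own provider, with two `open()` calls on one instance, is a quick way to catch state that ended up on the instance by accident.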

Module construction strategies

open() accepts an OPTIONAL third arg: the agent's declared WorkspaceCapabilityFlags. You can use it (or ignore it) — both behaviors are valid.

Strategy A: always construct everything (back-compat default)

Simplest pattern; the third arg is ignored. The provider constructs every module it can support, regardless of what the agent declared. The framework's tool-injection layer wires only the declared capabilities, so unused modules are inert (allocated but never called).

typescript

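async open(config, session) {
  // The optional third argument (declaredCapabilities) is deliberately not read.
  const ws = new MyWorkspace({
    fs: new MyFs(/* ... */),
    shell: new MyShell(/* ... */),
  });
  const ref: WorkspaceRef = {
    providerId: this.providerId,
    ref: { /* payload */ },
    capabilities: { fs: true, shell: true }, // everything that was built
    schemaVersion: 2,
  };
  return { ws, ref };
}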

Use this strategy when modules are cheap to construct and you want simple, predictable code.

Strategy B: skip unused modules (D3 round-4)

When a module's constructor does meaningful work (allocates pools, opens sockets, primes caches), use the declaredCapabilities arg to skip construction for unused modules. The built-in cloudflare-sandbox provider uses this strategy as of D3.

typescript

async open(config, session, declaredCapabilities) {
  // declaredCapabilities is undefined for back-compat callers — fall back to "build everything".
  const wantFs = declaredCapabilities ? Boolean(declaredCapabilities.fs) : true;
  const wantShell = declaredCapabilities ? Boolean(declaredCapabilities.shell) : true;
  const ws = new MyWorkspace({
    fs: wantFs ? new MyFs(/* ... */) : undefined,
    shell: wantShell ? new MyShell(/* ... */) : undefined,
  });
  const ref: WorkspaceRef = {
    providerId: this.providerId,
    ref: { /* payload */ },
    // CRITICAL: ref.capabilities must match what you actually built.
    capabilities: { fs: wantFs, shell: wantShell },
    schemaVersion: 2,
  };
  return { ws, ref };
}

Either strategy passes the registry's invariant assertion (declared ⊆ populated). The registry's check is the single source of truth — it runs regardless of which strategy you picked.

`WorkspaceCapabilityFlags` advertisement on the ref

You MUST set capabilities on the returned WorkspaceRef to match the modules your open() actually populated:

typescript

const ref: WorkspaceRef = {
  providerId: this.providerId,
  ref: {
    /* your payload */
  },
  capabilities: { fs: true, shell: true }, // what your live ws actually supports
  schemaVersion: 2, // stamp the current N (see N±1 contract below)
};

This is AUTHORITATIVE. The registry asserts at both open() and resolve() time that:

Every capability declared in the agent's WorkspaceConfig.capabilities is also truthy on WorkspaceRef.capabilities (the ref must be a superset of the declaration), AND
Each declared module is non-undefined on the returned Workspace.

If a user declares a capability your provider doesn't support, the registry throws WorkspaceFailedError at session start (NOT at LLM tool-call time). Tool injection still reads WorkspaceConfig.capabilities; the ref's capabilities are the provider-side guarantee that the wired tools will find their module on the live Workspace.
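The two-part assertion can be written as a pure function over the declaration, the ref's flags, and the live workspace. This is a sketch of the rule as described above, not the registry's code; `CapabilityFlagsLike`, `WorkspaceLike`, and `findUnbackedCapabilities` are hypothetical names.

```typescript
// Hypothetical shapes, reduced to what the check needs. Flag values are
// `unknown` because a flag may be `true` or a config object.
type CapabilityKey = 'fs' | 'shell' | 'code' | 'script' | 'snapshot';
type CapabilityFlagsLike = Partial<Record<CapabilityKey, unknown>>;
type WorkspaceLike = Partial<Record<CapabilityKey, object>>;

// Returns the declared capabilities that the provider failed to back, either
// because the ref does not advertise them or because the module is missing.
function findUnbackedCapabilities(
  declared: CapabilityFlagsLike,
  refCapabilities: CapabilityFlagsLike,
  ws: WorkspaceLike
): CapabilityKey[] {
  const keys: CapabilityKey[] = ['fs', 'shell', 'code', 'script', 'snapshot'];
  return keys.filter(
    (key) => Boolean(declared[key]) && (!refCapabilities[key] || ws[key] === undefined)
  );
}

// Declared { fs } against a provider that built fs and shell: nothing is missing.
const missing = findUnbackedCapabilities(
  { fs: true },
  { fs: true, shell: true },
  { fs: {}, shell: {} }
);
```

An empty result corresponds to the assertion passing; a non-empty one is the situation in which the registry would throw at session start.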

Ref schema versioning (D4 round-4)

WorkspaceRef carries an optional schemaVersion: number field. Persisted refs may live across deployments; the version field is the contract that lets a deploy of N safely consume refs from N-1 (and vice versa for rollbacks).

The N±1 contract:

Each provider declares a CURRENT version N (as of D4 round-4, all built-in providers are at N = 2).
Every ref produced by open() MUST stamp schemaVersion: N.
resolve() MUST accept refs with schemaVersion:
- undefined (legacy / pre-D4 refs)
- N - 1 (one back; back-compat for in-flight rollouts and rollbacks)
- N (current)

Anything else throws WorkspaceFailedError with a message naming the unsupported version + the supported set.

The framework provides a helper:

typescript

import { assertRefSchemaVersionSupported } from '@helix-agents/core';

async resolve(ref) {
  if (ref.providerId !== this.providerId) { /* ... */ }
  assertRefSchemaVersionSupported(ref.schemaVersion, this.providerId, this.logger);
  // ... your normal payload validation
}

When a future schema change requires bumping to N+1:

Update the constants in core/workspace/utils/ref-schema-version.ts (CURRENT becomes N+1; PREVIOUS becomes N).
Stamp schemaVersion: N+1 on new refs in every built-in provider.
The N±1 window means one DEPLOY worth of forward/back compat. Two-step migrations (N → N+2) require a stop on N+1 first to ensure rollback safety.

The framework calls logger.info with 'workspace ref: migrating ref from vX to vY' when an explicit lower-than-current version comes through (operator forensics for rollouts).
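The acceptance window itself is a three-way check. The sketch below restates the N±1 rule as a pure predicate; it is not the implementation of `assertRefSchemaVersionSupported`, and `CURRENT_REF_SCHEMA_VERSION` and `isSupportedRefSchemaVersion` are hypothetical names.

```typescript
// Assumed constant for this sketch; the text puts built-in providers at N = 2.
const CURRENT_REF_SCHEMA_VERSION = 2;

// Accepts undefined (legacy), N - 1, and N; everything else is rejected.
function isSupportedRefSchemaVersion(
  version: number | undefined,
  current: number = CURRENT_REF_SCHEMA_VERSION
): boolean {
  if (version === undefined) return true;
  return version === current || version === current - 1;
}

const acceptsLegacy = isSupportedRefSchemaVersion(undefined); // true
const acceptsPrevious = isSupportedRefSchemaVersion(1); // true
const rejectsFuture = !isSupportedRefSchemaVersion(3); // true: 3 is outside the window
```

Because the window is exactly one version wide, a deploy at N = 3 under this rule would stop accepting refs stamped 1, which is why a jump of two versions needs an intermediate deploy.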

Capability auto-injection extension point — known limitation (D7 round-4)

The auto-injection logic in core/workspace/tool-injection.ts is hard-coded for the five built-in capabilities (fs, shell, code, script, snapshot). A custom provider that wants to expose a NEW capability — say git (clone/pull/push tools) or network (proxied HTTP fetch) — has no extension point today. Adding a new capability requires:

Adding the capability key to WorkspaceCapabilityFlags in core/workspace/types/config.ts.
Adding a make<Capability>Tools(caps) factory in core/workspace/tool-injection.ts.
Wiring the new factory into injectWorkspaceTools()'s if (caps.<key>) chain.
Releasing core.

The script capability is the worked precedent: it added a script?: ScriptCapConfig flag to WorkspaceCapabilityFlags, a makeScriptTools(caps) factory, and the if (caps.script) tools.push(...makeScriptTools(caps)) line in injectWorkspaceTools() — exactly the four steps above. Use its diff as the template if you propose another built-in capability.

This is intentional for v1 — the capability surface is curated to keep tool naming + LLM behavior consistent across providers. Future versions may introduce a WorkspaceCapabilityInjector extension point that lets providers register their own tool factories. Until then, file an issue if you have a use case for a new capability and we'll evaluate adding it to the built-in set.

Wrapping host commands — the `CommandWrapper` seam

If you're building a POSIX-style provider that shells out, you don't have to reimplement the hardened subprocess shell. @helix-agents/workspace-posix-core exports SubprocessShell (the same one local-bash uses — allowlist, metachar/glob rejection, env-denylist, cwd-escape, byte caps, audit logging) and a small extension seam: the wrapper option, typed as CommandWrapper.

typescript

interface CommandWrapper {
  wrap(program: string, args: string[], ctx: { cwd: string }): { program: string; args: string[] };
}

SubprocessShell calls wrapper.wrap(program, args, { cwd }) immediately before spawn, after all the app-layer guards have already run and accepted the command. The default identityWrapper is a no-op. A provider that wants OS-level isolation injects a wrapper that rewrites (program, args) to run the original command under a sandbox launcher — and the shared guards still execute first, so isolation is genuine defense-in-depth, not a replacement for them.

local-sandbox is the worked precedent. It injects either a bwrap wrapper ({ program: 'bwrap', args: [...namespaceArgs, '--', program, ...args] }) or a seatbelt wrapper ({ program: 'sandbox-exec', args: ['-f', profilePath, program, ...args] }), selected by live backend detection. If your platform has its own command-launcher isolation (firejail, nsjail, a custom launcher), implement a CommandWrapper and pass it as wrapper to SubprocessShell rather than forking the shell — you inherit every app-layer defense for free. See packages/workspace-local-sandbox/src/isolation/ for the bwrap/seatbelt wrappers.
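As an illustration of the seam, here is a minimal firejail-style wrapper. It is a sketch under stated assumptions: `firejailWrapper` is a hypothetical name, the interface is redeclared locally so the block is self-contained, and the flag set (`--quiet`, `--net=none`) is illustrative rather than a hardened profile.

```typescript
// Same shape as the CommandWrapper interface shown above, redeclared so the
// sketch is self-contained.
interface CommandWrapper {
  wrap(program: string, args: string[], ctx: { cwd: string }): { program: string; args: string[] };
}

// Hypothetical wrapper: runs the already-vetted command under firejail with
// networking disabled. `--` ends firejail's own option parsing, so the
// original program and its arguments are passed through untouched.
const firejailWrapper: CommandWrapper = {
  wrap(program, args, _ctx) {
    return {
      program: 'firejail',
      args: ['--quiet', '--net=none', '--', program, ...args],
    };
  },
};

const wrapped = firejailWrapper.wrap('ls', ['-la'], { cwd: '/tmp/ws' });
// wrapped.program is 'firejail'
// wrapped.args is ['--quiet', '--net=none', '--', 'ls', '-la']
```

The wrapper never inspects or rejects the command; by the time `wrap` runs, that decision has already been made by the shared guards.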

Reusing just the shared guards — `ShellGuard`

If your shell doesn't spawn a local subprocess at all (it execs into a container, a remote host, or some other execution surface), you can't reuse SubprocessShell — but you should still reuse the audited app-layer guards rather than re-implementing the allowlist / metachar / env-denylist checks (and their exact throw messages + rate-limited audit logging). @helix-agents/workspace-posix-core exports ShellGuard for exactly this: construct one with a logPrefix + errorLabel, call assertCommandAllowed(cmd, constraints) and assertEnvAllowed(env) on the HOST before you hand the command to your execution surface, and you get the same single-sourced security logic local-bash, local-sandbox, and docker all share. workspace-docker's DockerShell is the worked example — it runs ShellGuard host-side and then engine.exec(['sh','-c',cmd]) into the container.

Recreating remote / ephemeral isolation on `resolve()`

The docker provider is the worked example of a provider whose resolve() does NOT just re-open a local tmpdir — it recreates remote/ephemeral isolation state over a persisted ref. local-bash / local-sandbox resolve() only re-attach to an existing host tmpdir and rebuild an in-process shell; docker resolve() additionally re-probes the daemon LIVE and creates + starts a fresh container around the persisted tmpdir (the prior container is gone after a process boundary, so the persisted ref deliberately carries no containerId — only what's needed to rebuild: image, network policy, resource limits). The contract to copy:

The ref payload carries everything needed to RE-create the isolation unit, never a handle to a live one.
resolve() re-validates the persisted tmpdir (missing → WorkspaceEvictedError; invalid → WorkspaceFailedError, both with bare messages that don't echo the untrusted ref), then fails closed (re-probe the backend) before recreating.
The recreate path is the same hardened build as open(), factored into one helper so the two can't drift.

Confining a third-party SDK behind a test seam — the `DockerEngine` pattern

docker also demonstrates the interface-behind-which-the-SDK-hides pattern. dockerode is confined behind a thin DockerEngine interface (ping / ensureImage / createContainer / start / exec / stop / remove / inspect); nothing in the provider / workspace / shell imports dockerode directly. The real DockerodeEngine wraps the SDK; a FakeDockerEngine drives every unit test with NO daemon. If your provider wraps a heavyweight or daemon-dependent SDK (Docker, a cloud sandbox API, a remote-host SSH client), define the narrow interface your code actually calls, inject it via a provider option (default: the real impl), and ship a recording fake — your unit tests then run anywhere, and the gated integration suite exercises the real SDK against a live daemon. This is the same structural-test-double discipline as FakeSandbox below, applied at the SDK boundary.
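The shape of such a seam, reduced to two calls, might look like the following. All names here (`SandboxEngine`, `RecordingFakeEngine`, `SandboxShell`) are hypothetical and much narrower than the real DockerEngine interface; the point is only the dependency direction.

```typescript
// Hypothetical narrow interface: only the calls this shell actually makes.
interface SandboxEngine {
  ping(): Promise<void>;
  exec(containerId: string, argv: string[]): Promise<{ exitCode: number; stdout: string }>;
}

// Recording fake: no daemon, canned results, and a call log tests can assert on.
class RecordingFakeEngine implements SandboxEngine {
  readonly calls: string[] = [];

  async ping(): Promise<void> {
    this.calls.push('ping');
  }

  async exec(containerId: string, argv: string[]): Promise<{ exitCode: number; stdout: string }> {
    this.calls.push(`exec ${containerId} ${argv.join(' ')}`);
    return { exitCode: 0, stdout: '' };
  }
}

// The shell depends on the interface only; a real SDK-backed engine would be
// injected in production and the fake in unit tests.
class SandboxShell {
  private readonly engine: SandboxEngine;
  private readonly containerId: string;

  constructor(engine: SandboxEngine, containerId: string) {
    this.engine = engine;
    this.containerId = containerId;
  }

  async run(cmd: string): Promise<number> {
    await this.engine.ping(); // surface an unreachable backend before exec
    const result = await this.engine.exec(this.containerId, ['sh', '-c', cmd]);
    return result.exitCode;
  }
}
```

A unit test constructs `new SandboxShell(new RecordingFakeEngine(), 'c1')`, runs a command, and asserts on the recorded calls without any SDK import.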

Error model

Three error types you need to know:

`WorkspaceFailedError` — from `open()` / `resolve()`

Throw this when the workspace cannot be created or reconstructed. The registry transitions the entry to 'failed' state — subsequent tool calls fail fast with the same error.

typescript

import { WorkspaceFailedError } from '@helix-agents/core';

async open(config, session) {
  const result = await this.connectToBackend();
  if (!result.ok) {
    throw new WorkspaceFailedError(`Backend unavailable: ${result.error}`, {
      providerId: this.providerId,
      cause: result.cause,
    });
  }
  // ... happy path
}

Transient vs permanent errors (round-4 cluster C)

WorkspaceFailedError accepts a transient: true option. When set, the registry retries the open/resolve call with exponential backoff before transitioning the entry to 'failed':

typescript

// Known-transient cause: R2 timeout, container scheduling failure,
// network blip. Set transient: true so the registry retries.
throw new WorkspaceFailedError(`R2 read timed out after 30s`, {
  providerId: this.providerId,
  transient: true,
  cause: err,
});

// Permanent cause: capability mismatch, auth failure, config error.
// DO NOT set transient — retries cannot fix it.
throw new WorkspaceFailedError(`Workspace config has no R2 binding`, {
  providerId: this.providerId,
});

Auto-classification is unsafe — only the provider knows when an error is recoverable. Default is transient: false (no retry). Opt in per-throw for known-transient causes. The registry retries up to transientRetryAttempts times (default 3) with backoff capped at ~10s total.
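To get a feel for what "exponential backoff capped at a total" means for your timeouts, the schedule can be computed up front. This is a sketch, not the registry's algorithm: only the default of 3 attempts and the roughly 10s total come from the text above, while the 1s base delay and the doubling factor are assumptions, and `backoffDelaysMs` is a hypothetical name.

```typescript
// Computes the wait before each retry. Delays double from `baseMs` and the
// last one is trimmed so the sum never exceeds `totalCapMs`.
function backoffDelaysMs(attempts = 3, baseMs = 1000, totalCapMs = 10_000): number[] {
  const delays: number[] = [];
  let remaining = totalCapMs;
  for (let i = 0; i < attempts && remaining > 0; i++) {
    const delay = Math.min(baseMs * 2 ** i, remaining);
    delays.push(delay);
    remaining -= delay;
  }
  return delays;
}

const defaultSchedule = backoffDelaysMs(); // [1000, 2000, 4000]
const cappedSchedule = backoffDelaysMs(5); // [1000, 2000, 4000, 3000]
```

With these assumed numbers, a provider that marks an error transient should expect the user-visible failure to be delayed by several seconds, so reserve `transient: true` for causes that plausibly clear up in that window.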

`WorkspaceEvictedError` — from MODULE methods

Throw this from module method implementations (not from open / resolve!) when the underlying resource has been evicted and the framework should re-resolve via resolve(ref).

typescript

import { WorkspaceEvictedError } from '@helix-agents/core';

async readFile(path) {
  try {
    return await this.backend.readFile(path);
  } catch (err) {
    if (isEvictedError(err)) {
      throw new WorkspaceEvictedError(`Backend evicted`, { providerId: this.providerId });
    }
    throw err;
  }
}

The framework's withEvictionRetry (in tool-injection.ts) catches this, marks the registry entry as 'evicted', and the next tool call invokes provider.resolve(ref) to reattach. Useful for sandboxes that auto-evict after idle, tmpdirs that get cleaned, etc.

Don't throw WorkspaceEvictedError from open() or resolve() — the registry can't handle it cleanly there. Use WorkspaceFailedError instead.
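The readFile example above calls `isEvictedError` without defining it, because what counts as eviction is specific to each backend. A minimal sketch, assuming a hypothetical backend whose errors carry either a `code` string or an HTTP-style `status`; the `'SANDBOX_NOT_FOUND'` code is invented for illustration.

```typescript
// Hypothetical backend error shape; real backends signal eviction differently.
interface BackendErrorLike {
  readonly code?: string;
  readonly status?: number;
}

// Treats "the whole sandbox is gone" signals as eviction. A missing FILE is
// deliberately not eviction: the workspace is fine, only the operation failed.
function isEvictedError(err: unknown): boolean {
  if (typeof err !== 'object' || err === null) return false;
  const { code, status } = err as BackendErrorLike;
  return code === 'SANDBOX_NOT_FOUND' || status === 410;
}

const goneIsEviction = isEvictedError({ status: 410 }); // true
const notFoundIsEviction = isEvictedError({ status: 404 }); // false
```

Keeping the classifier narrow matters: anything it matches triggers a re-resolve, while anything it misses is reported to the LLM as an ordinary tool error.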

Regular `Error` — from MODULE methods

Anything else propagates as a tool-error message to the LLM. The LLM sees the error message, can decide whether to retry, switch approaches, or surface to the user. Use plain Error (or a subclass) for "the operation failed but the workspace itself is fine."

Testing patterns

Structural test doubles, not `implements`

Don't declare your test fake with implements ISandbox (or whatever the upstream interface is). That forces you to fill in every method, even ones you don't use. Instead, build a test double that covers only the methods your adapter calls and cast it via as unknown as TSomeInterface:

typescript

// In your test:
const fake = new FakeBackend(); // not `implements TBackend`
const provider = new MyProvider({ backend: fake as unknown as TBackend });

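The cast is local, explicit, and only applies at the boundary. If your adapter starts using a new method, the test fails at the call site: a plain fake raises a TypeError (for example "fake.exec is not a function"), and a fake written to throw on unknown members can report a clearer "method not implemented" error. Either way, the failure prompts you to add the method.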
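One way to get a descriptive failure for unimplemented members is a Proxy-backed fake. The sketch below is one possible approach, not how FakeSandbox is built; `TBackend` and `makeFakeBackend` are hypothetical.

```typescript
// Hypothetical upstream interface with more methods than the adapter uses.
interface TBackend {
  readFile(path: string): Promise<string>;
  writeFile(path: string, data: string): Promise<void>;
  exec(cmd: string): Promise<number>;
}

// Wraps a partial implementation; calling any method it lacks throws a
// descriptive error instead of "x is not a function".
function makeFakeBackend(partial: Partial<TBackend>): TBackend {
  return new Proxy(partial, {
    get(target, prop, receiver) {
      if (typeof prop === 'symbol' || prop in target) {
        return Reflect.get(target, prop, receiver);
      }
      // `then` is probed when a value is awaited; report it as absent so the
      // fake is not mistaken for a promise.
      if (prop === 'then') return undefined;
      return () => {
        throw new Error(`FakeBackend: method not implemented: ${prop}`);
      };
    },
  }) as unknown as TBackend;
}

// Only readFile is implemented; writeFile and exec throw when called.
const fakeBackend = makeFakeBackend({
  readFile: async (p) => `contents of ${p}`,
});
```

The cast stays inside the factory, so test files receive a value already typed as `TBackend` and need no casts of their own.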

Reference: `FakeSandbox` from runtime-cloudflare

The @helix-agents/runtime-cloudflare/testing subpath exports FakeSandbox, an in-memory ISandbox subset used by CloudflareSandboxWorkspaceProvider's tests. It's a good worked example — covers fs (Map-backed), exec/code (canned responses), backups (Map-backed). About 600 lines.

Worked example: `MyProvider`

A minimal provider wrapping a Map-backed filesystem. Demonstrates the full contract.

typescript

// my-provider.ts
import type {
  OpenedWorkspace,
  SessionRef,
  Workspace,
  WorkspaceProvider,
  WorkspaceRef,
  WorkspaceId,
  FileSystem,
  FileEntry,
  FileStat,
  GrepOptions,
  GrepResult,
  GrepMatch,
} from '@helix-agents/core';

// 1. Config type with discriminator.
export interface MyProviderConfig {
  readonly kind: 'my-provider';
  /** Optional: scope for naming inside your backend. */
  readonly namespace?: string;
}

// 2. The fs adapter.
class MyFileSystem implements FileSystem {
  constructor(private readonly files: Map<string, Uint8Array>) {}

  async readFile(path: string): Promise<Uint8Array> {
    const bytes = this.files.get(path);
    if (!bytes) throw new Error(`MyFileSystem: file not found: ${path}`);
    return bytes;
  }

  async writeFile(path: string, data: Uint8Array | string): Promise<void> {
    const bytes = typeof data === 'string' ? new TextEncoder().encode(data) : data;
    this.files.set(path, bytes);
  }

  async stat(path: string): Promise<FileStat> {
    const bytes = this.files.get(path);
    if (!bytes) throw new Error(`MyFileSystem: not found: ${path}`);
    return { path, type: 'file', size: bytes.length };
  }

  async ls(path: string): Promise<FileEntry[]> {
    const prefix = path.endsWith('/') ? path : path + '/';
    return Array.from(this.files.keys())
      .filter((k) => k.startsWith(prefix))
      .map((k) => ({
        name: k.slice(prefix.length).split('/')[0],
        path: k,
        type: 'file' as const,
        size: this.files.get(k)!.length,
      }));
  }

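  async glob(pattern: string): Promise<string[]> {
    // Escape regex metacharacters, then turn each `*` into `.*` and anchor the
    // match so `*.ts` does not also match `a.tsx`. Toy semantics: `*` crosses `/`.
    const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&').replace(/\*/g, '.*');
    const re = new RegExp(`^${escaped}$`);
    return Array.from(this.files.keys()).filter((k) => re.test(k));
  }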

  async grep(pattern: string, opts?: GrepOptions): Promise<GrepResult> {
    const re = new RegExp(pattern, opts?.ignoreCase ? 'i' : '');
    const decoder = new TextDecoder();
    // `FileSystem.grep` returns a single envelope: the per-line hits live in
    // `matches` (GrepMatch[]), and the two skipped-path lists let the LLM
    // distinguish "no matches" from "we deliberately skipped a file".
    const matches: GrepMatch[] = [];
    for (const [path, bytes] of this.files) {
      if (opts?.path && !path.startsWith(opts.path)) continue;
      const lines = decoder.decode(bytes).split('\n');
      for (let i = 0; i < lines.length; i++) {
        if (re.test(lines[i])) {
          matches.push({ path, lineNumber: i + 1, line: lines[i] });
          if (opts?.maxResults && matches.length >= opts.maxResults) {
            return { matches, skippedPaths: [], skippedBinaryPaths: [] };
          }
        }
      }
    }
    return { matches, skippedPaths: [], skippedBinaryPaths: [] };
  }

  async rm(path: string): Promise<void> {
    if (!this.files.delete(path)) throw new Error(`MyFileSystem: not found: ${path}`);
  }

  async mkdir(): Promise<void> {
    // Implicit — directories aren't tracked separately in this toy impl.
  }
}

// 3. The Workspace aggregator.
//
// `Workspace` has one required field (`id`) plus five OPTIONAL module slots —
// populate only the ones your provider supports:
//   readonly fs?: FileSystem;
//   readonly shell?: Shell;
//   readonly code?: CodeInterpreter;
//   readonly script?: CodeInterpreter;   // ephemeral V8-isolate script runner
//   readonly snapshot?: Snapshotter;
// This example populates only `fs`.
class MyWorkspace implements Workspace {
  readonly id: WorkspaceId;
  readonly fs: FileSystem;

  constructor(id: string, fs: FileSystem) {
    this.id = id as WorkspaceId;
    this.fs = fs;
  }

  async close(): Promise<void> {
    // No-op — Map garbage-collects when references drop.
  }
}

// 4. The provider.
export class MyProvider implements WorkspaceProvider<MyProviderConfig> {
  readonly providerId = 'my-provider';

  // External-storage backing — keyed by namespace so resolve() reattaches.
  private static stores = new Map<string, Map<string, Uint8Array>>();

  async open(config: MyProviderConfig, session: SessionRef): Promise<OpenedWorkspace> {
    const namespace = config.namespace ?? session.sessionId;
    let store = MyProvider.stores.get(namespace);
    if (!store) {
      store = new Map();
      MyProvider.stores.set(namespace, store);
    }
    const fs = new MyFileSystem(store);
    const ws = new MyWorkspace(namespace, fs);
    const ref: WorkspaceRef = {
      providerId: this.providerId,
      ref: { namespace },
      capabilities: { fs: true },
      schemaVersion: 2, // Stamp the current N — required by the N±1 contract
    };
    return { ws, ref };
  }

  async resolve(ref: WorkspaceRef): Promise<Workspace> {
    // Failures in resolve() use WorkspaceFailedError (a value import from
    // '@helix-agents/core', not `import type`) so the registry marks the entry 'failed'.
    if (ref.providerId !== this.providerId) {
      throw new WorkspaceFailedError(`MyProvider: refusing to resolve foreign provider ref`, {
        providerId: this.providerId,
      });
    }
    const payload = ref.ref as { namespace?: string } | undefined;
    if (!payload?.namespace) {
      throw new WorkspaceFailedError(`MyProvider: ref payload missing namespace`, {
        providerId: this.providerId,
      });
    }
    let store = MyProvider.stores.get(payload.namespace);
    if (!store) {
      // Could throw here if you want to fail; or auto-create as we do.
      store = new Map();
      MyProvider.stores.set(payload.namespace, store);
    }
    const fs = new MyFileSystem(store);
    return new MyWorkspace(payload.namespace, fs);
  }
}

Wire it like any other provider:

typescript

const executor = new JSAgentExecutor(/* ... */, {
  workspaceProviders: new Map([
    ['my-provider', new MyProvider()],
  ]),
});

Reference: existing providers

Read these for full real-world examples:

@helix-agents/workspace-memory — simplest provider. fs only. ~150 lines.
@helix-agents/workspace-posix-core — NOT a provider; the shared POSIX plumbing (SubprocessShell, TmpdirFileSystem, ref-payload validation, and the CommandWrapper seam) that local-bash and local-sandbox both build on. Read it to see how the shell/fs modules and path-safety guards are implemented once and reused.
@helix-agents/workspace-local-bash — POSIX tmpdir + subprocess shell (composes workspace-posix-core). ~600 lines.
@helix-agents/workspace-local-sandbox: same POSIX fs/shell as local-bash PLUS an OS-level isolation wrapper (seatbelt / bwrap) injected via the CommandWrapper seam; fail-closed when no backend. The worked precedent for wrapping host commands (see the CommandWrapper section above).
@helix-agents/workspace-docker — host-side TmpdirFileSystem (bind-mounted into a container) + a DockerShell that runs the shared ShellGuard then execs into the container via dockerode. The worked precedent for (a) resolve() recreating ephemeral isolation over a persisted ref, (b) confining a third-party SDK behind a DockerEngine test seam, and (c) reusing just the ShellGuard when you don't spawn locally.
runtime-cloudflare/src/workspaces/filestore — Cloudflare DO SQLite filestore. ~400 lines.
runtime-cloudflare/src/workspaces/sandbox — full Linux container with all 4 modules. ~1500 lines including tests.

Source references

Provider contract: packages/core/src/workspace/types/provider.ts
Workspace + module interfaces: packages/core/src/workspace/types/
Registry semantics: packages/core/src/workspace/registry.ts
Tool injection + withEvictionRetry: packages/core/src/workspace/tool-injection.ts
Error types: packages/core/src/workspace/errors.ts
