Local Bash Workspace

The LocalBashWorkspace stores files in a real tmpdir on the host filesystem and runs shell commands as subprocesses. POSIX-only. Suitable for local development and testing where you want real filesystem semantics + the ability to shell out, without spinning up a container.

When to use

  • Local POSIX development. When you want files to persist across reads/writes within a session AND you need actual shell commands (grep, find, git, etc.).
  • Integration tests that exercise real filesystem behavior — file modes, symlink behavior, glob expansion, etc.
  • Production INSIDE a host sandbox. local-bash running inside a host sandbox (Docker / gVisor / Firecracker / Kubernetes pod / Modal / Vercel Sandbox / E2B / AWS Fargate / Fly.io Machines) is safe for untrusted input — the sandbox boundary is the isolation boundary.
  • Production on a dedicated host (one workload per VM).
  • Trusted code only when run on a bare shared host without a sandbox. This provider runs as the host user; on a bare host, do NOT use with untrusted input.

For Cloudflare-Workers deployments, use Cloudflare Sandbox — the Cloudflare Container is the per-session host sandbox. For other untrusted-input production deployments, the standard pattern is one container per session (or per-isolation-unit chosen by the operator) with local-bash running inside.

Capabilities supported

| Capability | Supported |
| --- | --- |
| fs | ✅ |
| shell | ✅ |
| code | ❌ |
| snapshot | ❌ |

The provider advertises { fs: true, shell: true } on its WorkspaceRef.capabilities. Declaring a capability marked ❌ above causes WorkspaceFailedError at session start (the framework asserts that config.capabilities ⊆ ref.capabilities and that each declared module is present on the returned workspace). See the error-model table on the workspaces overview.
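As a rough illustration of the subset assertion described above, the sketch below compares declared capabilities against advertised ones. The type and function names are hypothetical and are not the framework's API; only the `{ fs: true, shell: true }` value comes from this page.

```typescript
// Hypothetical names for illustration; the framework's real types may differ.
type CapabilityName = 'fs' | 'shell' | 'code' | 'snapshot';
type Capabilities = Partial<Record<CapabilityName, boolean>>;

// What local-bash advertises on its ref, per the text above.
const LOCAL_BASH_CAPABILITIES: Capabilities = { fs: true, shell: true };

// Returns every declared capability that the provider does not advertise.
// An empty result means the declared set is a subset of the advertised set.
function unsupportedCapabilities(
  declared: Capabilities,
  advertised: Capabilities,
): CapabilityName[] {
  return (Object.keys(declared) as CapabilityName[]).filter(
    (name) => declared[name] === true && advertised[name] !== true,
  );
}

// fs + shell is accepted; snapshot is not advertised, so it is reported.
const accepted = unsupportedCapabilities({ fs: true, shell: true }, LOCAL_BASH_CAPABILITIES);
const rejected = unsupportedCapabilities({ fs: true, snapshot: true }, LOCAL_BASH_CAPABILITIES);
console.log({ accepted, rejected });
```

In the framework itself, a non-empty result corresponds to the WorkspaceFailedError raised at session start.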

Install

bash
npm install @helix-agents/workspace-local-bash

POSIX-only

Windows is not supported. The provider throws at open() time with a clear error message:

LocalBashWorkspaceProvider: Windows is not supported. Use WSL and run the agent inside it.

Windows users should run their agents inside WSL.
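If you would rather fail at application startup than at open() time, a guard along these lines may help. This is a sketch: `assertPosix` is a hypothetical helper, not part of the package, and only the message text is taken from this page.

```typescript
// Hypothetical startup guard; mirrors the error text the provider throws at open().
function assertPosix(platform: string = process.platform): void {
  if (platform === 'win32') {
    throw new Error(
      'LocalBashWorkspaceProvider: Windows is not supported. Use WSL and run the agent inside it.',
    );
  }
}

// Capture the outcome for an explicit platform value so the check is testable anywhere.
function describePlatform(platform: string): string {
  try {
    assertPosix(platform);
    return 'supported';
  } catch {
    return 'unsupported';
  }
}

console.log(describePlatform('linux'), describePlatform('win32'));
```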

Threat model + sharp edges

Read Workspaces Security for the full workspace threat model. This section summarizes what local-bash specifically does and does not protect against. The deployment-shape model — what the framework defends and what the host sandbox is expected to handle — lives in Deployment shape: trusted-host sandbox assumption.

The LocalBashWorkspace runs as the host user inside whatever process / container / VM hosts the orchestrator. The framework's defenses are PROCESS-LEVEL (intra-process correctness across N sessions in one orchestrator) and LLM-LEVEL (prompt injection, context overflow, command-shape control). HOST-LEVEL concerns (CPU/memory/disk/network quotas, syscall filtering, namespace isolation) are the external host sandbox's job in untrusted-input deployments.

Two valid shapes:

  • Bare host, trusted input (local dev, one workload per VM): no host sandbox; local-bash defenses are the protection. Trusted-input only.
  • Host sandbox, untrusted input (Modal / Vercel Sandbox / E2B / Fargate / Kubernetes / Docker / gVisor / Firecracker): each isolation unit gets its own sandbox; local-bash runs INSIDE. Safe for untrusted input because the sandbox is the isolation boundary.

local-bash is single-tenant per orchestrator process. Mapping orchestrator processes to tenants/customers is the operator's responsibility; with one orchestrator per isolation unit via the host sandbox, multi-workload deployments are safe at the platform level.

For Cloudflare-Workers deployments, @cloudflare/sandbox (kernel-namespace + container isolation) is the recommended provider — the CF Container IS the host sandbox.

What the framework defends against

  • Path traversal via ../. TmpdirFileSystem resolves every requested path against the canonicalized tmpdir root and rejects any path whose path.relative() walks above the root.
  • Symlink-escape on existing leaf. When the leaf is a symlink, realpath is consulted; if the resolved real-path lies outside the tmpdir root, the operation is rejected with a structured warn.
  • Symlink-escape via ancestor. When the leaf doesn't exist (e.g. writeFile('/new-file')), the ancestor chain is walked upward and realpath is checked against the root for each existing ancestor. A symlinked-ancestor escape is detected even when several levels of the path are missing.
  • Leaf-symlink-swap TOCTOU race (round 6). Reads, writes, and stats use O_NOFOLLOW on the leaf so a malicious parallel process cannot swap the leaf into a symlink between realpath and open(2). ELOOP is returned and the operation rejected with a clear message.
  • Shell metacharacter chaining. ;, |, &&, `, $(...), >, < are rejected at the auto-injected run tool's Zod boundary AND at the SubprocessShell.run() boundary (defense-in-depth for direct ws.shell.run() callers).
  • Glob/brace/wildcard expansion. {, }, *, ?, [, ], ~ rejected by default — cat /etc/{passwd,hostname} is blocked even when cat is in the allowlist. Opt in via glob: true.
  • Privilege-escalation env vars. LD_PRELOAD, LD_LIBRARY_PATH, LD_AUDIT, DYLD_INSERT_LIBRARIES, DYLD_LIBRARY_PATH, NODE_OPTIONS, PYTHONPATH, PERL5OPT rejected at both the schema and runtime layers. (Full list in PRIVILEGE_ESCALATING_ENV_VARS in @helix-agents/core.)
  • Command allowlist secure-by-default. Empty/undefined allowlist denies ALL commands. Operators must explicitly opt in by listing permitted first-tokens.
  • Cwd-escape. Per-call cwd overrides are canonicalized via realpath and rejected if they resolve outside the tmpdir.
  • ReDoS structural detector + wall-clock backstop. Grep patterns with nested quantifiers / shared-prefix alternations are rejected at the schema layer; remaining pathological execution time is bounded by a per-call wall-clock budget. Operators with adversarial input should install re2-wasm and wire WorkspaceRegistryDeps.regexEngine = await detectRegexEngine() to switch to RE2's linear-time matcher (eliminates the entire ReDoS class).
  • Tool-result size caps + boundary tags. Returned stdout/stderr/file-contents are wrapped in <workspace_tool_result untrusted="true"> boundary tags AND capped at configurable byte limits (defaults in core/workspace/constants.ts).
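The path-traversal check in the first bullet can be sketched with Node's `path` module. This is a minimal illustration, not the TmpdirFileSystem source: `resolveInsideRoot` is a hypothetical name, the root is assumed to be canonicalized already, and a leading `/` is assumed to mean "relative to the workspace root".

```typescript
import * as path from 'node:path';

// Resolve a requested path against the workspace root and reject anything whose
// relative path walks above the root. `root` is assumed to be canonical already.
function resolveInsideRoot(root: string, requested: string): string {
  // Prefixing with "./" makes "/new-file" resolve under the root, not the host root.
  const target = path.resolve(root, '.' + path.sep + requested);
  const rel = path.relative(root, target);
  const escapes = rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`path escapes workspace root: ${requested}`);
  }
  return target;
}

// Helper for the demo: report whether a request is rejected.
function isRejected(root: string, requested: string): boolean {
  try {
    resolveInsideRoot(root, requested);
    return false;
  } catch {
    return true;
  }
}

console.log(resolveInsideRoot('/tmp/ws', '/new-file'));
console.log(isRejected('/tmp/ws', '../etc/passwd'));
```

A lexical check like this covers only `../` traversal. The symlink bullets above need `realpath` and `O_NOFOLLOW` on top of it.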

Known gaps — host sandbox or operator's responsibility

The defenses above are real but not complete. The framework cannot, in TypeScript, replicate the isolation a microVM or kernel-namespace sandbox provides — and intentionally doesn't try. The honest sharp edges, with HOST-LEVEL gaps explicitly delegated to the external sandbox:

  • TOCTOU ancestor-chain race. The leaf race is closed via O_NOFOLLOW (round 6). The ANCESTOR race remains: between realpath(parent) and the leaf operation, an attacker can swap an ancestor directory into a symlink. Closing this would require Linux 5.6+ openat2(2) with RESOLVE_BENEATH, which Node does not expose. Defense-in-depth under host sandbox: a successful escape only reaches the sandbox view, not the host. Operator mitigation when running on a bare host: do NOT allowlist ln, mv, cp -P, cpio, tar -P, mkdir, mkfifo when accepting untrusted input.
  • No network namespace isolation. HOST-LEVEL — the external host sandbox provides this (e.g. network policies, egress controls). On a bare host: don't allowlist network-capable commands (curl, wget, node, python); or use OS-level firewall rules.
  • No CPU/memory/file-descriptor limits. HOST-LEVEL — the external host sandbox provides this (cgroup CPU/memory, container resource limits, fd/process caps). On a bare host: run under systemd slice with CPUQuota= / MemoryMax=, or prlimit.
  • No seccomp / syscall filtering. HOST-LEVEL — provided by gVisor, container seccomp profiles, etc. Out of scope for a TS framework.
  • Untrusted-input safety REQUIRES a host sandbox. LocalBashWorkspace is single-tenant per orchestrator process. The session's tmpdir is scoped, but any subprocess the agent spawns runs as the host user and can read the WHOLE host filesystem (subject to file modes). With one orchestrator per isolation unit via the host sandbox, untrusted-input deployments are safe. On a bare shared host, untrusted input is unsupported.

If your platform doesn't provide a host sandbox, the recommended pattern is one container per session (or per-isolation-unit chosen by the operator) via Modal / Vercel Sandbox / E2B / AWS Fargate / Fly.io Machines / Kubernetes / Docker. For Cloudflare-Workers deployments, @cloudflare/sandbox is the recommended provider.

Sandbox boundaries (legacy summary)

The workspace's filesystem is scoped to a per-session tmpdir created via fs.mkdtemp() under os.tmpdir() (configurable via tmpdirRoot). The internal TmpdirFileSystem enforces the boundary using:

  • realpathSync canonicalization on the tmpdir root at construction time.
  • Ancestor-walk symlink resolution on every fs operation — every component of the requested path is resolved through realpath before access, ensuring no symlink can escape the tmpdir.
  • O_NOFOLLOW on the leaf for readFile / writeFile / stat (round 6) — closes the realpath-then-open TOCTOU race for leaf-symlink swaps.

This protects against deliberate symlink-escape attacks within the fs methods. It does NOT protect against:

  • Untrusted shell commands. Once the LLM calls workspace_run('rm -rf /') (and rm is allowlisted), the boundary is gone — the shell runs as the host user.
  • Ancestor-chain TOCTOU races. The ancestor walk happens immediately before the operation but is not atomic; a swap of an ancestor directory between realpath and the open syscall is not detected. See the threat-model section above for operator mitigations.
  • Code that calls fs APIs outside the workspace (e.g., a custom tool that bypasses the registry).

If you need isolation against untrusted input, run local-bash inside a host sandbox as described above, or use Cloudflare Sandbox for Cloudflare-Workers deployments.
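The ancestor walk and the `O_NOFOLLOW` leaf open from this section can be sketched with Node's `fs` module. This is a simplified illustration under stated assumptions, not the provider's implementation: the function names are hypothetical, and the sketch refuses every leaf symlink rather than first checking where it resolves.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Walk upward from `target` to the nearest existing ancestor and return its real path.
function realpathOfNearestExisting(target: string): string {
  let current = target;
  for (;;) {
    try {
      return fs.realpathSync(current);
    } catch (err) {
      if ((err as NodeJS.ErrnoException).code !== 'ENOENT') throw err;
      const parent = path.dirname(current);
      if (parent === current) throw err; // reached the filesystem root
      current = parent;
    }
  }
}

function isInside(root: string, candidate: string): boolean {
  const rel = path.relative(root, candidate);
  return rel !== '..' && !rel.startsWith('..' + path.sep) && !path.isAbsolute(rel);
}

// Read a file, checking the parent chain first and refusing a symlink at the leaf.
function readFileNoFollow(root: string, target: string): Buffer {
  if (!isInside(root, realpathOfNearestExisting(path.dirname(target)))) {
    throw new Error('ancestor resolves outside the workspace root');
  }
  // O_NOFOLLOW makes open(2) fail (ELOOP on Linux and macOS) if the leaf is a symlink.
  const fd = fs.openSync(target, fs.constants.O_RDONLY | fs.constants.O_NOFOLLOW);
  try {
    return fs.readFileSync(fd);
  } finally {
    fs.closeSync(fd);
  }
}

// Demo in a throwaway directory: a regular file reads, a symlinked leaf is refused.
const root = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'nofollow-demo-')));
fs.writeFileSync(path.join(root, 'real.txt'), 'hello');
fs.symlinkSync('/etc/hostname', path.join(root, 'link.txt'));

const content = readFileNoFollow(root, path.join(root, 'real.txt')).toString('utf8');
let linkError = '';
try {
  readFileNoFollow(root, path.join(root, 'link.txt'));
} catch (err) {
  linkError = (err as NodeJS.ErrnoException).code ?? 'unknown';
}
fs.rmSync(root, { recursive: true, force: true });
console.log({ content, linkError });
```

The gap between the ancestor check and `openSync` is exactly the ancestor-chain race this page describes; the sketch does not close it.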

Provider config

typescript
interface LocalBashWorkspaceConfig {
  kind: 'local-bash';
}

interface LocalBashProviderOptions {
  /** Override the tmpdir root. Defaults to os.tmpdir(). */
  tmpdirRoot?: string;
  /** Constraints applied to subprocess shell calls. */
  shellConstraints?: SubprocessShellConstraints;
  /** Optional logger for security warnings (see Observability). Defaults to noopLogger. */
  logger?: Logger;
}

shellConstraints (SubprocessShellConstraints) is per-provider — it applies to every workspace this provider opens, and to direct calls on ws.shell.run() that bypass the auto-injected tool layer:

typescript
interface SubprocessShellConstraints {
  /** First-token allowlist; commands containing shell metacharacters are also rejected. */
  allowedCommands?: readonly string[];
  /** Default per-call duration limit (ms) when ShellRunOptions.timeoutMs is absent. */
  maxDurationMs?: number;
  /** Env-var forwarding policy (see security note below). */
  passEnv?: readonly string[] | true;
}

Per-call options like cwd, env, signal, timeoutMs, and the streaming callbacks live on ShellRunOptions (see Shell module) and are layered on top of these provider-level constraints.
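To make the rejection rules concrete, here is a rough sketch of how a command string might be screened before it is spawned. It is an approximation built from the character lists on this page, not the SubprocessShell source: `rejectReason` and `CommandPolicy` are hypothetical names, and the page does not say where the `glob` opt-in lives, so it is modelled as a plain flag.

```typescript
// Chaining and redirection tokens listed on this page: ; | && ` $( > <
const CHAINING = /;|\||&&|`|\$\(|>|</;
// Glob, brace, and tilde characters listed on this page: { } * ? [ ] ~
const GLOBBING = /[{}*?\[\]~]/;

// Hypothetical policy shape for this sketch.
interface CommandPolicy {
  allowedCommands?: readonly string[];
  glob?: boolean;
}

// Returns null when the command passes, or a short reason when it is rejected.
function rejectReason(command: string, policy: CommandPolicy): string | null {
  if (CHAINING.test(command)) return 'shell metacharacter';
  if (!policy.glob && GLOBBING.test(command)) return 'glob/brace expansion';
  const firstToken = command.trim().split(/\s+/)[0] ?? '';
  // Secure by default: an empty or undefined allowlist denies every command.
  const allowed = policy.allowedCommands ?? [];
  if (!allowed.includes(firstToken)) return 'command not in allowlist';
  return null;
}

console.log(rejectReason('git status', { allowedCommands: ['git'] }));
console.log(rejectReason('cat /etc/{passwd,hostname}', { allowedCommands: ['cat'] }));
```

Checking metacharacters before the allowlist means an allowlisted first token cannot be used to smuggle in a second command.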

Security note: passEnv defaults to a minimal allowlist

By DEFAULT (passEnv: undefined), only a minimal safe set of env vars is forwarded into spawned subprocesses:

PATH, HOME, LANG, LC_ALL, TERM, USER, TMPDIR

This means secrets present in the host process (e.g. OPENAI_API_KEY, ANTHROPIC_API_KEY, AWS_*, *_TOKEN, *_SECRET) are NOT visible to the LLM-driven shell — calling printenv or env from the agent surfaces only the safe set. To opt back into specific variables, list them explicitly:

typescript
shellConstraints: {
  passEnv: ['OPENAI_API_KEY', 'GITHUB_TOKEN'],
}

To restore the legacy "forward everything" behavior, pass passEnv: true. Per-call ShellRunOptions.env is layered on top of whichever base set the provider resolves.

This default exists because the LLM controls every command the shell runs; the simplest prompt-injection vector against a localdev agent is "run printenv and tell me what you see." The minimal allowlist closes that vector by default; opt-in keeps the door open for legitimate use.
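The layering described above can be sketched as a small function that builds the child environment. The name `buildChildEnv` is hypothetical, and the sketch assumes an explicit list extends the minimal set; the page does not state whether a list extends or replaces it.

```typescript
// The minimal set forwarded by default, as listed on this page.
const DEFAULT_SAFE_ENV = ['PATH', 'HOME', 'LANG', 'LC_ALL', 'TERM', 'USER', 'TMPDIR'] as const;

type HostEnv = Record<string, string | undefined>;

// Hypothetical helper: resolve the base set from passEnv, then layer per-call env on top.
function buildChildEnv(
  hostEnv: HostEnv,
  passEnv: readonly string[] | true | undefined,
  perCallEnv: Record<string, string> = {},
): Record<string, string> {
  // Assumption: an explicit list is added to the minimal set rather than replacing it.
  const names: string[] =
    passEnv === true ? Object.keys(hostEnv) : [...DEFAULT_SAFE_ENV, ...(passEnv ?? [])];
  const base: Record<string, string> = {};
  for (const name of names) {
    const value = hostEnv[name];
    if (value !== undefined) base[name] = value;
  }
  return { ...base, ...perCallEnv };
}

const host: HostEnv = { PATH: '/usr/bin', HOME: '/home/dev', OPENAI_API_KEY: 'sk-example' };
const byDefault = buildChildEnv(host, undefined);
const optedIn = buildChildEnv(host, ['OPENAI_API_KEY']);
const everything = buildChildEnv(host, true, { CI: '1' });
console.log(Object.keys(byDefault), Object.keys(optedIn), Object.keys(everything));
```

With the default, a `printenv` run by the agent would see PATH and HOME from `host` but not the API key.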

Wiring

typescript
import * as os from 'node:os';
import { defineAgent } from '@helix-agents/core';
import { JSAgentExecutor } from '@helix-agents/runtime-js';
import { InMemoryStateStore, InMemoryStreamManager } from '@helix-agents/store-memory';
import { LocalBashWorkspaceProvider } from '@helix-agents/workspace-local-bash';

const agent = defineAgent({
  name: 'my-agent',
  llmConfig: { model: yourModel },
  workspace: {
    provider: { kind: 'local-bash' },
    capabilities: { fs: true, shell: true },
  },
});

const executor = new JSAgentExecutor(
  new InMemoryStateStore(),
  new InMemoryStreamManager(),
  yourLLMAdapter,
  {
    workspaceProviders: new Map([
      ['local-bash', new LocalBashWorkspaceProvider({ tmpdirRoot: os.tmpdir() })],
    ]),
  }
);

Lifecycle

  • open() — calls fs.mkdtemp() to create a per-session tmpdir with the prefix helix-ws-{sanitizedSessionId}-{random}. Returns the LocalBashWorkspace and a serializable ref { tmpdir, workspaceId }.
  • resolve() — re-attaches to the same tmpdir if it still exists. If the tmpdir has been cleaned (process exit, another session's close, tmpfs clear), throws WorkspaceEvictedError so the framework's eviction-retry helper (withEvictionRetry in the tool-injection layer; see the error-model table on the workspaces overview) can mark the registry entry as evicted and re-resolve the workspace via provider.resolve(ref) on the next tool call.
  • close() — removes the tmpdir recursively. Files are gone after close.
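The three lifecycle steps above map onto a handful of `fs` calls. The sketch below mirrors them under stated assumptions: the function names, the sanitizer, the derivation of `workspaceId`, and the `EvictedError` stand-in are all hypothetical, and only the directory naming pattern and the ref shape come from this page.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Stand-in for the framework's WorkspaceEvictedError.
class EvictedError extends Error {}

interface WorkspaceRefSketch {
  tmpdir: string;
  workspaceId: string;
}

// Hypothetical sanitizer: keep only characters that are safe in a directory name.
const sanitize = (sessionId: string): string => sessionId.replace(/[^A-Za-z0-9_-]/g, '_');

// open(): create helix-ws-{sanitizedSessionId}-{random} under the tmpdir root.
function openWorkspace(sessionId: string, tmpdirRoot: string = os.tmpdir()): WorkspaceRefSketch {
  const tmpdir = fs.mkdtempSync(path.join(tmpdirRoot, `helix-ws-${sanitize(sessionId)}-`));
  return { tmpdir, workspaceId: path.basename(tmpdir) };
}

// resolve(): re-attach if the directory still exists, otherwise signal eviction.
function resolveWorkspace(ref: WorkspaceRefSketch): WorkspaceRefSketch {
  if (!fs.existsSync(ref.tmpdir)) throw new EvictedError(`tmpdir is gone: ${ref.tmpdir}`);
  return ref;
}

// close(): remove the directory and everything in it.
function closeWorkspace(ref: WorkspaceRefSketch): void {
  fs.rmSync(ref.tmpdir, { recursive: true, force: true });
}

const ref = openWorkspace('user/42');
const reattached = resolveWorkspace(ref).tmpdir === ref.tmpdir;
closeWorkspace(ref);
let evictedAfterClose = false;
try {
  resolveWorkspace(ref);
} catch {
  evictedAfterClose = true;
}
console.log({ name: path.basename(ref.tmpdir), reattached, evictedAfterClose });
```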

Observability

The provider accepts an optional Logger from @helix-agents/core so security warnings (allowlist denial, shell metacharacter rejection, passEnv opt-in events, close failures) surface in your logging pipeline:

typescript
import { consoleLogger } from '@helix-agents/core';

new LocalBashWorkspaceProvider({
  tmpdirRoot: os.tmpdir(),
  logger: consoleLogger, // pino, winston, or any { info, warn, error } shape
});

Defaults to silent (noopLogger). The provider emits warn-level entries for prompt-injection-shaped attempts (e.g. metacharacter rejections), info-level entries for normal lifecycle transitions, and error-level entries for unexpected failures during close.
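For tests that assert on security warnings, an in-memory logger with the `{ info, warn, error }` shape can be handy. This is a sketch: the method signature (a message plus optional metadata) is an assumption, so check it against the Logger type exported by @helix-agents/core before relying on it.

```typescript
type LogMeta = Record<string, unknown>;

// Assumed shape: each method takes a message and optional structured metadata.
interface LoggerLike {
  info(message: string, meta?: LogMeta): void;
  warn(message: string, meta?: LogMeta): void;
  error(message: string, meta?: LogMeta): void;
}

interface LogEntry {
  level: 'info' | 'warn' | 'error';
  message: string;
  meta: LogMeta;
}

// Hypothetical helper: returns a logger plus the array it records into.
function createMemoryLogger(): { logger: LoggerLike; entries: LogEntry[] } {
  const entries: LogEntry[] = [];
  const record =
    (level: LogEntry['level']) =>
    (message: string, meta?: LogMeta): void => {
      entries.push({ level, message, meta: meta ?? {} });
    };
  return {
    logger: { info: record('info'), warn: record('warn'), error: record('error') },
    entries,
  };
}

// Simulated events; in real use the provider would call the logger.
const { logger, entries } = createMemoryLogger();
logger.info('workspace opened');
logger.warn('shell metacharacter rejected', { command: 'ls; id' });
const warnings = entries.filter((entry) => entry.level === 'warn');
console.log(warnings.length);
```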

Auto-injected tools

All fs tools, plus workspace_run for shell. See FileSystem and Shell module pages for schemas.

Using the workspace from a custom tool

typescript
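import { defineTool } from '@helix-agents/core';
import { z } from 'zod';

const buildAndCount = defineTool({
  name: 'build_and_count',
  parameters: z.object({}),
  execute: async (_input, ctx) => {
    const ws = await ctx.workspaces!.get();
    if (!ws.shell) throw new Error('workspace requires shell capability');
    // Shell metacharacters (&&, |) are rejected by ws.shell.run(), so run each
    // command separately and count the files in TypeScript. Both `npm` and
    // `find` must be listed in allowedCommands.
    const build = await ws.shell.run('npm run build');
    if (build.exitCode !== 0) return { exitCode: build.exitCode, fileCount: 0 };
    const found = await ws.shell.run('find dist -type f');
    const files = new TextDecoder().decode(found.stdout).split('\n').filter(Boolean);
    return { exitCode: found.exitCode, fileCount: files.length };
  },
});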

See the shared pattern on the overview page — the await and the ! non-null assertion both matter.

Inspecting a workspace

The tmpdir lives on the host filesystem, so two paths are open:

  • From inside the agent. Add a custom debug tool calling ws.fs!.ls('/') or ws.shell!.run('find . -type f'). This works on every provider uniformly.
  • From the host. Tmpdirs are named helix-ws-{sanitizedSessionId}-{random} under os.tmpdir() (overridable via tmpdirRoot). ls -la /tmp/helix-ws-* (or your platform's tmpdir) shows the live workspaces. The naming convention is documented but treat it as a debug-only convenience — it is NOT a stable integration surface.

Mid-run inspection (active sessions)

Mid-run inspection is safe IF read-only:

  • Recommended for active sessions. The custom debug-tool path above (in-agent) — the read happens inside the same step the agent owns, so no race.
  • From the host (read-only). ls -la /tmp/helix-ws-*, cat, grep, find against the live tmpdir are safe. The agent and host see the same POSIX fs; the agent's writes between operations remain consistent.
  • Writes from the host would race the agent. Never rm, mv, or > into a tmpdir of a live session — surface the change via the agent's tools instead.
  • For after-completion. Either approach is safe; the agent has stopped writing. Note that close() removes the tmpdir, so inspection only works between session end and close.

Capacity & performance

These are approximate ranges; benchmark for your workload.

| Dimension | Approximate range | Notes |
| --- | --- | --- |
| Per-host tmpdir bound | Disk-limited (typically GB) | Each session's tmpdir lives under os.tmpdir() (overridable). |
| FS op latency | ~ms (single-digit) | Real POSIX fs; tmpfs faster than spinning disk. |
| Subprocess startup | ~50ms | Each shell run() forks a new process; cold start dominates short commands. |
| Concurrent workspaces per host | ~100s | Bounded by tmpdir / process-table headroom; depends on host. |
| Cross-process sharing | None | Tmpdir-scoped; sibling processes do NOT see each other's workspaces. |

A periodic cleanup of orphaned tmpdirs (process-crash leakage) is recommended — see runbook incident #5.
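One possible shape for such a cleanup job is sketched below. It is not the runbook's procedure: `cleanupOrphanedTmpdirs` is a hypothetical helper, the age threshold is yours to choose, and it relies on the `helix-ws-` naming convention, which this page describes as a debug convenience and not a stable integration surface.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Remove helix-ws-* directories under `root` whose mtime is older than `maxAgeMs`.
// Returns the names of the directories that were removed.
function cleanupOrphanedTmpdirs(
  maxAgeMs: number,
  root: string = os.tmpdir(),
  now: number = Date.now(),
): string[] {
  const removed: string[] = [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    // Dirent.isDirectory() is false for symlinks, so links are never followed.
    if (!entry.isDirectory() || !entry.name.startsWith('helix-ws-')) continue;
    const full = path.join(root, entry.name);
    if (now - fs.statSync(full).mtimeMs < maxAgeMs) continue;
    fs.rmSync(full, { recursive: true, force: true });
    removed.push(entry.name);
  }
  return removed;
}
```

A directory's mtime changes only when entries directly inside it are added or removed, so a long-running session that edits nested files can look stale. Pick a generous threshold, for example `cleanupOrphanedTmpdirs(24 * 60 * 60 * 1000)`, and run the job only when no live session can be older than that.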

Secure-by-default passEnv allowlist (round-4 cluster A)

The passEnv default flipped from "forward everything" to a minimal allowlist (PATH, HOME, LANG, LC_ALL, TERM, USER, TMPDIR). This is a behavior change for upgraders — if your agent depended on host secrets being visible to the LLM-driven shell, see Pitfall 4 in the upgrading guide.

Production deployment notes

⚠️ This provider runs commands as the host user with full host privileges. When run on a bare host without a sandbox, never use with untrusted input or in any context where the LLM could be prompted to attack the host.

For untrusted-input production, run local-bash INSIDE a host sandbox (Docker / gVisor / Firecracker via Modal / Vercel Sandbox / E2B / AWS Fargate, or a hardened Kubernetes pod) — the sandbox boundary is the isolation boundary. See Workspaces Security: deployment shapes.

For Cloudflare Workers deployments, use CloudflareSandboxWorkspace (Firecracker microVM via Cloudflare Containers) — the CF Container is the host sandbox.

See also

  • Local Sandbox — the kernel-isolated sibling. Same POSIX fs + shell semantics (shared @helix-agents/workspace-posix-core plumbing) but wraps every command in an OS-level sandbox (macOS seatbelt / Linux bwrap). Reach for it when you want a kernel boundary on top of these app-layer guards without spinning up a container.
  • Docker — the container-isolated sibling. Same POSIX fs semantics (the host side of a bind mount, shared TmpdirFileSystem), but shell commands run inside a Docker container with cgroup resource limits, a reproducible image, and network off by default. Reach for it when you want a stronger, more uniform isolation boundary than the host kernel provides and you have a Docker daemon available.

Source

Released under the MIT License.