Cloudflare Workflows Runtime

The Cloudflare Workflows runtime executes agents as Cloudflare Workflows for durable execution, stores agent state in D1, and manages streams through a separate Durable Object.

When to Use Workflows

Choose the Workflows runtime when you need step-level durability, automatic retries, or want to share D1 state with other services. If your agents require heavy streaming (>100 chunks), consider the Durable Objects runtime instead.

Architecture

mermaid
graph TB
    subgraph Edge ["Edge Location (Global)"]
        subgraph Worker ["Cloudflare Worker"]
            W1["HTTP endpoints<br/>CloudflareAgentExecutor<br/>Starts Workflows"]
        end

        Worker --> Workflow

        subgraph Workflow ["Cloudflare Workflow"]
            WF1["Agent execution steps<br/>LLM calls<br/>Tool execution"]
        end

        Workflow --> D1
        Workflow --> DO

        D1["<b>D1 Database</b><br/>Agent state<br/>Messages"]
        DO["<b>Durable Object</b><br/>Stream events<br/>Real-time streaming"]
    end

Prerequisites

  • Cloudflare account with Workers Paid plan
  • Wrangler CLI: npm install -g wrangler
  • D1 database for state storage
  • Durable Objects for streaming

Installation

bash
npm install @helix-agents/runtime-cloudflare @helix-agents/store-cloudflare

Setup Guide

1. Configure wrangler.toml

toml
name = "agent-worker"
main = "src/index.ts"
compatibility_date = "2024-01-01"

# D1 Database for state
[[d1_databases]]
binding = "DB"
database_name = "agent-state"
database_id = "your-database-id"

# Durable Object for streaming
[durable_objects]
bindings = [
  { name = "STREAM_MANAGER", class_name = "StreamManagerDO" }
]

[[migrations]]
tag = "v1"
new_classes = ["StreamManagerDO"]

# Workflow binding
[[workflows]]
name = "AGENT_WORKFLOW"
class_name = "AgentWorkflow"

2. Use Programmatic Migrations

The D1StateStore uses programmatic migrations that are automatically applied. Call runMigration() on startup:

typescript
import { runMigration } from '@helix-agents/store-cloudflare';

// Run migrations on startup (safe to call every request - no-op if already migrated)
await runMigration(env.DB);

The framework creates tables automatically with session-centric naming (all prefixed with __agents_). The current schema is V9:

sql
-- Core state table (V1 baseline plus columns added by later migrations, through V9)
CREATE TABLE __agents_states (
  session_id TEXT PRIMARY KEY,
  agent_type TEXT NOT NULL,
  stream_id TEXT NOT NULL,
  status TEXT NOT NULL DEFAULT 'active',
  step_count INTEGER DEFAULT 0,
  custom_state TEXT,
  output TEXT,
  error TEXT,
  failure_reason TEXT,                  -- V7 cascade discriminator
  parent_session_id TEXT,
  aborted INTEGER DEFAULT 0,
  abort_reason TEXT,
  user_id TEXT, tags TEXT, metadata TEXT,
  pending_client_tool_calls TEXT,       -- V7 HITL
  completed_client_tool_calls TEXT,     -- V8 idempotency
  suspension_context TEXT,              -- V9 stateless suspension
  expires_at INTEGER,                   -- packed inside suspension_context JSON
  created_at INTEGER NOT NULL,
  updated_at INTEGER NOT NULL
);

-- Messages table (separated for O(1) append)
CREATE TABLE __agents_messages (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  session_id TEXT NOT NULL,
  sequence INTEGER NOT NULL,
  message TEXT NOT NULL,
  created_at INTEGER NOT NULL,
  UNIQUE(session_id, sequence)
);

-- Sub-session refs (V4 adds mode + name for persistent sub-agents)
CREATE TABLE __agents_sub_session_refs (
  session_id TEXT NOT NULL,
  sub_session_id TEXT NOT NULL,
  agent_type TEXT NOT NULL,
  parent_tool_call_id TEXT NOT NULL,
  status TEXT NOT NULL DEFAULT 'active',
  mode TEXT,                            -- V4: 'persistent' | 'ephemeral'
  name TEXT,                            -- V4: persistent child name
  remote_json TEXT,                     -- V2: remote agent metadata
  started_at INTEGER NOT NULL,
  completed_at INTEGER,
  PRIMARY KEY (session_id, sub_session_id)
);

-- Run tracking (__agents_runs is part of the V1 baseline)
CREATE TABLE __agents_runs (
  run_id TEXT PRIMARY KEY,
  session_id TEXT NOT NULL,
  turn INTEGER NOT NULL,
  status TEXT NOT NULL,
  step_count INTEGER NOT NULL DEFAULT 0,
  started_at INTEGER NOT NULL,
  completed_at INTEGER
);

suspension_context packs suspendedAwaitingChildren, suspendedStepId, tracingContext, and expiresAt. A partial index on json_extract(suspension_context, '$.expiresAt') enables operator cleanup of abandoned sessions. See docs/storage/cloudflare.md for the complete schema and migration progression V1→V9.
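
As a rough illustration of what an operator cleanup job might check, the sketch below parses a suspension_context value and reports whether its expiresAt has passed. The field names come from the paragraph above. The field types, the epoch-millisecond unit, and the helper name isSuspensionExpired are assumptions made for this example and are not part of the package's API.

```typescript
// Hypothetical shape: field names are from the docs, the types are assumed.
interface SuspensionContext {
  suspendedAwaitingChildren?: unknown;
  suspendedStepId?: string;
  tracingContext?: unknown;
  expiresAt?: number; // assumed to be epoch milliseconds
}

// Returns true only when the packed context carries an expiresAt that is
// at or before nowMs. Missing or malformed contexts count as "not expired"
// so a cleanup job never removes a session it cannot reason about.
function isSuspensionExpired(raw: string | null, nowMs: number): boolean {
  if (raw === null) return false;
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return false;
  }
  if (typeof parsed !== 'object' || parsed === null) return false;
  const { expiresAt } = parsed as SuspensionContext;
  return typeof expiresAt === 'number' && expiresAt <= nowMs;
}
```

Treating unreadable rows as live is a deliberately conservative choice for this sketch; a real cleanup job may prefer to log them instead.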

3. Create the Durable Object

typescript
// src/stream-manager-do.ts
import { DurableObject } from 'cloudflare:workers';

export class StreamManagerDO extends DurableObject {
  private chunks: Map<string, StreamChunk[]> = new Map();

  async write(streamId: string, chunk: StreamChunk): Promise<void> {
    const chunks = this.chunks.get(streamId) ?? [];
    chunks.push(chunk);
    this.chunks.set(streamId, chunks);

    // Notify connected clients via WebSocket
    this.ctx.getWebSockets().forEach((ws) => {
      ws.send(JSON.stringify(chunk));
    });
  }

  async read(streamId: string, fromOffset: number): Promise<StreamChunk[]> {
    const chunks = this.chunks.get(streamId) ?? [];
    return chunks.slice(fromOffset);
  }

  async fetch(request: Request): Promise<Response> {
    // WebSocket upgrade for real-time streaming
    if (request.headers.get('Upgrade') === 'websocket') {
      const [client, server] = Object.values(new WebSocketPair());
      this.ctx.acceptWebSocket(server);
      return new Response(null, { status: 101, webSocket: client });
    }
    return new Response('Expected WebSocket', { status: 400 });
  }
}

4. Create the Workflow

The recommended pattern is to use createWorkflowRunner (or call runAgentWorkflow directly), which encapsulates the v7 stateless suspension contract:

typescript
// src/workflows/agent-workflow.ts
import { WorkflowEntrypoint, WorkflowStep, WorkflowEvent } from 'cloudflare:workers';
import {
  runAgentWorkflow,
  type AgentWorkflowInput,
  type AgentWorkflowResult,
} from '@helix-agents/runtime-cloudflare';
import { D1StateStore, DOStreamManager } from '@helix-agents/store-cloudflare';
import { VercelAIAdapter } from '@helix-agents/llm-vercel';
import { registry } from '../registry.js';

export class AgentWorkflow extends WorkflowEntrypoint<Env, AgentWorkflowInput> {
  async run(
    event: WorkflowEvent<AgentWorkflowInput>,
    step: WorkflowStep
  ): Promise<AgentWorkflowResult> {
    return runAgentWorkflow(event, step, {
      stateStore: new D1StateStore({ database: this.env.DB }),
      streamManager: new DOStreamManager(this.env.STREAM_MANAGER),
      llmAdapter: new VercelAIAdapter({
        /* ... */
      }),
      registry,
      workflowBinding: this.env.AGENT_WORKFLOW,
    });
  }
}

runAgentWorkflow returns early at every HITL boundary with one of three 'suspended_*' statuses (see Stateless Suspension Model below). The executor restarts the workflow with mode: 'resume' when the user submits results or otherwise unblocks the run.

Recording usage (optional)

The CFW Workflows runtime records usage only when you pass a usageStore into the workflow dependencies — the workflow body runs in a separate execution context from the executor, so the executor's usageStore powers only the read side (getUsageRollup). To record token, tool, custom (ctx.recordUsage), and sub-agent/companion usage, add a shared/aggregating store (a D1UsageStore over the same D1 binding):

typescript
import { D1UsageStore } from '@helix-agents/store-cloudflare';

return runAgentWorkflow(event, step, {
  stateStore: new D1StateStore({ database: this.env.DB }),
  streamManager: new DOStreamManager(this.env.STREAM_MANAGER),
  llmAdapter: new VercelAIAdapter({
    /* ... */
  }),
  registry,
  workflowBinding: this.env.AGENT_WORKFLOW,
  // Records usage to the SHARED D1 store so `getUsageRollup({ includeSubAgents: true })`
  // can surface sub-agent / persistent-companion cost on the parent.
  usageStore: new D1UsageStore({ database: this.env.DB }),
});

Without usageStore, no usage is recorded (rollups are empty). Configure the same store kind on your CloudflareAgentExecutor for the read side. See Usage Tracking.

5. Create the Worker Entry

typescript
// src/index.ts
import { CloudflareAgentExecutor } from '@helix-agents/runtime-cloudflare';
import { D1StateStore, DOStreamManager } from '@helix-agents/store-cloudflare';
import { AgentWorkflow } from './workflows/agent-workflow';
import { StreamManagerDO } from './stream-manager-do';
import { registry } from './registry';

export { AgentWorkflow, StreamManagerDO };

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);

    // Create executor
    const executor = new CloudflareAgentExecutor({
      workflowBinding: env.AGENT_WORKFLOW,
      stateStore: new D1StateStore({ database: env.DB }),
      streamManager: new DOStreamManager(env.STREAM_MANAGER),
    });

    // POST /agent/execute - Start new execution
    if (url.pathname === '/agent/execute' && request.method === 'POST') {
      const { agentType, message, sessionId } = await request.json<{
        agentType: string;
        message: string | UserInputMessage[];
        sessionId: string;
      }>();
      const agent = registry.get(agentType);

      const handle = await executor.execute(agent, { message }, { sessionId });

      return Response.json({
        sessionId: handle.sessionId,
        streamUrl: `/agent/stream/${handle.sessionId}`,
      });
    }

    // GET /agent/stream/:sessionId - SSE stream
    if (url.pathname.startsWith('/agent/stream/')) {
      const sessionId = url.pathname.split('/').pop();
      const handle = await executor.getHandle(registry.get('default'), sessionId);

      if (!handle) {
        return new Response('Not found', { status: 404 });
      }

      const stream = await handle.stream();
      return new Response(
        new ReadableStream({
          async start(controller) {
            for await (const chunk of stream) {
              // Response bodies must be bytes, so encode each SSE frame
              controller.enqueue(new TextEncoder().encode(`data: ${JSON.stringify(chunk)}\n\n`));
            }
            controller.close();
          },
        }),
        {
          headers: {
            'Content-Type': 'text/event-stream',
            'Cache-Control': 'no-cache',
          },
        }
      );
    }

    // GET /agent/result/:sessionId - Get result
    if (url.pathname.startsWith('/agent/result/')) {
      const sessionId = url.pathname.split('/').pop();
      const handle = await executor.getHandle(registry.get('default'), sessionId);

      if (!handle) {
        return new Response('Not found', { status: 404 });
      }

      const result = await handle.result();
      return Response.json(result);
    }

    return new Response('Not found', { status: 404 });
  },
};

Agent Registry

Register agents so the workflow can instantiate them:

typescript
// src/registry.ts
import { AgentRegistry } from '@helix-agents/runtime-cloudflare';
import { ResearchAgent, AnalyzerAgent } from './agents';

export const registry = new AgentRegistry();
registry.register(ResearchAgent);
registry.register(AnalyzerAgent);

Executor API

Creating the Executor

typescript
import { CloudflareAgentExecutor } from '@helix-agents/runtime-cloudflare';

const executor = new CloudflareAgentExecutor({
  workflowBinding: env.AGENT_WORKFLOW, // From wrangler.toml
  stateStore: d1StateStore,
  streamManager: doStreamManager,
});

Executing Agents

typescript
// With a string message
const handle = await executor.execute(
  MyAgent,
  { message: 'Research quantum computing' },
  {
    sessionId: 'custom-session-id', // Optional
  }
);

// Or with multiple messages (for context injection, file attachments, etc.)
const handle = await executor.execute(
  MyAgent,
  {
    message: [
      { role: 'user', content: 'Focus on recent breakthroughs', metadata: { source: 'system' } },
      { role: 'user', content: 'Research quantum computing' },
    ],
  },
  { sessionId: 'custom-session-id' }
);

Getting Handles

typescript
const handle = await executor.getHandle(MyAgent, sessionId);

if (handle) {
  const result = await handle.result();
  console.log(result);
}

Multi-Turn Conversations

Multi-turn via execute(sessionId) is not yet supported on Cloudflare Workflows

A SECOND executor.execute(agent, msg, { sessionId }) call against an already-completed session is blocked on the Workflows runtime by Cloudflare's write-once workflow-instance-id model: turn 1 owns agent__<name>__<sessionId> and the instance id cannot be recreated after the run closes. Multi-turn support on this runtime needs a per-turn instance id (mirroring the __resume__N / __retry__N convention) plus getCurrentRun-based handle reconstruction — tracked in GitLab #109.

In the meantime, on Cloudflare Workflows you can:

  • use Cloudflare Durable Objects instead (runtime-cloudflare DO path) — multi-turn execute(sessionId) works there;
  • continue a paused/HITL run with executor.resume() / submitToolResult() — these work unchanged;
  • pre-load prior conversation history into a fresh sessionId via the messages input (the "Direct Messages" pattern below).

The HTTP path most consumers actually use (POST /chat → ai-sdk handle-chat-stream) is unaffected when the underlying runtime is JS / Temporal / DBOS / CF-DO; this caveat applies specifically to CFW Workflows.

On the JS, DBOS, Temporal, and Cloudflare Durable Object runtimes the session-centric multi-turn API works as documented elsewhere:

Using sessionId (JS / DBOS / Temporal / CF-DO — NOT supported on CFW Workflows yet)

typescript
// First message - creates a new session
const handle1 = await executor.execute(agent, 'Hello, my name is Alice', {
  sessionId: 'session-123',
});
await handle1.result();

// Continue the conversation - same sessionId
const handle2 = await executor.execute(agent, 'What is my name?', {
  sessionId: 'session-123',
});

Using handle.send() (same support matrix as above)

typescript
const handle1 = await executor.execute(agent, 'Hello', {
  sessionId: 'session-123',
});
await handle1.result();

const handle2 = await handle1.send('Tell me more');

Using Direct Messages

typescript
const handle = await executor.execute(agent, {
  message: 'Continue from here',
  messages: myExternalMessageHistory,
});

Multi-Message Input

The message field accepts either a string or a UserInputMessage[] array. Multi-message input lets you inject context alongside the user's question or attach files:

typescript
const handle = await executor.execute(
  agent,
  {
    message: [
      {
        role: 'user',
        content: 'Background: user is on the enterprise plan',
        metadata: { source: 'system' },
      },
      { role: 'user', content: 'What features do I have access to?' },
    ],
  },
  { sessionId: 'session-123' }
);

Multi-message input also works with handle.send():

typescript
const handle2 = await handle1.send([
  { role: 'user', content: 'Additional context' },
  { role: 'user', content: 'Follow-up question' },
]);

String and multi-message inputs can be mixed freely across turns in the same session.

Behavior Table

| Input | Messages Source | State Source |
| --- | --- | --- |
| message only (new session) | Empty (fresh) | Empty (fresh) |
| message + sessionId (existing) | From session | From session |
| message + messages | From messages | Empty (fresh) |
| message + state | Empty (fresh) | From state |
| message + sessionId + messages | From messages (override) | From session |
| message + sessionId + state | From session | From state (override) |
| All four | From messages (override) | From state (override) |

Note: In the behavior table, message can be either a string or UserInputMessage[]. Both forms work identically with all combinations.
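
The precedence in the behavior table can be summarized as "explicit input overrides the session, and the session overrides fresh". The sketch below encodes that rule as a small pure function. The names Source, ExecuteInputShape, and resolveSources are hypothetical and exist only for this illustration; the executor does not expose such a helper.

```typescript
// Where the messages or state for a run come from.
type Source = 'fresh' | 'session' | 'input';

// Hypothetical summary of what the caller passed to execute().
interface ExecuteInputShape {
  hasExistingSession: boolean; // sessionId refers to an already-stored session
  hasMessages: boolean; // caller passed `messages`
  hasState: boolean; // caller passed `state`
}

// Explicit input wins over the stored session; the session wins over "fresh".
function resolveSources(input: ExecuteInputShape): { messages: Source; state: Source } {
  const pick = (explicit: boolean): Source =>
    explicit ? 'input' : input.hasExistingSession ? 'session' : 'fresh';
  return { messages: pick(input.hasMessages), state: pick(input.hasState) };
}

// Row "message + sessionId + messages": messages overridden, state from session.
const row = resolveSources({ hasExistingSession: true, hasMessages: true, hasState: false });
console.log(row); // { messages: 'input', state: 'session' }
```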

See JS Runtime - Multi-Turn Conversations for detailed documentation.

Streaming

Server-Sent Events

typescript
// Worker endpoint
if (url.pathname.startsWith('/stream/')) {
  const stream = await handle.stream();

  return new Response(
    new ReadableStream({
      async start(controller) {
        for await (const chunk of stream) {
          controller.enqueue(new TextEncoder().encode(`data: ${JSON.stringify(chunk)}\n\n`));
        }
        controller.close();
      },
    }),
    {
      headers: {
        'Content-Type': 'text/event-stream',
        'Cache-Control': 'no-cache',
        Connection: 'keep-alive',
      },
    }
  );
}

WebSocket (via Durable Objects)

typescript
// Client connects to DO WebSocket
const ws = new WebSocket(`wss://your-worker.workers.dev/ws/${streamId}`);

ws.onmessage = (event) => {
  const chunk = JSON.parse(event.data);
  handleChunk(chunk);
};

Sub-Agent Handling

Sub-agents execute as nested workflow calls:

typescript
// In workflow step
const subAgentResult = await step.do('sub-agent-call', async () => {
  // Start child workflow
  const instance = await this.env.AGENT_WORKFLOW.create({
    id: `agent__${subAgentType}__${subSessionId}`,
    params: {
      agentType: subAgentType,
      sessionId: subSessionId,
      streamId: parentStreamId, // Same stream
      message: inputMessage,
      parentSessionId: parentSessionId,
    },
  });

  // Wait for completion
  return pollUntilComplete(instance);
});

Remote Sub-Agent Handling

Remote sub-agents are executed via a dedicated executeRemoteSubAgentCall step, separate from both regular tool calls and local sub-agent workflow spawning.

The step provides:

  • Deterministic session IDs: {parentSessionId}-remote-{toolCallId} for idempotent restarts
  • Crash recovery — On step retry, checks transport.getStatus() to avoid re-executing completed agents
  • Stream proxying — Remote agent chunks flow through the parent's stream
  • SubSessionRef tracking — Registers with remote: { streamId, lastSequence } metadata
  • Interrupt propagation — Abort-check interval propagates interrupts to the remote call
  • Timeout enforcement — Configurable per-tool timeout

See Remote Agents Guide for setup and configuration.
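
To make the idempotency idea concrete, here is a minimal sketch of the deterministic ID format and of the decision a retried step might take after checking the remote status. The ID format is the one listed above. The RemoteStatus values, the action names, and both function names are hypothetical stand-ins; the real transport may report different statuses.

```typescript
// Builds the deterministic remote session ID: the same parent session and
// tool call always map to the same remote session.
function remoteSessionId(parentSessionId: string, toolCallId: string): string {
  return `${parentSessionId}-remote-${toolCallId}`;
}

// Hypothetical status values used only for this illustration.
type RemoteStatus = 'not_found' | 'running' | 'completed';

// Decides what a retried step should do for a given remote status.
function onStepRetry(status: RemoteStatus): 'start' | 'attach' | 'reuse_result' {
  switch (status) {
    case 'completed':
      return 'reuse_result'; // never re-execute a finished remote agent
    case 'running':
      return 'attach'; // keep proxying the existing run
    case 'not_found':
      return 'start'; // first attempt, or the earlier attempt never started
  }
}

console.log(remoteSessionId('session-123', 'call_9')); // session-123-remote-call_9
```

Because the ID depends only on the parent session and the tool call, a retry after a crash addresses the same remote session instead of creating a second one.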

Persistent Sub-Agent Handling

Persistent sub-agents in the Cloudflare Workflows runtime are managed through companion tools and execute as nested workflow instances:

  • Blocking spawn: Creates a child workflow instance and waits for completion within a workflow step.
  • Non-blocking spawn: Creates a child workflow instance and continues without waiting.
  • State tracking: The D1StateStore V4 migration adds mode and name columns to __agents_sub_session_refs for persistent child tracking.

Companion tool results flow through the workflow step context. Persistent children use the same SubSessionRef tracking as ephemeral sub-agents but with mode: 'persistent'.

Re-spawning a completed persistent child continues it on its preserved session (memory retained) rather than recreating it — see Re-consulting a persistent companion (the critic loop).

Stateless Suspension Model (v7)

The Workflows runtime in v7 implements a fully stateless suspension model: the workflow body returns early at every Human-In-The-Loop (HITL) boundary instead of polling or holding the workflow open. This eliminates v6's billable wall-time during HITL waits (typical reduction of ~80% on multi-minute approvals or client-tool calls).

Suspension Boundaries

runAgentWorkflow returns one of three 'suspended_*' statuses on AgentWorkflowResult when execution hits a HITL boundary:

| Status | Meaning |
| --- | --- |
| 'suspended_client_tool' | One or more client-executed tools are pending; awaiting submitToolResult |
| 'suspended_awaiting_children' | Sub-agents have suspended; parent waits for child resume cascades |
| 'suspended_step_partial' | Mixed server+client tool batch; phase-1 settled, phase-2 (finishWith) blocked |

These are exported as SUSPENDED_WORKFLOW_RESULT_STATUSES (a tuple) and SuspendedWorkflowResultStatus (the union type) from @helix-agents/runtime-cloudflare for use in route handlers and discriminator logic.
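
A route handler typically needs to tell "suspended, wait for a resume" apart from "terminal, finalize". The sketch below shows one way to write that discriminator. To stay self-contained it declares a local tuple that mirrors the three documented statuses; in real code you would import SUSPENDED_WORKFLOW_RESULT_STATUSES instead. The helper names isSuspended and nextAction are hypothetical.

```typescript
// Local stand-in mirroring the three documented statuses (assumption: the
// exported tuple contains exactly these values).
const SUSPENDED_STATUSES = [
  'suspended_client_tool',
  'suspended_awaiting_children',
  'suspended_step_partial',
] as const;

type SuspendedStatus = (typeof SUSPENDED_STATUSES)[number];

// Type guard: narrows an arbitrary status string to the suspended union.
function isSuspended(status: string): status is SuspendedStatus {
  return (SUSPENDED_STATUSES as readonly string[]).includes(status);
}

// Hypothetical handler decision based on the workflow result status.
function nextAction(status: string): 'wait_for_resume' | 'finalize' {
  return isSuspended(status) ? 'wait_for_resume' : 'finalize';
}

console.log(nextAction('suspended_client_tool')); // wait_for_resume
console.log(nextAction('completed')); // finalize
```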

Workflow Input mode

The AgentWorkflowInput carries a mode: 'fresh' | 'resume' discriminator:

  • 'fresh' (default): brand-new workflow instance; skip the resume drain, proceed to init / execute.
  • 'resume': drain submitted/expired pending client-tool entries via the applyResultsAndReload activity BEFORE iterating, then continue.

The __resume-N workflow ID convention names each successive resume instance ({originalId}__resume-1, {originalId}__resume-2, ...).
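
The naming convention can be sketched as a pair of helpers: one that builds the Nth resume ID and one that recovers the original ID and resume count. Both function names are hypothetical, and the parser assumes the suffix appears exactly once at the end of the ID, as in the examples above.

```typescript
// Builds the ID for the Nth resume of a workflow instance.
function resumeInstanceId(originalId: string, n: number): string {
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError('n must be a positive integer');
  }
  return `${originalId}__resume-${n}`;
}

// Splits an instance ID into its original ID and resume count.
// An ID without the suffix is treated as the original run (count 0).
function parseInstanceId(id: string): { originalId: string; resumeCount: number } {
  const match = /^(.*)__resume-(\d+)$/.exec(id);
  return match
    ? { originalId: match[1], resumeCount: Number(match[2]) }
    : { originalId: id, resumeCount: 0 };
}

const second = resumeInstanceId('agent__research__session-123', 2);
console.log(second); // agent__research__session-123__resume-2
console.log(parseInstanceId(second).resumeCount); // 2
```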

Resume Activities

Two activities (executed via step.do) underpin the protocol:

  • applyResultsAndReload — At the top of a 'resume' workflow, drains any pending client-tool submissions or expired-pending sweeps from durable state, applies them to the staged step, and re-loads SessionState. Returns the next stepId and the updated state.
  • commitSuspendedStep — Persists the suspension boundary atomically: writes suspendedStepId, suspendedAwaitingChildren, pendingClientToolCalls, and the new status into __agents_states.suspension_context in a single db.batch([...]) (D1) or transaction. Workflow then returns the 'suspended_*' result and exits.
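
The drain performed at the top of a 'resume' workflow can be pictured as sorting each pending client-tool call into one of three buckets. The sketch below is a simplified in-memory model of that idea, not the package's implementation: the PendingToolCall and DrainResult shapes, the epoch-millisecond expiresAt, and the drainPending name are all assumptions.

```typescript
// Hypothetical pending entry: one client-executed tool call awaiting a result.
interface PendingToolCall {
  toolCallId: string;
  expiresAt: number; // assumed epoch milliseconds
}

interface DrainResult {
  resolved: { toolCallId: string; result: unknown }[];
  expired: string[];
  stillPending: PendingToolCall[];
}

// Sorts pending calls into resolved (a result was submitted), expired
// (deadline passed without a result), or still pending.
function drainPending(
  pending: PendingToolCall[],
  submitted: Map<string, unknown>,
  nowMs: number
): DrainResult {
  const out: DrainResult = { resolved: [], expired: [], stillPending: [] };
  for (const call of pending) {
    if (submitted.has(call.toolCallId)) {
      out.resolved.push({ toolCallId: call.toolCallId, result: submitted.get(call.toolCallId) });
    } else if (call.expiresAt <= nowMs) {
      out.expired.push(call.toolCallId);
    } else {
      out.stillPending.push(call);
    }
  }
  return out;
}
```

In this model a submitted result wins over expiry, so a result that arrives just before the sweep is not discarded. If stillPending is non-empty after the drain, the run would suspend again rather than continue.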

Approval Gates and Client-Executed Tools

Agents declaring requireApproval: true on tools, or using defineClientExecutedTool, suspend the workflow on first encounter. After POST /chat/{id}/submit-tool-result lands the result in durable state, the executor starts a fresh mode: 'resume' workflow instance which drains the result and continues.

Persistent sub-agents are also fully supported on the Workflows path in v7 (parity with the DO runtime). Companion tools (spawnAgent, sendMessage, listChildren, getChildStatus, waitForResult, terminateChild) are auto-injected; child workflows are spawned as nested workflow instances and their 'suspended_awaiting_children' cascade propagates to the parent.

AgentWorkflowResult Status Enum

typescript
interface AgentWorkflowResult {
  sessionId: string;
  status:
    | 'completed'
    | 'failed'
    | 'aborted'
    | 'interrupted'
    | 'suspended_client_tool' // v7
    | 'suspended_awaiting_children' // v7
    | 'suspended_step_partial'; // v7
  output?: unknown;
  error?: string;
  errorDetail?: { message: string };
  suspended?: AgentWorkflowResultSuspendedField;
}

v7 Chat Handler Routes

When the Workflows runtime is wired through @helix-agents/agent-server, the host server exposes five new chat-handler routes that drive the suspend/resume flow:

| Route | Method | Purpose |
| --- | --- | --- |
| /chat | POST | Unified entry point: dispatch fresh / continue / resume / attach / completed retry. |
| /chat/{sessionId}/stream | GET | Re-attach to in-flight stream; reads resume position from headers. |
| /chat/{sessionId}/submit-tool-result | POST | Durable submit for client-executed and approval-gated tools. |
| /chat/{sessionId}/interrupt | POST | Durable interrupt; writes flag to state, returns 202 immediately. |
| /chat/{sessionId}/abort | POST | Hard abort; writes terminal status. |

See the @helix-agents/agent-server docs and the v6 → v7 migration guide for protocol details.
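
If you front these routes with your own Worker (for logging or auth, say), a small matcher for the five paths can be handy. The sketch below only classifies a method and pathname according to the table above. The ChatRoute type and matchChatRoute function are hypothetical and are not part of @helix-agents/agent-server, which does its own routing.

```typescript
type SessionAction = 'stream' | 'submit-tool-result' | 'interrupt' | 'abort';

type ChatRoute = { kind: 'chat' } | { kind: SessionAction; sessionId: string };

// Classifies a request against the five documented routes.
// Returns null for unknown paths or a mismatched HTTP method.
function matchChatRoute(method: string, pathname: string): ChatRoute | null {
  if (pathname === '/chat') {
    return method === 'POST' ? { kind: 'chat' } : null;
  }
  const match = /^\/chat\/([^/]+)\/(stream|submit-tool-result|interrupt|abort)$/.exec(pathname);
  if (!match) return null;
  const sessionId = match[1];
  const action = match[2] as SessionAction;
  // Only the stream route is a GET; the rest are POSTs.
  const expectedMethod = action === 'stream' ? 'GET' : 'POST';
  return method === expectedMethod ? { kind: action, sessionId } : null;
}

console.log(matchChatRoute('GET', '/chat/session-123/stream'));
// { kind: 'stream', sessionId: 'session-123' }
```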

Workflow Steps

Cloudflare Workflows use steps for durability:

typescript
// Each step is durable - if worker restarts, execution continues
state = await step.do('step-1', async () => {
  // LLM call
  return await callLLM(messages, tools);
});

state = await step.do('step-2', async () => {
  // Tool execution
  return await executeTools(toolCalls);
});

Key points:

  • Steps are atomic and retried on failure
  • State between steps is persisted
  • Worker can restart between steps without data loss
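
One way to build intuition for these points is a toy model in which each step's result is stored under its name, so replaying the workflow body after a restart reuses stored results instead of re-running the work. The sketch below is an in-memory, synchronous illustration of that idea only. ToySteps and runWorkflow are hypothetical names, and the real step.do is asynchronous and persists results durably.

```typescript
// Toy model only: an in-memory stand-in for durable step results.
class ToySteps {
  private results = new Map<string, unknown>();
  public executions = 0; // how many step bodies actually ran

  // Runs fn once per step name; later calls with the same name return
  // the stored result without running fn again.
  do<T>(name: string, fn: () => T): T {
    if (this.results.has(name)) {
      return this.results.get(name) as T;
    }
    this.executions++;
    const value = fn();
    this.results.set(name, value);
    return value;
  }
}

// A two-step workflow body; the second step depends on the first.
function runWorkflow(steps: ToySteps): number {
  const a = steps.do('step-1', () => 2);
  return steps.do('step-2', () => a * 21);
}

const steps = new ToySteps();
console.log(runWorkflow(steps)); // 42
console.log(runWorkflow(steps)); // 42 (replay: both results come from the store)
console.log(steps.executions); // 2
```

The model also shows why step names should be stable and unique within a run: the name is the key under which the result is looked up.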

Abort Handling

typescript
// Set abort flag in D1
await handle.abort('User cancelled');

// In workflow, check abort flag each step
const aborted = await step.do('check-abort', async () => {
  const row = await env.DB.prepare('SELECT aborted FROM __agents_states WHERE session_id = ?')
    .bind(sessionId)
    .first();
  return row?.aborted === 1;
});

if (aborted) {
  // Use the dedicated 'aborted' status from AgentWorkflowResult rather than 'failed'
  return { status: 'aborted', error: 'Aborted' };
}

Interrupt Handling

Unlike abort (which is a hard stop), interrupt is a soft stop that saves state for later resumption:

typescript
// Interrupt the agent (soft stop)
await handle.interrupt('user_requested');

// Agent status becomes 'interrupted'
const state = await handle.getState();
console.log(state.status); // 'interrupted'

// Later, resume execution
const { canResume } = await handle.canResume();
if (canResume) {
  const newHandle = await handle.resume();
  const result = await newHandle.result();
}

How It Works

The Cloudflare runtime uses a dual approach for responsive interrupts:

  1. Interrupt flag - Set in D1 via stateStore.setInterruptFlag() for persistence
  2. Interrupt event - Sent via instance.sendEvent() for immediate wake-up

The workflow checks for interrupts at two points:

  • At each step boundary - Before starting a new LLM call
  • During sub-agent waits - Using Promise.race against interrupt events

Sub-Agent Interrupt Propagation

When an agent has running sub-agents, interrupts propagate through the entire hierarchy:

User calls handle.interrupt()
    │
    ▼
Parent receives interrupt (immediate via event)
    │
    ├──► Child 1: interrupt flag set
    ├──► Child 2: interrupt flag set
    └──► Child 3: interrupt flag set
    │
    ▼
Each child stops at next safe point
    │
    ▼
Parent returns { status: 'interrupted' }

Target latency: < 200ms from interrupt request to stopped execution.

See Interrupt and Resume for complete documentation including resume modes and error handling.

Stream Resumption

The Workflows runtime supports stream resumption for handling client disconnections during agent execution.

Setting Up Stream Resumption

The v7 server-side pattern uses handleChatStream from @helix-agents/ai-sdk (typically wired via the chatHandler field on createAgentServer config when using the agent-server package). The handler reads resume position from Last-Event-ID / X-Resume-From-Sequence / X-Existing-Message-Id headers and serves the SSE response.

typescript
import { handleChatStream } from '@helix-agents/ai-sdk';
import { D1StateStore, DOStreamManager } from '@helix-agents/store-cloudflare';

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);

    if (url.pathname.startsWith('/api/chat/') && url.pathname.endsWith('/stream')) {
      return handleChatStream({
        request,
        executor, // CloudflareAgentExecutor
        stateStore: new D1StateStore({ database: env.DB }),
        streamManager: new DOStreamManager(env.STREAM_MANAGER),
        registry,
      });
    }

    return new Response('Not found', { status: 404 });
  },
};
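
For intuition about what "reads resume position from headers" involves, the sketch below derives a starting sequence number from two of the headers named above. The header names are from this page, but the precedence (X-Resume-From-Sequence first, then Last-Event-ID) and the "resume at the sequence after the last one seen" rule are assumptions made for illustration; handleChatStream may behave differently. HeaderReader is a hypothetical minimal interface that the standard Headers class satisfies.

```typescript
// Minimal read-only view of request headers (Headers satisfies this).
interface HeaderReader {
  get(name: string): string | null;
}

// Parses a non-negative integer header value; anything else yields null.
function parseSequence(value: string | null): number | null {
  if (value === null) return null;
  const trimmed = value.trim();
  return /^\d+$/.test(trimmed) ? Number(trimmed) : null;
}

// Returns the sequence number to resume from, or 0 to start from the beginning.
function resumeFromSequence(headers: HeaderReader): number {
  // Assumed precedence: an explicit resume position wins.
  const explicit = parseSequence(headers.get('X-Resume-From-Sequence'));
  if (explicit !== null) return explicit;
  // Assumed rule: Last-Event-ID is the last chunk seen, so resume after it.
  const lastSeen = parseSequence(headers.get('Last-Event-ID'));
  return lastSeen !== null ? lastSeen + 1 : 0;
}
```

Falling back to 0 on missing or malformed values replays the whole stream, which is safe as long as the client deduplicates by sequence number.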

Frontend Integration

The HelixChatTransport class has been replaced with the AI SDK v6 DefaultChatTransport plus prepareHelixChatRequest / prepareHelixReconnectRequest callbacks:

typescript
import { DefaultChatTransport } from 'ai';
import { prepareHelixChatRequest, prepareHelixReconnectRequest } from '@helix-agents/ai-sdk';
import { useChat } from '@ai-sdk/react';

function Chat({ sessionId, snapshot }) {
  const { messages } = useChat({
    id: sessionId,
    initialMessages: snapshot?.messages ?? [],
    resume: snapshot?.status === 'active',
    transport: new DefaultChatTransport({
      api: `/api/chat/${sessionId}`,
      prepareSendMessagesRequest: prepareHelixChatRequest,
      prepareReconnectToStreamRequest: prepareHelixReconnectRequest,
    }),
  });
}

Limitations

Subrequest Limit Impact

Stream resumption in the Workflows runtime is subject to the 1000-subrequest limit: each stream chunk write to the Durable Object counts as a subrequest. For streaming-heavy agents that reconnect frequently, consider the Durable Objects runtime, which is not bound by this limit.

Key considerations:

  • Chunk storage - Stream chunks are stored in the Durable Object via DOStreamManager
  • Sequence tracking - Each chunk has a sequence number for resumption
  • TTL management - Configure appropriate chunk retention based on session duration

Deployment

Development

bash
wrangler dev

Production

bash
wrangler deploy

Secrets

bash
wrangler secret put OPENAI_API_KEY

Access in code:

typescript
const apiKey = env.OPENAI_API_KEY;

D1 State Store

The Workflows runtime uses D1 for state:

typescript
import { D1StateStore } from '@helix-agents/store-cloudflare';

const stateStore = new D1StateStore({ database: env.DB });

// Save state
await stateStore.saveState(sessionId, state);

// Load state
const state = await stateStore.loadState(sessionId);

// Append messages
await stateStore.appendMessages(sessionId, messages);

Limitations

Workspaces not supported

The Cloudflare Workflows runtime does NOT wire workspace providers. Registering an agent that declares a workspace on this runtime fails fast at run start with an error of the form:

CloudFlare Workflows runtime does not support workspaces (yet — design-reserved).
Agent '<agentType>' declares a workspace.
Use one of: Cloudflare Durable Object runtime via @helix-agents/agent-server.

If your agent needs files, shell, code, or snapshots, run it on the Cloudflare DO runtime via @helix-agents/agent-server + createAgentServer (which DOES wire the workspace — see the workspaces overview for provider options). Workspace support on the Workflows runtime is design-reserved for a future plan.

Subrequest Limit

Important

Cloudflare Workers have a limit of 1000 subrequests per invocation. Each stream chunk write to the Durable Object counts as a subrequest. For streaming-heavy agents, consider the Durable Objects runtime, which bypasses this limit.
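
A quick back-of-the-envelope check can show whether a run is likely to fit within the limit. The sketch below counts the stream writes needed for a given number of chunks and compares the total against the 1000 figure stated above. It assumes that one batched write counts as a single subrequest and that the caller supplies an estimate of all other subrequests. The function names are hypothetical.

```typescript
const SUBREQUEST_LIMIT = 1000; // per invocation, as stated above

// Number of Durable Object writes needed to stream `chunks` chunks when
// `batchSize` chunks are sent per write (assumption: one write = one subrequest).
function streamWrites(chunks: number, batchSize = 1): number {
  if (batchSize < 1) throw new RangeError('batchSize must be >= 1');
  return Math.ceil(chunks / batchSize);
}

// `otherSubrequests` is the caller's estimate of everything else the
// invocation does that counts against the limit.
function fitsBudget(chunks: number, batchSize: number, otherSubrequests: number): boolean {
  return streamWrites(chunks, batchSize) + otherSubrequests <= SUBREQUEST_LIMIT;
}

console.log(streamWrites(2500)); // 2500 writes when unbatched
console.log(fitsBudget(2500, 1, 50)); // false
console.log(fitsBudget(2500, 10, 50)); // true (250 writes + 50 others)
```

Under these assumptions, batching ten chunks per write turns a run that exceeds the limit into one that uses about a third of it.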

Workflow Duration

Cloudflare Workflows have time limits:

  • Individual steps: 15 minutes
  • Total workflow: varies by plan

For longer agents, implement checkpointing.

Cold Starts

Edge workers may have cold starts. Minimize initialization code.

D1 Limitations

  • Single region (replication to read replicas)
  • Write throughput limits
  • No full-text search

Durable Object Limits

  • Single instance per ID
  • Memory limits per instance
  • Geographic pinning

Best Practices

1. Efficient Steps

Group related operations in single steps:

typescript
// Good: One step for LLM + response processing
await step.do('llm-step', async () => {
  const response = await callLLM(...);
  const parsed = processResponse(response);
  await saveToD1(parsed);
  return parsed;
});

// Avoid: Separate steps for each operation (more overhead)

2. Handle Rate Limits

Cloudflare has request limits. Implement backoff:

typescript
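// Resolves after `ms` milliseconds
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function withRetry<T>(fn: () => Promise<T>): Promise<T> {
  const maxAttempts = 3;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // `error` is `unknown` under strict TypeScript, so narrow before reading `.message`
      const isRateLimit = error instanceof Error && error.message.includes('rate limit');
      if (!isRateLimit) throw error;
      // Exponential backoff (1s, 2s); skip the pointless sleep after the final attempt
      if (attempt < maxAttempts - 1) {
        await sleep(Math.pow(2, attempt) * 1000);
      }
    }
  }
  throw new Error('Max retries exceeded');
}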

3. Optimize D1 Queries

Use indexes and prepared statements:

typescript
// Good: Prepared statement with index
const state = await env.DB.prepare('SELECT * FROM __agents_states WHERE session_id = ?')
  .bind(sessionId)
  .first();

// Avoid: String interpolation, full scans

4. Stream Efficiently

Buffer small chunks before sending:

typescript
const buffer: StreamChunk[] = [];
const BATCH_SIZE = 10;

async function flushBuffer() {
  if (buffer.length > 0) {
    await streamManager.writeBatch(streamId, buffer);
    buffer.length = 0;
  }
}

for await (const chunk of stream) {
  buffer.push(chunk);
  if (buffer.length >= BATCH_SIZE) {
    await flushBuffer();
  }
}
await flushBuffer();

Released under the MIT License.