Temporal Runtime

The Temporal runtime (@helix-agents/runtime-temporal) executes agents as durable Temporal workflows. This provides crash recovery, automatic retries, and production-grade reliability for long-running agent tasks.

When to Use

Good fit:

Production workloads requiring reliability
Long-running agents (hours or days)
Agents that must survive process restarts
Complex multi-agent orchestrations
Operations requiring audit trails and observability

Not ideal for:

Quick development iteration (infrastructure overhead)
Simple, short-lived agents
Cost-sensitive deployments without existing Temporal infrastructure

Prerequisites

You need a running Temporal server:

Option 1: Temporal Cloud (Recommended for production)

bash

# Sign up at https://temporal.io/cloud

Option 2: Local development

bash

# Using Docker
docker run -d --name temporal \
  -p 7233:7233 -p 8233:8233 \
  temporalio/auto-setup:latest

# Or using Temporal CLI
temporal server start-dev

Installation

bash

npm install @helix-agents/runtime-temporal @helix-agents/store-redis @temporalio/client @temporalio/worker

Architecture

mermaid

graph TB
    subgraph App ["Your Application"]
        Executor["<b>TemporalAgentExecutor</b><br/>Starts workflows · Returns handles<br/>Reads from StateStore/StreamManager"]
    end

    subgraph Server ["Temporal Server"]
        ServerDesc["Persists workflow state<br/>Manages task queues<br/>Handles retries and timeouts"]
    end

    subgraph Worker ["Temporal Worker"]
        subgraph Workflow ["Agent Workflow"]
            WF["Orchestrates execution<br/>Calls activities for LLM/tools"]
        end
        subgraph Activities ["Activities"]
            Act["LLM calls · Tool execution<br/>State persistence"]
        end
    end

    App --> Server --> Worker

Setup Guide

1. Create the Workflow

Define a workflow that wraps the agent execution:

typescript

// src/workflows/agent-workflow.ts
import { proxyActivities, defineSignal, setHandler } from '@temporalio/workflow';
import type { AgentWorkflowInput, AgentWorkflowResult } from '@helix-agents/runtime-temporal';
import type * as activities from '../activities';

// Proxy activities with timeouts
const { executeAgentStep, saveState, loadState } = proxyActivities<typeof activities>({
  startToCloseTimeout: '5 minutes',
  retry: {
    maximumAttempts: 3,
    backoffCoefficient: 2,
  },
});

// Abort signal
export const abortSignal = defineSignal('abort');

export async function agentWorkflow(input: AgentWorkflowInput): Promise<AgentWorkflowResult> {
  let aborted = false;
  setHandler(abortSignal, () => {
    aborted = true;
  });

  try {
    // Load or initialize state
    let state = await loadState(input.sessionId);
    if (!state) {
      state = await initializeState(input);
    }

    // Main execution loop
    while (state.status === 'running' && !aborted) {
      const stepResult = await executeAgentStep(input.agentType, state);
      state = await processStepResult(state, stepResult);
      await saveState(state);
    }

    if (aborted) {
      return { status: 'failed', error: 'Workflow aborted' };
    }

    return {
      status: state.status === 'completed' ? 'completed' : 'failed',
      output: state.output,
      error: state.error,
    };
  } catch (error) {
    return {
      status: 'failed',
      error: error instanceof Error ? error.message : String(error),
    };
  }
}

2. Create Activities

Activities perform the actual work (LLM calls, tool execution):

typescript

// src/activities/agent-activities.ts
import { AgentRegistry, type AgentState } from '@helix-agents/runtime-temporal';
import { VercelAIAdapter } from '@helix-agents/llm-vercel';
import { RedisStateStore, RedisStreamManager } from '@helix-agents/store-redis';

const stateStore = new RedisStateStore(redisClient);
const streamManager = new RedisStreamManager(redisClient);
const llmAdapter = new VercelAIAdapter();
const registry = new AgentRegistry();

// Register your agents
registry.register(ResearchAgent);
registry.register(AnalyzerAgent);

export async function loadState(sessionId: string): Promise<AgentState<unknown, unknown> | null> {
  return stateStore.loadState(sessionId);
}

export async function saveState(
  sessionId: string,
  state: AgentState<unknown, unknown>
): Promise<void> {
  await stateStore.saveState(sessionId, state);
}

export async function executeAgentStep(
  agentType: string,
  state: AgentState<unknown, unknown>
): Promise<StepResult<unknown>> {
  const agent = registry.get(agentType);
  if (!agent) {
    throw new Error(`Unknown agent type: ${agentType}`);
  }

  // Execute one LLM step
  return executeStep(agent, state, llmAdapter, streamManager);
}

3. Create the Worker

The worker processes workflows and activities:

typescript

// src/worker.ts
import { Worker, NativeConnection } from '@temporalio/worker';
import * as activities from './activities';

async function runWorker() {
  const connection = await NativeConnection.connect({
    address: process.env.TEMPORAL_ADDRESS ?? 'localhost:7233',
  });

  const worker = await Worker.create({
    connection,
    namespace: 'default',
    taskQueue: 'agent-tasks',
    workflowsPath: require.resolve('./workflows'),
    activities,
  });

  await worker.run();
}

runWorker().catch(console.error);

4. Create the Executor

The executor starts workflows and returns handles:

typescript

// src/executor.ts
import { Client, Connection } from '@temporalio/client';
import { TemporalAgentExecutor } from '@helix-agents/runtime-temporal';
import { RedisStateStore, RedisStreamManager } from '@helix-agents/store-redis';

async function createExecutor() {
  const connection = await Connection.connect({
    address: process.env.TEMPORAL_ADDRESS ?? 'localhost:7233',
  });

  const client = new Client({ connection });

  // Wrap Temporal client to match interface
  const temporalClientAdapter = {
    startWorkflow: async (name, options) => {
      const handle = await client.workflow.start(name, {
        workflowId: options.workflowId,
        taskQueue: options.taskQueue,
        args: options.args,
      });
      return wrapHandle(handle);
    },
    getHandle: (workflowId) => {
      return wrapHandle(client.workflow.getHandle(workflowId));
    },
  };

  return new TemporalAgentExecutor({
    client: temporalClientAdapter,
    stateStore: new RedisStateStore(redis),
    streamManager: new RedisStreamManager(redis),
    workflowName: 'agentWorkflow',
    taskQueue: 'agent-tasks',
  });
}

Using the Executor

Once set up, usage is identical to other runtimes:

typescript

const executor = await createExecutor();

// Execute agent with a string message
const handle = await executor.execute(ResearchAgent, 'Research quantum computing');

// Or with multiple messages (for context injection, file attachments, etc.)
// const handle = await executor.execute(ResearchAgent, {
//   message: [
//     { role: 'user', content: 'Focus on recent breakthroughs', metadata: { source: 'system' } },
//     { role: 'user', content: 'Research quantum computing' },
//   ],
// });

// Stream events
const stream = await handle.stream();
for await (const chunk of stream) {
  console.log(chunk);
}

// Get result
const result = await handle.result();

Multi-Turn Conversations

The Temporal runtime supports the same multi-turn conversation API as the JS runtime using the session-centric model:

Using `sessionId`

typescript

// First message - creates a new session
const handle1 = await executor.execute(agent, 'Hello, my name is Alice', {
  sessionId: 'session-123',
});
await handle1.result();

// Continue the conversation - same sessionId
const handle2 = await executor.execute(agent, 'What is my name?', {
  sessionId: 'session-123',
});

Using `handle.send()`

typescript

const handle1 = await executor.execute(agent, 'Hello', {
  sessionId: 'session-123',
});
await handle1.result();

const handle2 = await handle1.send('Tell me more');

Using Direct Messages

typescript

const handle = await executor.execute(agent, {
  message: 'Continue from here',
  messages: myExternalMessageHistory,
});

Multi-Message Input

The message field in all operations accepts either a string or a UserInputMessage[] array. Multi-message input lets you inject context alongside the user's question, attach files, or send multiple messages in a single execution:

typescript

const handle = await executor.execute(
  agent,
  {
    message: [
      {
        role: 'user',
        content: 'Background: user is on the enterprise plan',
        metadata: { source: 'system' },
      },
      { role: 'user', content: 'What features do I have access to?' },
    ],
  },
  { sessionId: 'session-123' }
);

// Messages can include file attachments
const handle = await executor.execute(
  agent,
  {
    message: [
      {
        role: 'user',
        content: 'Analyze this image',
        files: [
          { data: 'data:image/png;base64,...', mediaType: 'image/png', filename: 'chart.png' },
        ],
      },
    ],
  },
  { sessionId: 'session-123' }
);

Multi-message input also works with handle.send() and resume with_message mode:

typescript

const handle2 = await handle1.send([
  { role: 'user', content: 'Additional system context' },
  { role: 'user', content: 'Follow-up question' },
]);

const resumed = await handle.resume({
  mode: 'with_message',
  message: [{ role: 'user', content: 'Here is what I meant instead' }],
});

String and multi-message inputs can be mixed freely across turns in the same session.

Behavior Table

Input	Messages Source	State Source
`message` only (new session)	Empty (fresh)	Empty (fresh)
`message` + `sessionId` (existing)	From session	From session
`message` + `messages`	From `messages`	Empty (fresh)
`message` + `state`	Empty (fresh)	From `state`
`message` + `sessionId` + `messages`	From `messages` (override)	From session
`message` + `sessionId` + `state`	From session	From `state` (override)
All four	From `messages` (override)	From `state` (override)

Note: In the behavior table, message can be either a string or UserInputMessage[]. Both forms work identically with all combinations.

Caller-provided history: prefix: The agent-input history: prefix is honored on fresh sessions — the executor pre-claim no longer drops it, so the seeded history lands in the message log and onMessage fires with the correct messageIndex. On continuation turns (an existing session with a non-empty durable message log) the prefix is dropped — this is now the uniform behavior across all runtimes (history: is fresh-session seeding only). The drop is observable via a temporal.history_dropped_on_continuation warn. The earlier runtime-cloudflare divergence (CF appended the prefix on continuations, doubling the durable log) was resolved by gating every runtime on an empty durable log, per FU-CROSS-RUNTIME-HISTORY-CONTINUATION-DOUBLING.

See JS Runtime - Multi-Turn Conversations for detailed documentation.

Execute vs Resume Semantics

The Temporal runtime provides two methods for running agents with different semantics:

Method	Use Case	Stream Behavior	Session Behavior
`execute()`	Start new conversation or add new message	Creates new stream segment	Creates or continues session
`resume()`	Continue interrupted session	Continues existing stream	Must have interrupted state

execute() is for normal agent interactions:

Starts a new Temporal workflow (or continues an existing session)
Creates a fresh stream segment for the response
Appends the new user message to the session

typescript

// New conversation
const handle = await executor.execute(agent, 'Hello', { sessionId: 'session-1' });

// Follow-up in same session (new run, new stream segment)
const handle2 = await executor.execute(agent, 'Tell me more', { sessionId: 'session-1' });

resume() is specifically for recovering from interrupts or HITL suspensions:

Starts a NEW Temporal workflow instance with the __resume-N suffix (the prior workflow has already exited at the interrupt or HITL boundary)
Maintains stream continuity for clients reconnecting (the new workflow emits a run_resumed chunk so the SSE client can dedupe)
Session must be in interrupted, paused, or HITL-suspended status (pendingClientToolCalls map non-empty)

typescript

// After an interrupt...
const state = await stateStore.loadState('session-1');
if (state?.status === 'interrupted') {
  // Resume starts a new workflow that continues the same stream
  const handle = await executor.resume(agent, 'session-1');
}

Resume conventions (post-A.2)

The Temporal runtime exits the workflow on every HITL boundary (client-executed tool, approval gate, sub-agent suspension). executor.resume(sessionId) starts a NEW workflow instance with a distinctive workflow ID:

${workflowIdPrefix}__${agentType}__${sessionId}__resume-${N}

Single-dash separator (__resume-N, not __resume__N) — distinct from retry()'s __retry__N convention.
N increments via stateStore.incrementResumeCount() (atomic).
Uses WorkflowIdReusePolicy.ALLOW_DUPLICATE so subsequent resumes don't conflict with completed prior workflows.

The new workflow's mode='resume' branch calls activities.applyResultsAndReload, which:

Drains submitted client-tool results into the message log.
Fires onMessage + afterTool hooks for each drained entry.
Synthesizes client_tool_timeout results for expired deadlines.
Drains completed sub-agent children via recordSubSessionResult.
Returns { state, drainedCount, hasMoreSuspended, childrenToRespawn } so the workflow body can decide whether to continue or re-suspend.

AgentResult.status for HITL agents resolves with one of:

'suspended_client_tool' — at least one client-executed tool is awaiting submission.
'suspended_awaiting_children' — at least one sub-agent child is in 'suspended_*' and the parent is awaiting it.
'suspended_step_partial' — mixed phase-1 step where some tools completed and others suspended.

result.suspended carries the routing info ({ toolCallIds?, children?, stepId? }) needed to drive the next submitToolResult + resume cycle.

submitToolResult({ sessionId, toolCallId, result | error }) is a durable write only — no Temporal signal is sent. The workflow has already exited at the HITL boundary; consumers drive the next loop iteration via executor.resume() (or via the framework's handleChatStream plumbing on the server side).

Agent Registry

typescript

import { AgentRegistry } from '@helix-agents/runtime-temporal';

const registry = new AgentRegistry();

// Register each agent type
registry.register(ResearchAgent); // name: 'researcher'
registry.register(AnalyzerAgent); // name: 'analyzer'
registry.register(SummarizerAgent); // name: 'summarizer'

// In activities, look up by type
export async function executeAgentStep(agentType: string, state) {
  const agent = registry.get(agentType); // Returns the agent config
  // ...
}

AgentRegistry with Factories

The AgentRegistry supports both static registration and factory functions for dynamic agent creation. Factories are useful when agents need runtime dependencies like API keys, database connections, or per-request configuration.

Static vs Factory Registration

typescript

import { AgentRegistry, AgentNotFoundError } from '@helix-agents/runtime-temporal';

const registry = new AgentRegistry();

// Static registration - for agents with no runtime dependencies
registry.register(ResearchAgent);
registry.register(SummarizerAgent);

// Factory registration - for agents needing runtime dependencies
interface Env {
  TAVILY_API_KEY: string;
  DATABASE: Database;
}

registry.registerFactory<Env>('dynamic-researcher', (ctx) => {
  return createResearchAgent({
    apiKey: ctx.env.TAVILY_API_KEY,
    database: ctx.env.DATABASE,
    userId: ctx.userId,
    sessionId: ctx.sessionId,
  });
});

AgentFactoryContext

Factory functions receive context about the current execution:

typescript

interface AgentFactoryContext<TEnv = unknown> {
  /** Environment bindings (e.g., API keys, connections) */
  env: TEnv;
  /** Current session identifier */
  sessionId: string;
  /** Current run identifier */
  runId: string;
  /** Optional user identifier */
  userId?: string;
}

Using resolve() in Activities

The resolve() method handles both static and factory agents. Use it instead of get() when you have factories:

typescript

// src/activities.ts
export async function executeAgentStep(
  agentType: string,
  state: AgentState<unknown, unknown>,
  context: { sessionId: string; runId: string; userId?: string }
): Promise<StepResult<unknown>> {
  // resolve() handles both static and factory agents
  const agent = registry.resolve(agentType, {
    env, // Your environment object
    sessionId: context.sessionId,
    runId: context.runId,
    userId: context.userId,
  });

  return executeStep(agent, state, llmAdapter, streamManager);
}

Error Handling with AgentNotFoundError

When an agent type is not found, AgentNotFoundError provides helpful information:

typescript

import { AgentNotFoundError } from '@helix-agents/runtime-temporal';

try {
  const agent = registry.resolve('unknown-agent', context);
} catch (error) {
  if (error instanceof AgentNotFoundError) {
    console.error(`Unknown agent type: ${error.agentType}`);
    console.error(`Available types: ${error.availableTypes.join(', ')}`);
  }
}

Registry Methods

Method	Description
`register(config)`	Register a static agent configuration
`registerFactory(type, factory)`	Register a factory function
`replace(config)`	Overwrite an existing static registration
`resolve(type, context)`	Resolve agent (factory-first, then static)
`get(type)`	Get static agent only (legacy)
`has(type)`	Check if agent type is registered
`unregister(type)`	Remove an agent registration
`getRegisteredTypes()`	Get all registered agent type names
`clear()`	Remove all registrations (for testing)

`registry.replace(config)`

Overwrites an existing static agent registration with a new config, returning whether a prior registration existed. Throws if a factory is registered under the same name.

typescript

const registry = new AgentRegistry();
registry.register(myAgent);

// Later: swap the registered reference (e.g., to attach hooks)
const agentWithHooks = { ...myAgent, hooks: makeHooks() };
const hadPrior = registry.replace(agentWithHooks);
// hadPrior === true

When to use: The Temporal runtime resolves agents BY NAME from the registry — workflow inputs only carry the agent name, not the agent reference. Per-call hooks attached to the agent passed into executor.execute(agent, ...) aren't honored unless the registered reference itself carries them. replace() is the canonical way to swap the reference (most commonly in tests that exercise per-test hooks). JS / DBOS runtimes read the agent reference inline at execute() time and don't need this API.

Don't replace factories. If a factory is registered under the same name, replace() throws. Use unregister() followed by registerFactory() to replace a factory.

Sub-Agent Handling

Sub-agents execute as child workflows:

typescript

// In workflow
import { executeChild } from '@temporalio/workflow';

// When parent needs to execute sub-agent
const subAgentResult = await executeChild('agentWorkflow', {
  args: [
    {
      agentType: 'analyzer',
      sessionId: `${parentSessionId}-sub-${callId}`,
      streamId: parentStreamId, // Same stream for unified streaming
      message: inputMessage,
      parentSessionId,
    },
  ],
  workflowId: `agent__analyzer__${subSessionId}`,
  taskQueue: 'agent-tasks',
});

Benefits of child workflows:

Independent retry policies
Separate timeouts
Can be cancelled independently
Full workflow history preserved

Remote Sub-Agent Handling

Remote sub-agents are executed via a dedicated executeRemoteSubAgentCall activity, separate from both regular tool calls and local sub-agent child workflows.

The activity provides:

Deterministic session IDs — {parentSessionId}-remote-{toolCallId} for idempotent restarts
Crash recovery — On activity retry, checks transport.getStatus() to avoid re-executing completed agents
Heartbeat-based reconnection — Heartbeats store lastSequence so retries can resume the SSE stream mid-flight
Stream proxying — Remote agent chunks flow through the parent's stream
SubSessionRef tracking — Registers with remote: { streamId, lastSequence } metadata
Interrupt propagation — Racing pattern propagates interrupts to abort the remote call

See Remote Agents Guide for setup and configuration.

Persistent Sub-Agent Handling

Persistent sub-agents in the Temporal runtime execute as child workflows managed through companion tools:

Blocking spawn: Uses executeChild() to run the child as a Temporal child workflow. The parent workflow waits for the child to complete. After completion, the workflow calls markPersistentChildStatus to update the SubSessionRef.
Non-blocking spawn: Starts a child workflow with parentClosePolicy: 'ABANDON'. The parent continues immediately.
Companion tool execution: All companion tool calls route through a dedicated executeCompanionToolCall activity, separate from regular tool execution.

Unlike the JS runtime, the Temporal runtime does not store companion tool results as separate ToolResultMessage entries in the state store. Results flow through the workflow execution context.

Persistent children are tracked via SubSessionRef entries with mode: 'persistent' and are cleaned up when the parent workflow completes.

Re-spawning a completed persistent child continues it on its preserved session (memory retained) rather than recreating it — see Re-consulting a persistent companion (the critic loop).

Activity Configuration

Configure timeouts and retries per activity:

typescript

const { executeAgentStep } = proxyActivities<typeof activities>({
  // How long the activity can run
  startToCloseTimeout: '10 minutes',

  // How long to wait for worker to start processing
  scheduleToStartTimeout: '1 minute',

  // Heartbeat timeout for long activities
  heartbeatTimeout: '30 seconds',

  // Retry configuration
  retry: {
    initialInterval: '1 second',
    backoffCoefficient: 2,
    maximumInterval: '1 minute',
    maximumAttempts: 5,
    nonRetryableErrorTypes: ['InvalidAgentError'],
  },
});

Error Classification

When LLM calls fail during Temporal activities, errors are classified using the same HelixError system as other runtimes. The key difference is that error classification must cross the Temporal serialization boundary:

Activity -- executeAgentStep detects HelixError and extracts errorCode into the step result DTO
Workflow -- Passes errorCode to failAgentStream when the agent fails
Stream -- failAgentStream writes an ErrorChunk with the code field set

typescript

// Step result includes errorCode for Temporal serialization
interface AgentStepResult {
  error?: string;
  errorCode?: string; // e.g., 'provider_overloaded'
  // ... other fields
}

// Failure input includes errorCode for stream error chunks
interface FailAgentStreamInput {
  sessionId: string;
  error: string;
  errorCode?: string; // Propagated from step result
}

See Error Handling Guide for the complete error pipeline.

Crash Recovery

Temporal provides automatic crash recovery:

mermaid

graph TB
    W1["Worker 1 starts workflow"]
    W1 --> S1["Step 1 completes, state saved"]
    S1 --> S2["Step 2 completes, state saved"]
    S2 --> Crash["Worker 1 crashes"]

    Crash --> Detect["Temporal detects failure"]
    Detect --> Resched["Workflow task rescheduled"]

    Resched --> W2["Worker 2 picks up"]
    W2 --> Replay["Replays history (deterministic)"]
    Replay --> Continue["Continues from Step 3"]
    Continue --> Complete["Completes normally"]

Key points:

Workflow code must be deterministic
State is reconstructed from event history
Activities are not re-executed (results cached)

Interrupt and Resume

The Temporal runtime supports soft interruption with the ability to resume later:

typescript

// Interrupt the agent (soft stop)
await handle.interrupt('user_requested');

// Agent status becomes 'interrupted'
const state = await handle.getState();
console.log(state.status); // 'interrupted'

// Later, resume execution
const { canResume } = await handle.canResume();
if (canResume) {
  const newHandle = await handle.resume();
  const result = await newHandle.result();
}

Workflow Setup (v7)

The v7 stateless workflow exits at every HITL boundary and the executor spawns a fresh __resume-N workflow on each resume. Interrupt is durable — the INTERRUPT_SIGNAL_NAME handler writes a flag the running workflow observes at its next checkpoint. Register agentWorkflow directly, or re-export it under a stable name your TemporalAgentExecutor.workflowName expects:

typescript

import { agentWorkflow as v7AgentWorkflow } from '@helix-agents/runtime-temporal/workflow';
import type { AgentWorkflowInput, AgentWorkflowResult } from '@helix-agents/runtime-temporal';

// Re-export under your workflow name. The imported `agentWorkflow` sets up
// `proxyActivities<GenericActivities>`, the INTERRUPT_SIGNAL_NAME handler,
// and child-workflow dispatch internally. The wrapper exists only to
// register the function under a stable name.
export async function agentWorkflow(input: AgentWorkflowInput): Promise<AgentWorkflowResult> {
  return v7AgentWorkflow(input);
}

Removed v6 signals

v7's stateless suspension model removed four Temporal signals: submitToolResult, childSuspended, childWoke, runResumed. Code that called handle.signal('submitToolResult', ...) or used the RUN_RESUMED_SIGNAL_NAME constant must migrate to the v7 stateless contract — submission writes to durable state, and executor.resume() spawns a fresh __resume-N workflow that drains submissions via the applyResultsAndReload activity. The two-arg runAgentWorkflow(input, activities, options) signature is also gone; the v7 export is agentWorkflow(input). See the v6 → v7 upgrade guide. The INTERRUPT_SIGNAL_NAME constant is preserved for cross-process interrupt back-compat.


### Sub-Agent Interrupt Propagation

When an agent has running child workflows, interrupts propagate through the entire hierarchy:

1. **Parent receives interrupt** - Signal handler resolves the `interruptTrigger`
2. **Children are signaled** - Parent uses `getExternalWorkflowHandle` to signal each child
3. **Children stop gracefully** - Each child detects the interrupt and stops at its next safe point
4. **Parent completes** - Returns with status `interrupted`

**Target latency:** < 500ms from interrupt request to stopped execution.

See [Interrupt and Resume](/guide/interrupt-resume) for complete documentation including resume modes and error handling.

## Stream Resumption

The Temporal runtime supports stream resumption for handling client disconnections and page refreshes during agent execution. This uses the same `FrontendHandler` pattern as the JS runtime.

### Setting Up Stream Resumption

```typescript
import { createFrontendHandler } from '@helix-agents/ai-sdk';
import { RedisStreamManager, RedisStateStore } from '@helix-agents/store-redis';

const handler = createFrontendHandler({
  executor, // TemporalAgentExecutor
  stateStore, // RedisStateStore for snapshots
  streamManager, // RedisStreamManager for stream chunks
  agent: myAgent,
});

Frontend Integration

Use HelixChatTransport with resumeFromSequence to handle reconnections:

typescript

import { HelixChatTransport } from '@helix-agents/ai-sdk/client';
import { useChat } from '@ai-sdk/react';

function Chat({ sessionId, snapshot }) {
  const { messages } = useChat({
    id: sessionId,
    initialMessages: snapshot?.messages ?? [],
    resume: snapshot?.status === 'active',
    transport: new HelixChatTransport({
      api: `/api/chat/${sessionId}`,
      resumeFromSequence: snapshot?.streamSequence,
    }),
  });
}

Content Replay Flow

When a client reconnects mid-stream:

Load snapshot - Get messages and streamSequence from state store
Resume stream - Connect to stream manager with fromSequence
Content replay - Partial content is replayed as stream events
Live streaming - Continue receiving new chunks

The FrontendHandler.getSnapshot() method automatically handles content replay when fromSequence > 0:

typescript

// In your API route (e.g., /api/chat/[sessionId]/route.ts)
export async function GET(request: Request, { params }: { params: { sessionId: string } }) {
  const { sessionId } = params;
  const fromSequence = parseInt(request.headers.get('Last-Event-ID') ?? '0');

  // FrontendHandler builds replay content automatically
  return handler.handleStream(sessionId, { fromSequence });
}

Key Considerations

Redis required - Stream resumption relies on RedisStreamManager storing chunks with sequence numbers
Chunk retention - Configure appropriate TTL for stream chunks based on expected session duration
Multi-worker - Works across multiple workers since stream state is in Redis

Concurrency Protection

The Temporal runtime uses workflow ID uniqueness as its primary concurrency protection:

Each session maps to a unique workflow ID
Temporal server prevents duplicate workflow IDs
Concurrent execute() calls result in workflow collision errors

retry() Method

Retry failed agents in Temporal:

typescript

const result = await handle.result();
if (result.status === 'failed') {
  const retryHandle = await executor.retry(agent, sessionId);
  // or
  const retryHandle = await handle.retry();
}

The retry creates a new workflow execution while preserving session state.

Method Comparison

Method	Purpose	Stream Behavior	Valid From Status
`execute()`	New/continue conversation	Resets stream	Any except running
`resume()`	Continue after interrupt	Preserves stream	interrupted, paused
`retry()`	Recover from failure	Resets to checkpoint	failed

Determinism Requirements

Workflow code must be deterministic:

typescript

// BAD - Non-deterministic
export async function agentWorkflow(input) {
  const timestamp = Date.now(); // Different on replay!
  const random = Math.random(); // Different on replay!
  const uuid = crypto.randomUUID(); // Different on replay!
}

// GOOD - Use Temporal APIs
import { sleep, uuid4, workflowInfo } from '@temporalio/workflow';

export async function agentWorkflow(input) {
  const info = workflowInfo();
  const timestamp = info.startTime; // Deterministic
  const id = uuid4(); // Deterministic (seeded)
  await sleep('5 seconds'); // Deterministic timer
}

Move non-deterministic operations to activities:

LLM API calls
Database queries
External API calls
Random number generation

Observability

Temporal Web UI

Access at http://localhost:8233 (local) or via Temporal Cloud.

View:

Workflow history and events
Activity execution details
Pending/failed workflows
Search by workflow ID or type

Workflow Queries

Query running workflows:

typescript

// In workflow
import { defineQuery, setHandler } from '@temporalio/workflow';

export const getProgressQuery = defineQuery<{ stepCount: number; status: string }>('getProgress');

export async function agentWorkflow(input) {
  let progress = { stepCount: 0, status: 'running' };

  setHandler(getProgressQuery, () => progress);

  // Update progress during execution
  progress.stepCount++;
  // ...
}

// From client
const handle = client.workflow.getHandle(workflowId);
const progress = await handle.query(getProgressQuery);

Production Deployment

Worker Scaling

Run multiple workers for throughput:

bash

# Scale horizontally
docker-compose scale worker=5

Workers pull from the same task queue - Temporal handles distribution.

Temporal Cloud

For production, use Temporal Cloud:

typescript

import { Connection, Client } from '@temporalio/client';

const connection = await Connection.connect({
  address: 'your-namespace.tmprl.cloud:7233',
  tls: {
    clientCertPair: {
      crt: fs.readFileSync('client.pem'),
      key: fs.readFileSync('client.key'),
    },
  },
});

Monitoring

Set up metrics:

typescript

import { Runtime } from '@temporalio/worker';

Runtime.install({
  telemetryOptions: {
    metrics: {
      prometheus: { bindAddress: '0.0.0.0:9464' },
    },
  },
});

Limitations

Workspaces not supported

The Temporal runtime does NOT wire workspace providers. execute() / resume() / retry() of an agent that declares a workspace fail fast at run start with an error of the form:

Temporal runtime does not support workspaces (yet — design-reserved).
Agent '<agentType>' declares a workspace.
Use one of: JS runtime, Cloudflare Durable Object runtime via @helix-agents/agent-server.

The check (assertRuntimeSupportsWorkspaces, shared with the Cloudflare Workflows runtime) runs synchronously at the top of each entry point, before any state load or workflow start, so no side effects occur when it rejects. describeCapabilities() returns an empty workspaceProviderKinds array so AgentServer's executor-side check catches a workspace-declaring agent at construction time too.

If your agent needs files, shell, code, or snapshots, run it on the JS runtime or the Cloudflare DO runtime via @helix-agents/agent-server (both DO wire workspaces — see the workspaces overview for provider options). Workspace support on Temporal is design-reserved for a future plan.

Memory auto-injection/extraction not supported

The Temporal runtime does not wire any MemoryManager logic. The memory: config field on an agent definition is accepted at the type level but is a silent no-op — no memory injection into prompts and no extraction after tool calls. Agents that need live memory recall or storage must run on the JS or Cloudflare runtimes.

LLMConfig.cache strategies are applied on the Temporal runtime (unlike DBOS).

Higher Latency

Each activity invocation adds network overhead. Batch operations when possible.

Determinism Constraints

Workflow code restrictions can be challenging. Move all I/O to activities.

Infrastructure Overhead

Requires running Temporal server and workers alongside your application.

Learning Curve

Temporal concepts (workflows, activities, replay, determinism) require understanding.

Best Practices

1. Keep Workflows Thin

Workflow code should only orchestrate - move logic to activities:

typescript

// Workflow just coordinates
export async function agentWorkflow(input) {
  const state = await loadState(input.sessionId);
  const result = await executeStep(state); // Activity does the work
  await saveState(state);
}

2. Appropriate Timeouts

Set realistic timeouts:

typescript

const { executeAgentStep } = proxyActivities<typeof activities>({
  startToCloseTimeout: '5 minutes', // LLM calls can be slow
});

3. Heartbeat Long Activities

For activities > 30 seconds, implement heartbeating:

typescript

export async function executeLongTool(input: ToolInput): Promise<ToolResult> {
  for (const item of items) {
    await processItem(item);
    Context.current().heartbeat(); // Report progress
  }
  return result;
}

4. Use Continue-As-New for Long Histories

Workflows with many events should reset:

typescript

import { continueAsNew, workflowInfo } from '@temporalio/workflow';

export async function agentWorkflow(input) {
  const info = workflowInfo();

  // After many steps, continue as new to reset history
  if (info.historyLength > 10000) {
    await continueAsNew<typeof agentWorkflow>(input);
  }
  // ...
}

Next Steps

Interrupt and Resume - Complete interrupt/resume documentation
JavaScript Runtime - Simpler option for development
Cloudflare Runtime - Edge deployment alternative
Storage: Redis - Recommended store for Temporal

Temporal Runtime ​

When to Use ​

Prerequisites ​

Installation ​

Architecture ​

Setup Guide ​

1. Create the Workflow ​

2. Create Activities ​

3. Create the Worker ​

4. Create the Executor ​

Using the Executor ​

Multi-Turn Conversations ​

Using sessionId ​

Using handle.send() ​

Using Direct Messages ​

Multi-Message Input ​

Behavior Table ​

Execute vs Resume Semantics ​

Resume conventions (post-A.2) ​

Agent Registry ​

AgentRegistry with Factories ​

Static vs Factory Registration ​

AgentFactoryContext ​

Using resolve() in Activities ​

Error Handling with AgentNotFoundError ​

Registry Methods ​

registry.replace(config) ​

Sub-Agent Handling ​

Remote Sub-Agent Handling ​

Persistent Sub-Agent Handling ​

Activity Configuration ​

Error Classification ​

Crash Recovery ​

Interrupt and Resume ​

Workflow Setup (v7) ​

Frontend Integration ​

Content Replay Flow ​

Key Considerations ​

Concurrency Protection ​

retry() Method ​

Method Comparison ​

Determinism Requirements ​

Observability ​

Temporal Web UI ​

Workflow Queries ​

Production Deployment ​

Worker Scaling ​

Temporal Cloud ​

Monitoring ​

Limitations ​

Workspaces not supported ​

Memory auto-injection/extraction not supported ​

Higher Latency ​

Determinism Constraints ​

Infrastructure Overhead ​

Learning Curve ​

Best Practices ​

1. Keep Workflows Thin ​

2. Appropriate Timeouts ​

3. Heartbeat Long Activities ​

4. Use Continue-As-New for Long Histories ​

Next Steps ​

Temporal Runtime

When to Use

Prerequisites

Installation

Architecture

Setup Guide

1. Create the Workflow

2. Create Activities

3. Create the Worker

4. Create the Executor

Using the Executor

Multi-Turn Conversations

Using `sessionId`

Using `handle.send()`

Using Direct Messages

Multi-Message Input

Behavior Table

Execute vs Resume Semantics

Resume conventions (post-A.2)

Agent Registry

AgentRegistry with Factories

Static vs Factory Registration

AgentFactoryContext

Using resolve() in Activities

Error Handling with AgentNotFoundError

Registry Methods

`registry.replace(config)`

Sub-Agent Handling

Remote Sub-Agent Handling

Persistent Sub-Agent Handling

Activity Configuration

Error Classification

Crash Recovery

Interrupt and Resume

Workflow Setup (v7)

Frontend Integration

Content Replay Flow

Key Considerations

Concurrency Protection

retry() Method

Method Comparison

Determinism Requirements

Observability

Temporal Web UI

Workflow Queries

Production Deployment

Worker Scaling

Temporal Cloud

Monitoring

Limitations

Workspaces not supported

Memory auto-injection/extraction not supported

Higher Latency

Determinism Constraints

Infrastructure Overhead

Learning Curve

Best Practices

1. Keep Workflows Thin

2. Appropriate Timeouts

3. Heartbeat Long Activities

4. Use Continue-As-New for Long Histories

Next Steps