Session Model
This document is the canonical reference for Helix Agents' session-centric storage model. The framework uses sessionId as the primary key for all state operations.
For framework-level concepts (Run, Agent, Tool, Sub-Agents, etc.) see ./concepts.md. For the actual step-by-step execution loop see ./execution-flow.md.
Session vs Run
- Session: A conversation container, identified by sessionId. Contains all messages, custom state, and checkpoints.
- Run: A single execution within a session. When a session is interrupted, suspended (HITL), or resumed, a new run starts but continues the same session. Each run captures startSequence from the stream to enable run-scoped chunk filtering, which prevents content duplication when refreshing mid-stream in multi-run sessions.

As of v7, RunStatus includes three suspension variants ('suspended_client_tool', 'suspended_awaiting_children', 'suspended_step_partial') written by runtimes that suspend at HITL boundaries. These mirror the RunOutcome.kind discriminators surfaced via AgentResult.status. See ./concepts.md for the full HITL model.

On Temporal and Cloudflare Workflows, distinct runs within a single session are tagged with the __resume-N workflow-id suffix convention (${prefix}__${agentType}__${sessionId}__resume-${N}, single-dash; spec §5). The counter lives on SessionState.resumeCount and is incremented atomically via incrementResumeCount. See ./concepts.md §Client-Executed Tools for per-runtime resume mechanics.
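The suffix convention above can be sketched as a small formatting helper. This is an illustration only: the helper name buildResumeWorkflowId, the ResumeIdParts shape, and the sample values ('helix', 'researcher') are assumptions and are not part of the framework.

```typescript
// Hypothetical helper; buildResumeWorkflowId is not a framework export.
interface ResumeIdParts {
  prefix: string;
  agentType: string;
  sessionId: string;
  resumeCount: number; // the N in __resume-N, taken from SessionState.resumeCount
}

function buildResumeWorkflowId(parts: ResumeIdParts): string {
  const { prefix, agentType, sessionId, resumeCount } = parts;
  // Segments are joined with double underscores; the suffix uses a single dash.
  return `${prefix}__${agentType}__${sessionId}__resume-${resumeCount}`;
}

// Sample values are made up for the example.
const resumeWorkflowId = buildResumeWorkflowId({
  prefix: 'helix',
  agentType: 'researcher',
  sessionId: 'session-1',
  resumeCount: 2,
});
```

Because the counter is incremented atomically, two resumes of the same session should never format the same workflow ID.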
Key Benefits
- Efficient Message Storage: Messages are stored once per session, not duplicated per run. This is O(n) storage vs O(n²) for run-centric models.
- Natural Conversation Continuity: Reusing the same sessionId automatically continues the conversation with full history.
- Clean Sub-Agent Isolation: Each sub-agent gets its own sessionId, preventing state conflicts.
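The storage claim in the first benefit can be checked with a little arithmetic. The sketch below assumes a run-centric model in which every run keeps its own copy of the history so far; that assumption, and both function names, are illustrative rather than a description of any specific system.

```typescript
// Messages stored when history lives once per session.
function sessionCentricStored(runs: number, messagesPerRun: number): number {
  return runs * messagesPerRun;
}

// Messages stored when each run copies the full history up to and including
// its own messages (assumed behaviour of a run-centric model).
function runCentricStored(runs: number, messagesPerRun: number): number {
  let total = 0;
  for (let run = 1; run <= runs; run++) {
    total += run * messagesPerRun; // run k holds k * messagesPerRun messages
  }
  return total;
}

// 10 runs of 2 messages each.
const sessionTotal = sessionCentricStored(10, 2); // 20
const runTotal = runCentricStored(10, 2); // 2 * (1 + 2 + ... + 10) = 110
```

The first total grows linearly with the number of runs, while the second grows with the sum 1 + 2 + ... + k, which is quadratic.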
Usage
// Start a new session (sessionId is required)
const sessionId = `session-${Date.now()}`;
const handle = await executor.execute(agent, { message: 'Hello' }, { sessionId });

// Continue the same session (pass the same sessionId)
const handle2 = await executor.execute(
  agent,
  { message: 'Follow up' },
  {
    sessionId: handle.sessionId,
  }
);

// Branch from a checkpoint (creates a new session from existing state)
const newSessionId = `session-${Date.now()}`;
const handle3 = await executor.execute(
  agent,
  { message: 'What if...' },
  {
    sessionId: newSessionId,
    branch: { fromSessionId: handle.sessionId, checkpointId: 'cp_123' },
  }
);

v7 SessionState Shape
The full SessionState<TState, TOutput> interface lives at packages/core/src/types/session.ts:86-295. v7 added a number of suspension- and concurrency-related fields. The canonical shape is:
interface SessionState<TState, TOutput> {
  // Identity
  sessionId: string;
  agentType: string;
  streamId?: string;

  // Custom application state + status
  customState: TState;
  status: SessionStatus; // 'active' | 'completed' | 'failed' | 'interrupted' | 'paused'
  stepCount: number;
  output?: TOutput;
  error?: string;

  // v7: γ-cascade discriminator. Currently 'parent_suspended' marks a
  // child that was failed because its parent suspended; the cascade in
  // applyResultsAndReload re-spawns these on parent resume.
  failureReason?: string;

  // Interrupt context (set when status === 'interrupted' or 'paused')
  interruptContext?: InterruptContext;

  // v7 HITL suspension state
  pendingClientToolCalls?: Record<string, PendingClientToolCall>;
  suspendedAwaitingChildren?: Record<string, SuspendedChildWait>;
  suspendedStepId?: string;
  completedClientToolCalls?: Record<string, number>; // root-only
  clientToolCallOwnership?: ClientToolCallOwnership; // root-only

  // v7 tracing continuity (sessionId-seeded)
  tracingContext?: { traceId: string; rootSpanId: string };

  // v7 session GC + cross-session links
  expiresAt?: number;
  parentSessionId?: string;
  rootSessionId?: string;

  // v7 DBOS write-once mode binding
  mode?: 'standard' | 'persistent';

  // v7 distributed coordination
  version: number; // monotonic; incremented on every modification
  resumeCount: number; // counter for unique resume workflow IDs

  // Checkpoint tracking
  checkpointId?: string;
  checkpointedAt?: number;
  checkpointSource?: 'staging' | 'save';

  // User context
  userId?: string;
  tags?: string[];
  metadata?: Record<string, string>;

  // v7 persisted workspaces (so refs survive interrupt/resume)
  workspaceRefs?: Record<string, WorkspaceRef>;

  // Timestamps
  createdAt: number;
  updatedAt: number;
}

v7-NEW field summary
| Field | Purpose |
|---|---|
| failureReason | γ-cascade discriminator (e.g. 'parent_suspended'); used by applyResultsAndReload to decide re-spawn vs. drain. |
| pendingClientToolCalls | Map of toolCallId → pending entry; canonical signal for "awaiting client submission". |
| suspendedAwaitingChildren | Map of parentToolCallId → child wait info; populated when parent paused awaiting sub-agents. |
| suspendedStepId | Mid-step suspension marker for mixed server+client tool batches. |
| completedClientToolCalls | Root-only timestamp map; makes 'already_completed' durable across runtime restarts. |
| clientToolCallOwnership | Root-only toolCallId → owningSessionId; routes submissions to the owning sub-agent. |
| tracingContext | sessionId-seeded traceId + rootSpanId; one trace per session across runs. |
| expiresAt | Operator GC hint for abandoned sessions. |
| mode | Write-once 'standard' / 'persistent' binding (DBOS-enforced). |
| version / resumeCount | Optimistic concurrency + unique resume workflow IDs. |
| workspaceRefs | Persisted workspace refs (so they survive interrupt/resume cycles). |
| parentSessionId / rootSessionId | Sub-agent cross-session linkage; rootSessionId enables O(1) ownership writes. |
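As a rough illustration of how the two root-only maps in the table might be consulted when a client submits a tool result, here is a minimal sketch. The RootSlice shape is a trimmed stand-in for SessionState, clientToolCallOwnership is assumed to be a plain toolCallId → sessionId record, and routeSubmission and the Route union are hypothetical names, not framework APIs.

```typescript
// Trimmed, assumed shapes; the real types live in packages/core/src/types/session.ts.
interface RootSlice {
  sessionId: string;
  completedClientToolCalls?: Record<string, number>; // toolCallId -> timestamp
  clientToolCallOwnership?: Record<string, string>; // toolCallId -> owningSessionId (assumed)
}

type Route =
  | { kind: 'deliver'; owningSessionId: string }
  | { kind: 'already_completed'; completedAt: number }
  | { kind: 'unknown_tool_call' };

// Hypothetical routing helper: check the durable "completed" map first,
// then look up which session owns the pending call.
function routeSubmission(root: RootSlice, toolCallId: string): Route {
  const completedAt = root.completedClientToolCalls?.[toolCallId];
  if (completedAt !== undefined) {
    return { kind: 'already_completed', completedAt };
  }
  const owner = root.clientToolCallOwnership?.[toolCallId];
  if (owner === undefined) {
    return { kind: 'unknown_tool_call' };
  }
  return { kind: 'deliver', owningSessionId: owner };
}

// Sample root session with one finished call and one call owned by a sub-agent.
const rootSlice: RootSlice = {
  sessionId: 'root-1',
  completedClientToolCalls: { 'call-a': 1700000000000 },
  clientToolCallOwnership: { 'call-b': 'sub-agent-7' },
};
const routeA = routeSubmission(rootSlice, 'call-a');
const routeB = routeSubmission(rootSlice, 'call-b');
const routeC = routeSubmission(rootSlice, 'call-c');
```

Keeping both maps on the root session means a submission can be resolved with a single lookup there, whichever sub-agent issued the call.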
State Store Interface
All state stores implement SessionStateStore (defined at packages/core/src/store/state-store.ts). v7 introduces several new atomic primitives that runtime code now depends on heavily.
Lifecycle
- createSession(sessionId, options): Atomically creates a session (throws if it already exists). All implementations guarantee exactly-one-wins semantics for concurrent calls with the same sessionId.
- sessionExists(sessionId) / deleteSession(sessionId) / cloneSession(...): Standard lifecycle helpers.
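The exactly-one-wins guarantee can be sketched with a synchronous in-memory registry. This is not the framework's memory store: the class name, the SessionAlreadyExistsError type and its code field, and the tryCreate helper are all assumptions made for the example, and the real methods take richer options than shown here.

```typescript
// Hypothetical error type; the document does not show the framework's own error class.
class SessionAlreadyExistsError extends Error {
  readonly code = 'SESSION_EXISTS';
}

interface SessionRecord {
  agentType: string;
  createdAt: number;
}

class InMemoryLifecycle {
  private sessions: Record<string, SessionRecord> = {};

  createSession(sessionId: string, options: { agentType: string }): void {
    // There is no await between the check and the write, so no other caller
    // can interleave here in single-threaded JavaScript.
    if (this.sessions[sessionId] !== undefined) {
      throw new SessionAlreadyExistsError(`session ${sessionId} already exists`);
    }
    this.sessions[sessionId] = { agentType: options.agentType, createdAt: Date.now() };
  }

  sessionExists(sessionId: string): boolean {
    return this.sessions[sessionId] !== undefined;
  }
}

// Returns true if this caller created the session, false if it lost the race.
function tryCreate(store: InMemoryLifecycle, sessionId: string, agentType: string): boolean {
  try {
    store.createSession(sessionId, { agentType });
    return true;
  } catch (err) {
    if ((err as { code?: string }).code === 'SESSION_EXISTS') return false;
    throw err;
  }
}

const lifecycle = new InMemoryLifecycle();
const createOutcomes = ['researcher', 'writer'].map((agentType) =>
  tryCreate(lifecycle, 'session-1', agentType),
);
```

A store backed by a database cannot rely on single-threaded execution and would need the equivalent of a unique constraint or conditional write to get the same guarantee.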
State
- loadState(sessionId) / saveState(sessionId, state): Load and save session state.
- mergeCustomState(sessionId, changes): Atomically merges MergeChanges from ImmerStateTracker into custom state.
- updateStatus(sessionId, status, context?): Atomic status update (no CAS).
v7 atomic primitives
- compareAndSetStatus(sessionId, expectedStatuses, newStatus, options?): Atomic CAS on session status (and optional expectedVersion). Returns a discriminated result:
  - { ok: true; newVersion: number } on success
  - { ok: false; currentStatus: SessionStatus; currentVersion: number } on mismatch

  options accepts interruptContext, error, and expectedVersion. Used to prevent double-resume races.
- saveStateAndPromoteStaging(sessionId, state, appendMessages, checkpointMeta, options?): Atomic write of state + appended messages + staging promotion + checkpoint creation in one operation. Honors expectedVersion (throws StaleStateError on mismatch). Cross-runtime invariant C-1: when a runtime suspends, this is the single primitive that persists pending tool calls, ownership, completed phase-1 messages, the checkpoint, and suspendedStepId atomically.
- incrementStepCount(sessionId) / incrementResumeCount(sessionId): Atomic counters.
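To show how the compare-and-set result prevents a double resume, here is a minimal synchronous sketch over an in-memory record. The result type follows the shape described above; the Row shape, the module-level rows record, and the synchronous signature are simplifications assumed for the example (the real store methods are presumably asynchronous and also accept interruptContext and error).

```typescript
type SessionStatus = 'active' | 'completed' | 'failed' | 'interrupted' | 'paused';

type CasResult =
  | { ok: true; newVersion: number }
  | { ok: false; currentStatus: SessionStatus; currentVersion: number };

// Assumed minimal row; a real store keeps the full SessionState.
interface Row {
  status: SessionStatus;
  version: number;
}

const rows: Record<string, Row> = {
  'session-1': { status: 'paused', version: 4 },
};

function compareAndSetStatus(
  sessionId: string,
  expectedStatuses: SessionStatus[],
  newStatus: SessionStatus,
  options: { expectedVersion?: number } = {},
): CasResult {
  const row = rows[sessionId];
  if (row === undefined) throw new Error(`unknown session ${sessionId}`);

  const statusMatches = expectedStatuses.indexOf(row.status) !== -1;
  const versionMatches =
    options.expectedVersion === undefined || options.expectedVersion === row.version;
  if (!statusMatches || !versionMatches) {
    return { ok: false, currentStatus: row.status, currentVersion: row.version };
  }

  // The check and the write happen with no interleaving point in between.
  row.status = newStatus;
  row.version += 1; // version is incremented on every modification
  return { ok: true, newVersion: row.version };
}

// Two resume attempts race for the same paused session.
const resumeWinner = compareAndSetStatus('session-1', ['paused', 'interrupted'], 'active');
const resumeLoser = compareAndSetStatus('session-1', ['paused', 'interrupted'], 'active');
```

The second caller gets ok: false together with the current status and version, so it can back off without starting a second run.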
Interrupt flag
- setInterruptFlag(sessionId, reason?): Durable interrupt request (written durably so other processes can observe it).
- checkInterruptFlag(sessionId): Atomic check-and-clear; polled by the runLoop at the top of every step iteration. Foundation for cross-process interrupt parity (JS, CF DO, CFW Workflows all rely on it).
- clearInterruptFlag(sessionId): Explicit clear (rarely used directly; checkInterruptFlag clears as part of the read).
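The check-and-clear polling pattern can be sketched as follows. The in-memory class, the InterruptCheck return shape, and the runSteps loop are assumptions for illustration: the document does not specify what checkInterruptFlag returns, and the real runLoop does considerably more per step.

```typescript
// Assumed return shape; not taken from the framework.
interface InterruptCheck {
  interrupted: boolean;
  reason?: string;
}

class InMemoryInterruptFlags {
  private flags: Record<string, string | undefined> = {};

  setInterruptFlag(sessionId: string, reason: string = 'interrupt requested'): void {
    this.flags[sessionId] = reason;
  }

  checkInterruptFlag(sessionId: string): InterruptCheck {
    const reason = this.flags[sessionId];
    if (reason === undefined) return { interrupted: false };
    delete this.flags[sessionId]; // read and clear together
    return { interrupted: true, reason };
  }

  clearInterruptFlag(sessionId: string): void {
    delete this.flags[sessionId];
  }
}

// Simplified stand-in for a step loop: poll the flag before every step.
function runSteps(
  flags: InMemoryInterruptFlags,
  sessionId: string,
  maxSteps: number,
  step: (index: number) => void,
): { outcome: 'interrupted' | 'finished'; stepsRun: number } {
  let stepsRun = 0;
  for (let index = 0; index < maxSteps; index++) {
    if (flags.checkInterruptFlag(sessionId).interrupted) {
      return { outcome: 'interrupted', stepsRun };
    }
    step(index);
    stepsRun++;
  }
  return { outcome: 'finished', stepsRun };
}

// An interrupt request arrives while the second step (index 1) is running.
const interruptFlags = new InMemoryInterruptFlags();
const loopResult = runSteps(interruptFlags, 'session-1', 5, (index) => {
  if (index === 1) interruptFlags.setInterruptFlag('session-1', 'user pressed stop');
});
```

The step that was already in flight finishes, and the loop stops at the top of the next iteration. Because the read clears the flag, a later run of the same session does not see a stale request.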
Messages, runs, checkpoints, sub-sessions, staging
- appendMessages / getMessages / getMessageCount / truncateMessages
- createRun / updateRunStatus / getCurrentRun / listRuns / getRun
- createCheckpoint / getLatestCheckpoint / getCheckpoint / listCheckpoints
- addSubSessionRefs / updateSubSessionRef / getSubSessionRefs
- stageChanges / getStagedChanges / promoteStaging / discardStaging / hasStagedChanges / cleanupOrphanedStaging
Optional extensions
- patchMetadata?(sessionId, patch): Used by runtime-dbos persistent mode to record the active DBOS workflow ID.
Third-party stores: atomic implementation required
There is no non-atomic fallback. A previously-exported defaultSaveStateAndPromoteStaging(store, ...) helper (sequential appendMessages → saveState → promoteStaging) was removed in P3.R3-BC-FALLBACK because the crash-between-calls window it created is exactly the corruption the atomic primitive was added to prevent. All five in-tree stores (memory, redis, postgres, D1, DO) implement the atomic version; custom stores must do the same.
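For custom store authors, one way to think about the atomic requirement is "build the whole next snapshot, then commit it in one step". The sketch below illustrates that idea in memory. It is heavily simplified and makes several assumptions: the Snapshot shape, the treatment of staged changes as opaque strings moved into a promoted list, a checkpoint reduced to an ID, and a synchronous signature. The StaleStateError class here is a local stand-in for the framework's error.

```typescript
// Local stand-in for the framework's StaleStateError.
class StaleStateError extends Error {}

// Assumed, simplified snapshot of everything the primitive writes.
interface Snapshot {
  version: number;
  stepCount: number;
  messages: string[];
  staged: string[]; // staged changes awaiting promotion (opaque here)
  promoted: string[];
  checkpoints: string[];
}

const snapshots: Record<string, Snapshot> = {};

function saveStateAndPromoteStaging(
  sessionId: string,
  state: { stepCount: number },
  appendMessages: string[],
  checkpointId: string,
  options: { expectedVersion?: number } = {},
): number {
  const current = snapshots[sessionId];
  if (current === undefined) throw new Error(`unknown session ${sessionId}`);
  if (options.expectedVersion !== undefined && options.expectedVersion !== current.version) {
    throw new StaleStateError(`expected v${options.expectedVersion}, found v${current.version}`);
  }

  // Build the complete next snapshot on copies; nothing is visible yet.
  const next: Snapshot = {
    version: current.version + 1,
    stepCount: state.stepCount,
    messages: current.messages.concat(appendMessages),
    staged: [],
    promoted: current.promoted.concat(current.staged),
    checkpoints: current.checkpoints.concat([checkpointId]),
  };

  snapshots[sessionId] = next; // single commit point
  return next.version;
}

snapshots['session-1'] = {
  version: 3,
  stepCount: 1,
  messages: ['hi'],
  staged: ['draft-change'],
  promoted: [],
  checkpoints: [],
};
const committedVersion = saveStateAndPromoteStaging(
  'session-1',
  { stepCount: 2 },
  ['hello back'],
  'cp_1',
  { expectedVersion: 3 },
);
```

A failure before the commit point leaves the previous snapshot untouched, which is the property the removed sequential helper could not provide. A database-backed store would typically get the same effect from a transaction or a single scripted operation.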