Vercel AI SDK Adapter

The Vercel AI SDK adapter (@helix-agents/llm-vercel) connects Helix Agents to any LLM provider supported by the Vercel AI SDK. This is the recommended adapter for most applications.

When to Use

Good fit:

Production applications
Multiple provider support needed
Streaming responses required
Using OpenAI, Anthropic, Google, or other major providers

Not ideal for:

Unit testing (use MockLLMAdapter instead)
Custom/private LLM APIs not in Vercel AI SDK

Installation

bash

npm install @helix-agents/llm-vercel ai

Also install provider packages for your chosen models:

bash

# OpenAI
npm install @ai-sdk/openai

# Anthropic
npm install @ai-sdk/anthropic

# Google
npm install @ai-sdk/google

Basic Usage

typescript

import { defineAgent } from '@helix-agents/core';
import { VercelAIAdapter } from '@helix-agents/llm-vercel';
import { JSAgentExecutor } from '@helix-agents/runtime-js';
import { InMemoryStateStore, InMemoryStreamManager } from '@helix-agents/store-memory';
import { openai } from '@ai-sdk/openai';

// Create adapter
const adapter = new VercelAIAdapter();

// Create executor
const executor = new JSAgentExecutor(
  new InMemoryStateStore(),
  new InMemoryStreamManager(),
  adapter
);

// Define agent with Vercel AI SDK model
const agent = defineAgent({
  name: 'assistant',
  systemPrompt: 'You are a helpful assistant.',
  llmConfig: {
    model: openai('gpt-4o'),
    temperature: 0.7,
  },
});

Supported Providers

The Vercel AI SDK supports many providers:

Provider	Package	Example Model
OpenAI	`@ai-sdk/openai`	`openai('gpt-4o')`
Anthropic	`@ai-sdk/anthropic`	`anthropic('claude-sonnet-4-20250514')`
Google	`@ai-sdk/google`	`google('gemini-1.5-pro')`
Cohere	`@ai-sdk/cohere`	`cohere('command-r-plus')`
Mistral	`@ai-sdk/mistral`	`mistral('mistral-large-latest')`
Amazon Bedrock	`@ai-sdk/amazon-bedrock`	Various models
Azure OpenAI	`@ai-sdk/azure`	Azure-hosted models

See the Vercel AI SDK documentation for the full list.

Configuration

Model Configuration

typescript

const agent = defineAgent({
  name: 'my-agent',
  systemPrompt: 'You are a helpful assistant.',
  llmConfig: {
    // Required: The model to use
    model: openai('gpt-4o'),

    // Generation parameters
    temperature: 0.7, // 0-2, higher = more creative
    maxOutputTokens: 4096, // Maximum tokens to generate
    topP: 0.95, // Nucleus sampling
    topK: 40, // Top-k sampling

    // Penalties
    presencePenalty: 0, // Reduce repetition of topics
    frequencyPenalty: 0, // Reduce repetition of tokens

    // Control
    stopSequences: ['END'], // Stop generation at these sequences
    seed: 12345, // For deterministic outputs

    // Reliability
    maxRetries: 3, // Retry on transient failures

    // Prompt caching is opt-in. Supply a provider-specific strategy from
    // `@helix-agents/core` that matches your model (here: OpenAI), e.g.
    //   import { openaiCache } from '@helix-agents/core';
    cache: openaiCache(),
  },
});

Provider-Specific Options

Important: Reasoning features require AI SDK provider packages v3+:
@ai-sdk/openai@^3.0.0
@ai-sdk/anthropic@^3.0.0
Earlier v2.x versions use specificationVersion: "v2" which triggers compatibility mode in AI SDK v6, stripping reasoning features.

Enable features specific to certain providers:

typescript

// OpenAI o-series reasoning
const reasoningAgent = defineAgent({
  name: 'reasoning-agent',
  systemPrompt: 'Solve complex problems step by step.',
  llmConfig: {
    model: openai('o1'),
    providerOptions: {
      openai: {
        reasoningSummary: 'detailed',
        reasoningEffort: 'high',
      },
    },
  },
});

// Anthropic extended thinking (distinct variable name so both agents can coexist in one module)
const thinkingAgent = defineAgent({
  name: 'thinking-agent',
  systemPrompt: 'Think through problems carefully.',
  llmConfig: {
    model: anthropic('claude-sonnet-4-20250514'),
    providerOptions: {
      anthropic: {
        thinking: {
          type: 'enabled',
          budgetTokens: 10000,
        },
      },
    },
  },
});

Dynamic Configuration

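Use llmConfigOverride to adjust the LLM config per step, based on the agent's custom state and the current step count. Return only the fields you want to change; an empty object keeps the base llmConfig: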

typescript

const agent = defineAgent({
  name: 'adaptive-agent',
  stateSchema: z.object({
    complexity: z.enum(['simple', 'complex']),
    stepCount: z.number(),
  }),
  llmConfig: {
    model: openai('gpt-4o-mini'),
    temperature: 0.5,
  },
  llmConfigOverride: (customState, stepCount) => {
    // Use more powerful model for complex tasks
    if (customState.complexity === 'complex') {
      return {
        model: openai('gpt-4o'),
        temperature: 0.2,
        maxOutputTokens: 8192,
      };
    }

    // Increase temperature over time for variety
    if (stepCount > 5) {
      return { temperature: 0.8 };
    }

    return {};
  },
});
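
To make the override semantics concrete, here is a minimal sketch of how a partial override could be layered over the base config. It assumes a shallow merge in which override fields win and an empty object leaves the base unchanged. `LlmConfigSketch`, `resolveConfig`, and `overrideFor` are hypothetical names for illustration, models are stood in by plain strings, and the framework's actual merge logic may differ.

```typescript
// Hypothetical, simplified config shape: the real llmConfig holds an AI SDK model object.
interface LlmConfigSketch {
  model: string;
  temperature?: number;
  maxOutputTokens?: number;
}

// Assumption: the override is shallow-merged over the base config (override fields win).
function resolveConfig(
  base: LlmConfigSketch,
  override: Partial<LlmConfigSketch>,
): LlmConfigSketch {
  return { ...base, ...override };
}

const base: LlmConfigSketch = { model: 'gpt-4o-mini', temperature: 0.5 };

// Mirrors the llmConfigOverride callback, using plain values instead of SDK models.
function overrideFor(
  complexity: 'simple' | 'complex',
  stepCount: number,
): Partial<LlmConfigSketch> {
  if (complexity === 'complex') {
    return { model: 'gpt-4o', temperature: 0.2, maxOutputTokens: 8192 };
  }
  if (stepCount > 5) {
    return { temperature: 0.8 };
  }
  return {};
}

const complexConfig = resolveConfig(base, overrideFor('complex', 1));
const lateConfig = resolveConfig(base, overrideFor('simple', 6));
const defaultConfig = resolveConfig(base, overrideFor('simple', 1));
```

Under this assumption, a complex task swaps the model and raises the token limit, a long-running simple task only changes the temperature, and everything else falls back to the base values.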

Prompt Caching

Prompt caching reduces cost and latency by reusing cached prompt prefixes across LLM calls. It is opt-in and provider-specific: you choose a cache strategy that matches your model and set it on llmConfig.cache. When cache is unset, no caching is applied. The framework performs no provider detection — pick the helper that matches your model.

typescript

import { anthropicCache } from '@helix-agents/core';

const agent = defineAgent({
  name: 'cached-agent',
  systemPrompt: 'You are a helpful assistant with detailed instructions...',
  llmConfig: {
    model: anthropic('claude-sonnet-4-20250514'),
    cache: anthropicCache({ ttl: '1h' }),
  },
});

Shipped strategies

Helper	Provider	What it does
`anthropicCache({ ttl })`	Anthropic (Claude)	Places `cache_control` markers on the system prompt, the tool definitions, and a rolling pair of conversation breakpoints. `ttl` is `'<N>m'` / `'<N>h'` (default `'1h'`), passed through to the provider for validation.
`openaiCache()`	OpenAI (GPT-4o, o-series, …)	Sets `providerOptions.openai.promptCacheKey` from the session ID, giving repeated requests in a session cache affinity.
`xaiCache()`	xAI / Grok	Sets the `x-grok-conv-id` header from the session ID for conversation-level cache routing.

Google / Gemini needs no helper — it uses implicit prefix caching server-side, so there is nothing to annotate. There is intentionally no googleCache().

Each helper only does its provider's thing; the framework never detects providers, so the helper you choose must match the model you configured.

How `anthropicCache` places breakpoints

anthropicCache({ ttl }) marks, up to Anthropic's four-breakpoint limit:

The last system message (caches the system prompt).
The last tool definition (caches the tool schema).
The end of the most recent turn — including tool-result turns — so the next step reads the entire prior prefix from cache.
The end of the previous turn — a rolling second anchor. Anthropic only looks back ~20 content blocks from a breakpoint to find a prior cache entry, so a single tool-heavy turn (many parallel tool calls + results) could push the latest-turn breakpoint out of range and silently re-process the whole history. The previous-turn anchor sits where the prior request's breakpoint landed, keeping the history cached regardless of how large the latest turn is.
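
The four placements above can be sketched as an index-selection function. This is an illustration under stated assumptions, not the helper's actual implementation: `SketchMessage`, `BreakpointPlan`, and `planBreakpoints` are hypothetical names, and the sketch assumes the prior request ended at the message just before the most recent assistant message.

```typescript
type Role = 'system' | 'user' | 'assistant' | 'tool';

interface SketchMessage {
  role: Role;
  content: string;
}

interface BreakpointPlan {
  systemIndex: number | null; // last system message
  toolIndex: number | null; // last tool definition
  latestTurnIndex: number | null; // end of the most recent turn
  previousTurnIndex: number | null; // rolling second anchor
}

function planBreakpoints(messages: SketchMessage[], toolNames: string[]): BreakpointPlan {
  let systemIndex: number | null = null;
  let lastAssistantIndex = -1;
  for (let i = 0; i < messages.length; i++) {
    const role = messages[i]?.role;
    if (role === 'system') systemIndex = i;
    if (role === 'assistant') lastAssistantIndex = i;
  }

  // True when index i points at a non-system (conversation) message.
  const isConversation = (i: number): boolean =>
    i >= 0 && i < messages.length && messages[i]?.role !== 'system';

  const lastIndex = messages.length - 1;
  const latestTurnIndex = isConversation(lastIndex) ? lastIndex : null;

  // Assumption: the previous request ended just before the last assistant message.
  const candidate = lastAssistantIndex - 1;
  const previousTurnIndex =
    isConversation(candidate) && candidate !== latestTurnIndex ? candidate : null;

  return {
    systemIndex,
    toolIndex: toolNames.length > 0 ? toolNames.length - 1 : null,
    latestTurnIndex,
    previousTurnIndex,
  };
}

const plan = planBreakpoints(
  [
    { role: 'system', content: 'You are a research assistant.' },
    { role: 'user', content: 'Find recent papers.' },
    { role: 'assistant', content: 'Searching.' },
    { role: 'tool', content: '{"results":[]}' },
    { role: 'tool', content: '{"results":[]}' },
  ],
  ['web_search', 'fetch_page'],
);

const firstRequestPlan = planBreakpoints(
  [
    { role: 'system', content: 'You are a research assistant.' },
    { role: 'user', content: 'Find recent papers.' },
  ],
  [],
);
```

In the first call the latest-turn breakpoint lands on the final tool result and the previous-turn anchor on the user message, which is where the first request's breakpoint would have been. On a first request there is no earlier turn, so the sketch leaves the second anchor unset.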

Composing strategies

cache also accepts an array of strategies, applied in order — useful for layering a provider strategy with your own:

typescript

llmConfig: {
  model: anthropic('claude-sonnet-4-20250514'),
  cache: [anthropicCache(), myCustomStrategy],
}

Writing a custom strategy

A CacheStrategy is a pure function (CacheRequest) => CacheResult. It can annotate messages/tools (returned in messages / tools) or add request-level options (providerOptions / headers). The CacheStrategy, CacheRequest, and CacheResult types — and applyCacheStrategies, the provider-agnostic folder the runtimes use — are exported from @helix-agents/core:

typescript

import type { CacheStrategy } from '@helix-agents/core';

const tagConversation: CacheStrategy = ({ context }) => ({
  headers: { 'x-my-conv-id': context.sessionId },
});
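
As a rough picture of what "applied in order" can mean, the sketch below folds a list of strategies over a request and accumulates their headers. It uses simplified local types rather than the real `CacheRequest` / `CacheResult`, and `foldStrategies` and `tagRegion` are hypothetical names. The merge rule (later strategies win on a header collision) is an assumption, not documented behavior of `applyCacheStrategies`.

```typescript
// Simplified, hypothetical stand-ins for CacheRequest / CacheResult.
interface SketchRequest {
  context: { sessionId: string };
  headers: Record<string, string>;
}

interface SketchResult {
  headers?: Record<string, string>;
}

type SketchStrategy = (request: SketchRequest) => SketchResult;

// Apply strategies left to right; each one sees the headers set by earlier ones.
function foldStrategies(strategies: SketchStrategy[], initial: SketchRequest): SketchRequest {
  return strategies.reduce<SketchRequest>((request, strategy) => {
    const result = strategy(request);
    // Assumption: a later strategy overrides an earlier one on the same header key.
    return { ...request, headers: { ...request.headers, ...result.headers } };
  }, initial);
}

const tagConversation: SketchStrategy = ({ context }) => ({
  headers: { 'x-my-conv-id': context.sessionId },
});

// Hypothetical second strategy, only here to show layering.
const tagRegion: SketchStrategy = () => ({
  headers: { 'x-my-region': 'eu' },
});

const folded = foldStrategies([tagConversation, tagRegion], {
  context: { sessionId: 'session-123' },
  headers: {},
});
```

Because each strategy is a pure function of the request, the order of the array is the only thing that determines which value survives a collision.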

Cache Token Tracking

Cache hit/miss metrics flow through the standard token usage pipeline:

typescript

// In afterLLMCall hook
hooks: {
  afterLLMCall: (payload, ctx) => {
    if (payload.usage) {
      console.log(`Prompt tokens: ${payload.usage.promptTokens}`);
      console.log(`Cached tokens: ${payload.usage.cachedTokens}`);      // Cache hits
      console.log(`Cache writes: ${payload.usage.cacheWriteTokens}`);   // New cache entries
    }
  },
}

Cache tokens also appear in:

Stream chunks: step_end chunks include cachedTokens and cacheWriteTokens in their usage field
Usage tracking: The TokenCounts rollup includes cached and cacheWrite fields
Langfuse tracing: Mapped to cache_read_input_tokens and cache_creation_input_tokens
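
These fields can be turned into a simple cache hit ratio for dashboards or logs. The sketch below is illustrative only: `UsageSketch` and `cacheHitRatio` are hypothetical names, the cache fields are treated as optional, and it assumes `promptTokens` counts cached tokens as part of the prompt total, which you should confirm for your provider.

```typescript
// Hypothetical subset of the usage payload, limited to the fields shown above.
interface UsageSketch {
  promptTokens: number;
  cachedTokens?: number;
  cacheWriteTokens?: number;
}

// Fraction of prompt tokens served from cache, clamped to the range [0, 1].
// Assumption: promptTokens includes the cached tokens.
function cacheHitRatio(usage: UsageSketch): number {
  if (usage.promptTokens <= 0) return 0;
  const cached = usage.cachedTokens ?? 0;
  return Math.min(1, cached / usage.promptTokens);
}

const ratio = cacheHitRatio({ promptTokens: 2000, cachedTokens: 1500, cacheWriteTokens: 100 });
const coldRatio = cacheHitRatio({ promptTokens: 2000 });
```

A ratio that stays near zero across steps of the same session is a hint that the configured cache strategy does not match the model, or that the prompt prefix changes between calls.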

Streaming

The adapter supports real-time streaming:

typescript

// Streaming happens automatically in execute()
const handle = await executor.execute(agent, 'Research AI agents');

// Get the stream
const stream = await handle.stream();
if (stream) {
  for await (const chunk of stream) {
    switch (chunk.type) {
      case 'text_delta':
        process.stdout.write(chunk.delta);
        break;
      case 'thinking':
        console.log('[Thinking]', chunk.content);
        break;
      case 'tool_start':
        console.log(`[Tool: ${chunk.toolName}]`);
        break;
    }
  }
}

Chunk Mapping

The adapter maps Vercel AI SDK stream parts to framework chunks:

Vercel AI SDK	Framework	Notes
`text-delta`	`text_delta`	Generated text tokens
`reasoning-delta`	`thinking`	Reasoning/thinking content
`tool-input-start`	`tool_start`	Tool call begins
`tool-call`	`tool_start`	Complete tool call
`tool-result`	`tool_end`	Tool result
`error`	`error`	Generation error
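
The table above can be read as a plain lookup. The sketch below encodes it with a `Map`; `mapPartType` is a hypothetical helper, and returning `undefined` for part types not in the table is an assumption about how unlisted parts are handled.

```typescript
type FrameworkChunkType = 'text_delta' | 'thinking' | 'tool_start' | 'tool_end' | 'error';

// One entry per row of the chunk mapping table.
const CHUNK_TYPE_MAP = new Map<string, FrameworkChunkType>([
  ['text-delta', 'text_delta'],
  ['reasoning-delta', 'thinking'],
  ['tool-input-start', 'tool_start'],
  ['tool-call', 'tool_start'],
  ['tool-result', 'tool_end'],
  ['error', 'error'],
]);

// Assumption: part types that are not in the table produce no framework chunk.
function mapPartType(sdkPartType: string): FrameworkChunkType | undefined {
  return CHUNK_TYPE_MAP.get(sdkPartType);
}

const mappedText = mapPartType('text-delta');
const mappedToolCall = mapPartType('tool-call');
const mappedUnknown = mapPartType('some-unlisted-part');
```

Note that two SDK part types map to `tool_start`, so a consumer that counts tool calls from the stream should key on the tool call ID rather than on the number of `tool_start` chunks.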

Thinking/Reasoning Content

Both Anthropic and OpenAI support reasoning features:

Anthropic Extended Thinking

typescript

const agent = defineAgent({
  name: 'claude-thinker',
  llmConfig: {
    model: anthropic('claude-sonnet-4-20250514'),
    providerOptions: {
      anthropic: {
        thinking: {
          type: 'enabled',
          budgetTokens: 10000, // Token budget for thinking
        },
      },
    },
  },
});

// Thinking content streams via 'thinking' chunks
// (`stream` comes from `await handle.stream()`, as shown under Streaming)
for await (const chunk of stream) {
  if (chunk.type === 'thinking') {
    console.log('[Claude thinking...]', chunk.content);
  }
}

OpenAI Reasoning

typescript

const agent = defineAgent({
  name: 'o1-reasoner',
  llmConfig: {
    model: openai('o1'),
    providerOptions: {
      openai: {
        reasoningSummary: 'detailed', // or 'concise'
        reasoningEffort: 'high', // or 'medium', 'low'
      },
    },
  },
});

Message Conversion

The adapter converts framework messages to Vercel AI SDK format:

Framework → Vercel AI SDK

typescript

// Framework format
const messages: Message[] = [
  { role: 'system', content: 'You are helpful.' },
  { role: 'user', content: 'Hello' },
  {
    role: 'assistant',
    content: 'I will search for that.',
    toolCalls: [{ id: 'tc1', name: 'search', arguments: { q: 'test' } }],
  },
  {
    role: 'tool',
    toolCallId: 'tc1',
    toolName: 'search',
    content: JSON.stringify({ results: [] }),
  },
];

// Automatically converted to Vercel AI SDK ModelMessage[]

The conversion handles:

System, user, and assistant messages
Tool calls in assistant messages
Tool results in tool messages
Mixed text + tool call content

The adapter also coerces non-object tool-call inputs to objects — at ingestion, in stream-chunk mapping, and during message conversion (which heals any string arguments already persisted in durable stores before this guarantee existed). This complements the runtime-agnostic coercion in core's planStepProcessing(); together they ensure a tool input is never replayed as a non-object tool_use.input (which the provider rejects). Structured output is left to core's schema-aware repair, so non-object outputSchemas are preserved. See Robust tool inputs.
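
To illustrate the idea of coercing a tool input to an object, here is a minimal sketch of one plausible policy: keep plain objects, parse JSON strings that decode to an object, and fall back to an empty object otherwise. `coerceToolInput` and `isPlainObject` are hypothetical names, and the empty-object fallback is an assumption; the adapter's actual rules may differ.

```typescript
type ToolInput = Record<string, unknown>;

// True for non-null, non-array objects.
function isPlainObject(value: unknown): value is ToolInput {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Hypothetical coercion policy: always return an object-shaped tool input.
function coerceToolInput(raw: unknown): ToolInput {
  if (isPlainObject(raw)) return raw;
  if (typeof raw === 'string') {
    try {
      const parsed: unknown = JSON.parse(raw);
      if (isPlainObject(parsed)) return parsed;
    } catch {
      // Not valid JSON: fall through to the empty-object fallback.
    }
  }
  // Assumption: anything else (null, numbers, arrays, bad JSON) becomes {}.
  return {};
}

const fromString = coerceToolInput('{"q":"test"}');
const fromGarbage = coerceToolInput('not json');
const fromArray = coerceToolInput('[1, 2, 3]');
```

The string case is the one that matters for previously persisted messages: arguments stored as a JSON string are parsed back into an object before they are replayed to the provider.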

Tool Conversion

Framework tools (with Zod schemas) are converted to Vercel AI SDK tools:

typescript

// Framework tool
const searchTool = defineTool({
  name: 'search',
  description: 'Search the web',
  inputSchema: z.object({
    query: z.string(),
    limit: z.number().optional(),
  }),
  execute: async (input, ctx) => {
    // ...
  },
});

// Automatically converted to Vercel AI SDK tool format
// The Zod schema is passed directly (AI SDK 5.x supports Zod)

Error Handling

The adapter returns LLM failures as an error step result instead of throwing, so callers can branch on the result type:

typescript

const adapter = new VercelAIAdapter({
  logger: console, // Optional: log warnings
});

// Errors are returned as ErrorStepResult, not thrown
const result = await adapter.generateStep(input);

if (result.type === 'error') {
  console.error('LLM error:', result.error.message);
  // Framework handles this appropriately
}

Retry Configuration

Configure retries for transient failures:

typescript

const agent = defineAgent({
  name: 'resilient-agent', // illustrative name
  llmConfig: {
    model: openai('gpt-4o'),
    maxRetries: 5, // Retry up to 5 times on transient errors
  },
});

Logger Integration

Pass a custom logger for debug output:

typescript

import { VercelAIAdapter } from '@helix-agents/llm-vercel';

const logger = {
  debug: (msg: string) => console.debug(`[DEBUG] ${msg}`),
  info: (msg: string) => console.info(`[INFO] ${msg}`),
  warn: (msg: string) => console.warn(`[WARN] ${msg}`),
  error: (msg: string) => console.error(`[ERROR] ${msg}`),
};

const adapter = new VercelAIAdapter({ logger });

Complete Example

typescript

import { defineAgent, defineTool } from '@helix-agents/core';
import { JSAgentExecutor } from '@helix-agents/runtime-js';
import { InMemoryStateStore, InMemoryStreamManager } from '@helix-agents/store-memory';
import { VercelAIAdapter } from '@helix-agents/llm-vercel';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

// Create adapter
const adapter = new VercelAIAdapter();

// Define tool
const searchTool = defineTool({
  name: 'web_search',
  description: 'Search the web for information',
  inputSchema: z.object({
    query: z.string().describe('Search query'),
  }),
  outputSchema: z.object({
    results: z.array(z.string()),
  }),
  execute: async (input) => {
    // Simulate search
    return { results: [`Result for: ${input.query}`] };
  },
});

// Define agent
const ResearchAgent = defineAgent({
  name: 'researcher',
  description: 'Researches topics using web search',
  systemPrompt: `You are a research assistant.
Use the web_search tool to find information.
Summarize your findings clearly.`,
  tools: [searchTool],
  outputSchema: z.object({
    summary: z.string(),
    sources: z.array(z.string()),
  }),
  llmConfig: {
    model: openai('gpt-4o'),
    temperature: 0.3,
    maxOutputTokens: 2048,
  },
});

// Create executor
const executor = new JSAgentExecutor(
  new InMemoryStateStore(),
  new InMemoryStreamManager(),
  adapter
);

// Execute
async function main() {
  const handle = await executor.execute(
    ResearchAgent,
    'What are the latest developments in AI agents?'
  );

  // Stream output
  const stream = await handle.stream();
  if (stream) {
    for await (const chunk of stream) {
      if (chunk.type === 'text_delta') {
        process.stdout.write(chunk.delta);
      }
    }
  }

  // Get result
  const result = await handle.result();
  console.log('\n\nResult:', result.output);
}

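// Surface failures instead of leaving an unhandled promise rejection
main().catch((err) => {
  console.error(err);
  process.exitCode = 1;
});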

Limitations

Model-Specific Features

Not all features work with all models:

Thinking/reasoning: Only Anthropic Claude and OpenAI o-series
Tool calling: Most models, but check provider docs
JSON mode: Provider-specific implementation

Token Counting

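The adapter doesn't estimate token counts before a request is sent. Actual usage is reported after each call through the usage fields described under Cache Token Tracking; for pre-request estimation, use provider SDKs directly.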

Image/Multimodal

The framework supports file uploads (images, PDFs, etc.) via the files field in AgentInput. Files are converted to ContentPart[] alongside the text message and passed to the LLM. This works across all runtimes (JS, Temporal, Cloudflare).

typescript

await executor.execute(
  agent,
  {
    message: 'Describe this image',
    files: [
      {
        data: base64EncodedData,
        mediaType: 'image/png',
        filename: 'screenshot.png', // optional
      },
    ],
  },
  { sessionId }
);
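
The `base64EncodedData` value above has to come from somewhere. The sketch below shows one way to build a file entry from raw bytes in Node.js. `FileInputSketch` and `toFileInput` are hypothetical names, and the sketch assumes `data` expects plain base64 without a `data:` URL prefix.

```typescript
import { Buffer } from 'node:buffer';

// Hypothetical shape mirroring the fields used in the files array above.
interface FileInputSketch {
  data: string; // base64-encoded bytes
  mediaType: string;
  filename?: string;
}

// Encode raw bytes as base64 and attach the media type (and optional filename).
function toFileInput(bytes: Uint8Array, mediaType: string, filename?: string): FileInputSketch {
  const data = Buffer.from(bytes).toString('base64');
  return filename === undefined ? { data, mediaType } : { data, mediaType, filename };
}

// First four bytes of the PNG signature, standing in for real file contents.
const pngSignature = new Uint8Array([0x89, 0x50, 0x4e, 0x47]);
const file = toFileInput(pngSignature, 'image/png', 'screenshot.png');
```

In a real application the bytes would typically come from `readFileSync` in `node:fs` or from an upload handler, and the resulting object can be placed directly in the `files` array.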

Next Steps

LLM Overview - Understanding the adapter interface
Custom Adapters - Building your own adapter
Streaming - Real-time streaming deep dive
