Usage Tracking

Track LLM token consumption, tool executions, sub-agent calls, and custom metrics across your agent runs. Usage tracking provides visibility into costs, performance, and resource utilization.

Quick Start

Pass a UsageStore when executing an agent to enable tracking:

typescript
import { JSAgentExecutor, defineAgent } from '@helix-agents/sdk';
import {
  InMemoryStateStore,
  InMemoryStreamManager,
  InMemoryUsageStore,
} from '@helix-agents/store-memory';

// Create stores
const stateStore = new InMemoryStateStore();
const streamManager = new InMemoryStreamManager();
const usageStore = new InMemoryUsageStore();

// Create executor (agent and llmAdapter are assumed to be defined elsewhere)
const executor = new JSAgentExecutor(stateStore, streamManager, llmAdapter);

// Execute with usage tracking
const handle = await executor.execute(agent, 'Analyze this data', { usageStore });

// Wait for completion
await handle.result();

// Get aggregated usage
const rollup = await handle.getUsageRollup();
console.log(`Total tokens: ${rollup.tokens.total}`);
console.log(`Tool calls: ${rollup.toolStats.totalCalls}`);

Core Concepts

Usage Entries

Usage tracking records four types of entries:

| Kind | Description | When Recorded |
|------|-------------|---------------|
| tokens | LLM token consumption | After each LLM call |
| tool | Tool execution stats | After each tool completes |
| subagent | Sub-agent invocation | After sub-agent finishes |
| custom | User-defined metrics | Via context.recordUsage() |

Each entry includes:

  • id - Unique entry identifier
  • runId - Agent run this entry belongs to
  • timestamp - When the entry was recorded (epoch ms)
  • stepCount - Which step generated this entry
  • source - What generated this entry (agent, tool, or subagent)

Token Usage Entry

Recorded after each LLM call:

typescript
interface TokenUsageEntry {
  kind: 'tokens';
  model: string;           // e.g., 'gpt-4o', 'claude-3-opus'
  tokens: {
    prompt?: number;       // Input tokens
    completion?: number;   // Output tokens
    reasoning?: number;    // Reasoning tokens (o1, etc.)
    cached?: number;       // Cached/retrieved tokens
    total?: number;        // Total tokens
  };
  // ... base fields
}

Tool Usage Entry

Recorded after each tool execution:

typescript
interface ToolUsageEntry {
  kind: 'tool';
  toolName: string;        // Tool that was executed
  durationMs: number;      // Execution time in milliseconds
  success: boolean;        // Whether execution succeeded
  error?: string;          // Error message if failed
  // ... base fields
}

Sub-Agent Usage Entry

Recorded when a sub-agent completes:

typescript
interface SubAgentUsageEntry {
  kind: 'subagent';
  subAgentType: string;    // Sub-agent name
  subAgentRunId: string;   // Sub-agent's run ID (for lazy lookup)
  durationMs: number;      // Total sub-agent execution time
  success: boolean;        // Whether sub-agent completed successfully
  error?: string;          // Error message if failed
  // ... base fields
}

Custom Usage Entry

Recorded via context.recordUsage() in tools:

typescript
interface CustomUsageEntry {
  kind: 'custom';
  type: string;            // Metric category (e.g., 'api_calls')
  name: string;            // Metric instance (e.g., 'tavily')
  value: number;           // Numeric value
  // ... base fields
}

UsageStore Interface

All usage stores implement this interface:

typescript
interface UsageStore {
  // Record a single entry
  recordEntry(entry: UsageEntryInput): Promise<void>;

  // Retrieve entries for a run (with optional filtering)
  getEntries(runId: string, filter?: EntryFilter): Promise<UsageEntry[]>;

  // Get aggregated usage rollup
  getRollup(runId: string, options?: RollupOptions): Promise<UsageRollup | null>;
}

Note: Individual store implementations may provide additional utility methods like exists(), delete(), and clear(), but these are not part of the core UsageStore interface.
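
To make the shape of the interface concrete, here is a minimal sketch of a store that keeps entries in a Map. It is not the framework's implementation: the entry and rollup types are simplified local stand-ins (only token entries are modeled), SketchUsageStore is a hypothetical name, and the ID scheme is an assumption.

```typescript
// Simplified local stand-ins for the framework types (assumptions for this sketch).
interface TokenEntryInput {
  kind: 'tokens';
  runId: string;
  stepCount: number;
  model: string;
  tokens: { prompt?: number; completion?: number; total?: number };
}

interface TokenEntry extends TokenEntryInput {
  id: string;
  timestamp: number; // epoch ms
}

interface SimpleRollup {
  runId: string;
  totalTokens: number;
  entryCount: number;
}

class SketchUsageStore {
  private entriesByRun = new Map<string, TokenEntry[]>();
  private nextId = 0;

  // Record a single entry, filling in the id and timestamp.
  async recordEntry(entry: TokenEntryInput): Promise<void> {
    const stored: TokenEntry = {
      ...entry,
      id: `entry-${this.nextId++}`,
      timestamp: Date.now(),
    };
    const list = this.entriesByRun.get(entry.runId) ?? [];
    list.push(stored);
    this.entriesByRun.set(entry.runId, list);
  }

  // Return a copy so callers cannot mutate the stored entries.
  async getEntries(runId: string): Promise<TokenEntry[]> {
    return [...(this.entriesByRun.get(runId) ?? [])];
  }

  // Return null for unknown runs, mirroring the nullable rollup in the interface.
  async getRollup(runId: string): Promise<SimpleRollup | null> {
    const entries = this.entriesByRun.get(runId);
    if (!entries) return null;
    const totalTokens = entries.reduce((sum, e) => sum + (e.tokens.total ?? 0), 0);
    return { runId, totalTokens, entryCount: entries.length };
  }
}
```

Keeping all three methods asynchronous, even for in-memory data, lets the same calling code work unchanged against a network-backed store.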

UsageRollup Structure

The rollup aggregates all entries into a summary:

typescript
interface UsageRollup {
  runId: string;

  // Token totals for THIS agent only
  tokens: TokenCounts;
  tokensByModel: Record<string, TokenCounts>;

  // Tokens including sub-agents (computed lazily)
  tokensIncludingSubAgents: TokenCounts;

  // Tool execution statistics
  toolStats: {
    totalCalls: number;
    successfulCalls: number;
    failedCalls: number;
    totalDurationMs: number;
    byTool: Record<string, ToolStats>;
  };

  // Sub-agent invocation statistics
  subAgentStats: {
    totalCalls: number;
    successfulCalls: number;
    failedCalls: number;
    totalDurationMs: number;
    byType: Record<string, SubAgentStats>;
  };

  // Custom metrics aggregated by type -> name -> value
  custom: Record<string, Record<string, number>>;

  // Timing
  startedAt?: number;
  lastUpdatedAt?: number;
  entryCount: number;
}
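
As an illustration of how the toolStats section could be derived from tool entries, the sketch below folds a list of entries into totals and a per-tool breakdown. The types are local stand-ins, the exact fields of ToolStats are an assumption (the examples on this page only show calls and totalDurationMs), and summarizeToolEntries is a hypothetical helper, not a framework export.

```typescript
// Local stand-ins mirroring the shapes shown above; ToolStats fields are assumed.
interface ToolEntry {
  kind: 'tool';
  toolName: string;
  durationMs: number;
  success: boolean;
}

interface ToolStats {
  calls: number;
  successfulCalls: number;
  failedCalls: number;
  totalDurationMs: number;
}

interface ToolStatsSummary {
  totalCalls: number;
  successfulCalls: number;
  failedCalls: number;
  totalDurationMs: number;
  byTool: Record<string, ToolStats>;
}

function summarizeToolEntries(entries: ToolEntry[]): ToolStatsSummary {
  const summary: ToolStatsSummary = {
    totalCalls: 0,
    successfulCalls: 0,
    failedCalls: 0,
    totalDurationMs: 0,
    byTool: {},
  };

  for (const entry of entries) {
    // Create the per-tool bucket on first sight of a tool name.
    let perTool = summary.byTool[entry.toolName];
    if (!perTool) {
      perTool = { calls: 0, successfulCalls: 0, failedCalls: 0, totalDurationMs: 0 };
      summary.byTool[entry.toolName] = perTool;
    }

    // Update the run-wide totals and the per-tool bucket together.
    summary.totalCalls += 1;
    summary.totalDurationMs += entry.durationMs;
    perTool.calls += 1;
    perTool.totalDurationMs += entry.durationMs;

    if (entry.success) {
      summary.successfulCalls += 1;
      perTool.successfulCalls += 1;
    } else {
      summary.failedCalls += 1;
      perTool.failedCalls += 1;
    }
  }

  return summary;
}
```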

Lazy Sub-Agent Aggregation

Sub-agent tokens are not copied into the parent at recording time. Instead:

  1. Parent records a subagent entry with the child's runId
  2. When you call getUsageRollup(), the store:
    • Looks up each sub-agent's runId from the entries
    • Fetches the sub-agent's rollup recursively
    • Aggregates tokens and custom metrics into tokensIncludingSubAgents

This lazy approach avoids data duplication and handles deep nesting correctly.

typescript
// A single rollup exposes both views
const rollup = await handle.getUsageRollup();

// Parent's own tokens only (no sub-agent aggregation)
console.log(rollup?.tokens.total);

// Combined tokens (parent + all sub-agents, recursively)
console.log(rollup?.tokensIncludingSubAgents.total);
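
The recursive lookup described in this section can be sketched in a few lines. This is a simplified model, not the store's actual code: entries carry only a token total, runs are read synchronously from a Map rather than fetched from a store, and the names SketchEntry and totalIncludingSubAgents are hypothetical.

```typescript
// Simplified entries: a token entry carries a total, a subagent entry points at a child run.
type SketchEntry =
  | { kind: 'tokens'; total: number }
  | { kind: 'subagent'; subAgentRunId: string };

type RunEntries = Map<string, SketchEntry[]>;

// Sum a run's own tokens, then follow each subagent entry to the child's run.
function totalIncludingSubAgents(
  runId: string,
  runs: RunEntries,
  seen: Set<string> = new Set(),
): number {
  // Each run is counted at most once, which also guards against reference cycles.
  if (seen.has(runId)) return 0;
  seen.add(runId);

  let total = 0;
  for (const entry of runs.get(runId) ?? []) {
    if (entry.kind === 'tokens') {
      total += entry.total;
    } else {
      total += totalIncludingSubAgents(entry.subAgentRunId, runs, seen);
    }
  }
  return total;
}
```

Because the parent stores only the child's run ID, a grandchild's tokens are picked up through the child's entries without the parent knowing the grandchild exists.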

Recording Custom Metrics

Tools can record custom usage metrics via context.recordUsage(). This is useful for tracking API calls, bytes processed, credits consumed, or any domain-specific metric.

Basic Usage

typescript
const searchTool = defineTool({
  name: 'tavily_search',
  inputSchema: z.object({ query: z.string() }),
  execute: async (input, context) => {
    const results = await tavilySearch(input.query);

    // Record custom metrics
    await context.recordUsage?.({ type: 'api_calls', name: 'tavily' }, 1);
    await context.recordUsage?.({ type: 'credits', name: 'search' }, 0.01);

    return results;
  },
});

The metrics are stored as custom entries and aggregated in the rollup:

typescript
const rollup = await handle.getUsageRollup();
console.log(rollup.custom['api_calls']['tavily']); // 1
console.log(rollup.custom['credits']['search']);   // 0.01

Type-Safe Metrics with defineMetric()

For better type safety and documentation, define metric types:

typescript
import { defineMetric } from '@helix-agents/core';

// Define typed metrics
const API_CALLS = defineMetric('api_calls', 'External API call count');
const BYTES_PROCESSED = defineMetric('bytes.processed', 'Bytes processed');
const CREDITS_USED = defineMetric('credits', 'Service credits consumed');

// Use in tools
const tool = defineTool({
  name: 'process_file',
  inputSchema: z.object({ url: z.string() }),
  execute: async (input, context) => {
    const data = await fetchFile(input.url);

    // Type-safe recording
    await context.recordUsage?.(API_CALLS, 'cdn', 1);
    await context.recordUsage?.(BYTES_PROCESSED, 'download', data.length);

    return processData(data);
  },
});

Framework Built-in Metrics

The framework provides constants for common token metrics:

typescript
import {
  METRIC_TOKENS_INPUT,     // 'tokens.input'
  METRIC_TOKENS_OUTPUT,    // 'tokens.output'
  METRIC_TOKENS_REASONING, // 'tokens.reasoning'
  METRIC_TOKENS_CACHED,    // 'tokens.cached'
} from '@helix-agents/core';

These are used internally by the framework but you can reference them in custom logic.

Metric Naming Conventions

Use a consistent naming scheme:

typescript
// Category-based (recommended)
{ type: 'api_calls', name: 'tavily' }
{ type: 'api_calls', name: 'openai' }
{ type: 'bytes', name: 'input' }
{ type: 'bytes', name: 'output' }

// Hierarchical types
{ type: 'llm.embeddings', name: 'openai' }
{ type: 'llm.completions', name: 'anthropic' }
{ type: 'storage.read', name: 's3' }
{ type: 'storage.write', name: 's3' }

The rollup aggregates by type first, then by name:

typescript
rollup.custom = {
  'api_calls': { 'tavily': 5, 'openai': 3 },
  'bytes': { 'input': 10240, 'output': 2048 },
};
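
A minimal sketch of this two-level grouping follows. It assumes that values recorded under the same type and name are summed, which matches the counts in the example above but is not confirmed for every metric; CustomEntry here is a local stand-in and aggregateCustom is a hypothetical helper.

```typescript
// Local stand-in for the custom entry shape shown earlier on this page.
interface CustomEntry {
  kind: 'custom';
  type: string;
  name: string;
  value: number;
}

// Group by type, then by name, summing values that share both keys (assumed behavior).
function aggregateCustom(entries: CustomEntry[]): Record<string, Record<string, number>> {
  const result: Record<string, Record<string, number>> = {};
  for (const { type, name, value } of entries) {
    let byName = result[type];
    if (!byName) {
      byName = {};
      result[type] = byName;
    }
    byName[name] = (byName[name] ?? 0) + value;
  }
  return result;
}
```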

Retrieving Usage Data

Getting Raw Entries

Use handle.getUsage() to get raw entries:

typescript
const entries = await handle.getUsage();
for (const entry of entries) {
  switch (entry.kind) {
    case 'tokens':
      console.log(`LLM call: ${entry.model}, ${entry.tokens.total} tokens`);
      break;
    case 'tool':
      console.log(`Tool: ${entry.toolName}, ${entry.durationMs}ms`);
      break;
    case 'subagent':
      console.log(`Sub-agent: ${entry.subAgentType}, ${entry.durationMs}ms`);
      break;
    case 'custom':
      console.log(`Custom: ${entry.type}/${entry.name} = ${entry.value}`);
      break;
  }
}

Filtering Entries

Filter entries by kind, step range, or time range:

typescript
// Only token entries
const tokens = await usageStore.getEntries(runId, {
  kinds: ['tokens']
});

// Entries from steps 5-10
const midRun = await usageStore.getEntries(runId, {
  stepRange: { min: 5, max: 10 }
});

// Entries from the last hour
const recent = await usageStore.getEntries(runId, {
  timeRange: {
    start: Date.now() - 3600000,
    end: Date.now()
  }
});

// Pagination
const page = await usageStore.getEntries(runId, {
  limit: 10,
  offset: 20
});
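
To show how these filter options could combine, the sketch below applies them to an in-memory array. The filter shape is inferred from the examples above; treating range bounds as inclusive and applying pagination after filtering are assumptions, and SketchFilter and applyFilter are hypothetical names rather than framework exports.

```typescript
// Only the fields the filter inspects; real entries carry more.
interface BaseEntry {
  kind: 'tokens' | 'tool' | 'subagent' | 'custom';
  stepCount: number;
  timestamp: number; // epoch ms
}

// Filter shape inferred from the examples above (an assumption).
interface SketchFilter {
  kinds?: BaseEntry['kind'][];
  stepRange?: { min?: number; max?: number };
  timeRange?: { start?: number; end?: number };
  limit?: number;
  offset?: number;
}

function applyFilter<T extends BaseEntry>(entries: T[], filter: SketchFilter = {}): T[] {
  const { kinds, stepRange, timeRange, limit, offset = 0 } = filter;

  // All conditions must hold; bounds are treated as inclusive.
  const matched = entries.filter((entry) => {
    if (kinds && !kinds.includes(entry.kind)) return false;
    if (stepRange?.min !== undefined && entry.stepCount < stepRange.min) return false;
    if (stepRange?.max !== undefined && entry.stepCount > stepRange.max) return false;
    if (timeRange?.start !== undefined && entry.timestamp < timeRange.start) return false;
    if (timeRange?.end !== undefined && entry.timestamp > timeRange.end) return false;
    return true;
  });

  // Pagination applies to the filtered result, not the raw list.
  const end = limit === undefined ? undefined : offset + limit;
  return matched.slice(offset, end);
}
```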

Getting Aggregated Rollup

Use handle.getUsageRollup() for aggregated stats:

typescript
const rollup = await handle.getUsageRollup();

// Token summary
console.log('Token usage:');
console.log(`  Prompt: ${rollup.tokens.prompt}`);
console.log(`  Completion: ${rollup.tokens.completion}`);
console.log(`  Total: ${rollup.tokens.total}`);

// By model
for (const [model, counts] of Object.entries(rollup.tokensByModel)) {
  console.log(`  ${model}: ${counts.total} tokens`);
}

// Tool stats
console.log('Tool usage:');
console.log(`  Total calls: ${rollup.toolStats.totalCalls}`);
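// Guard against division by zero when no tools ran
const { totalCalls, successfulCalls } = rollup.toolStats;
const successRate = totalCalls > 0 ? successfulCalls / totalCalls : 0;
console.log(`  Success rate: ${(successRate * 100).toFixed(1)}%`);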
for (const [tool, stats] of Object.entries(rollup.toolStats.byTool)) {
  console.log(`  ${tool}: ${stats.calls} calls, ${stats.totalDurationMs}ms total`);
}

// Custom metrics
console.log('Custom metrics:');
for (const [type, names] of Object.entries(rollup.custom)) {
  for (const [name, value] of Object.entries(names)) {
    console.log(`  ${type}/${name}: ${value}`);
  }
}

Including Sub-Agent Usage

A single rollup carries both views: tokens covers the parent agent only, while tokensIncludingSubAgents adds every sub-agent recursively.

typescript
const rollup = await handle.getUsageRollup();

// tokens - parent only (no sub-agent aggregation)
console.log(`Parent tokens: ${rollup?.tokens.total}`);

// tokensIncludingSubAgents - parent + all sub-agents (recursive)
console.log(`Total tokens: ${rollup?.tokensIncludingSubAgents.total}`);

Store Implementations

InMemoryUsageStore

For development and testing. Data is lost when the process exits.

typescript
import { InMemoryUsageStore } from '@helix-agents/store-memory';

const usageStore = new InMemoryUsageStore();

// Testing utilities
usageStore.clear();              // Clear all data
usageStore.getAllRunIds();       // List all tracked runs
usageStore.size;                 // Number of runs
usageStore.totalEntryCount;      // Total entries across all runs

RedisUsageStore

For production. Persists to Redis with configurable TTL.

typescript
import Redis from 'ioredis';
import { RedisUsageStore } from '@helix-agents/store-redis';

const redis = new Redis(process.env.REDIS_URL);
const usageStore = new RedisUsageStore(redis, {
  keyPrefix: 'myapp',         // Key prefix (default: 'helix')
  ttlSeconds: 86400 * 7,      // Retention period (default: 24h)
});

// Find tracked runs
const runIds = await usageStore.findRunIds();

// Get entry count without fetching entries
const count = await usageStore.getEntryCount(runId);

D1UsageStore

For Cloudflare Workers. Persists to a D1 (SQLite) database.

typescript
import { D1UsageStore } from '@helix-agents/store-cloudflare';

const usageStore = new D1UsageStore({
  database: env.DB,
  tableName: 'usage_entries',  // Optional, default: 'usage_entries'
});

// Cleanup old entries
await usageStore.deleteOldEntries(7 * 24 * 60 * 60 * 1000); // Older than 7 days

// Find runs with filters
const runIds = await usageStore.findRunIds({
  agentType: 'researcher',
  limit: 100
});

Runtime Integration

Usage tracking works identically across all runtimes. Just pass the usageStore option.

JS Runtime

typescript
import { JSAgentExecutor } from '@helix-agents/runtime-js';
import { InMemoryUsageStore } from '@helix-agents/store-memory';

const executor = new JSAgentExecutor(stateStore, streamManager, llmAdapter);
const usageStore = new InMemoryUsageStore();

const handle = await executor.execute(agent, 'Do the task', { usageStore });

Temporal Runtime

typescript
import { TemporalAgentExecutor } from '@helix-agents/runtime-temporal';
import { RedisUsageStore } from '@helix-agents/store-redis';

const executor = new TemporalAgentExecutor(client, stateStore, streamManager, llmAdapter);
const usageStore = new RedisUsageStore(redis);

const handle = await executor.execute(agent, 'Do the task', { usageStore });

Cloudflare Runtime

typescript
import { CloudflareAgentExecutor } from '@helix-agents/runtime-cloudflare';
import { D1UsageStore } from '@helix-agents/store-cloudflare';

const usageStore = new D1UsageStore({ database: env.DB });

// `executor` is a CloudflareAgentExecutor created elsewhere (construction not shown here)
const handle = await executor.execute(agent, 'Do the task', { usageStore });

Reconnecting to Existing Runs

Get usage from a previous run using getHandle():

typescript
// Later, reconnect to get usage
const handle = await executor.getHandle(agent, runId, { usageStore });
const rollup = await handle?.getUsageRollup();

Patterns

Cost Tracking

Calculate costs from token usage:

typescript
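// USD per 1,000 tokens (illustrative values; check current provider pricing)
const PRICING: Record<string, { input: number; output: number }> = {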
  'gpt-4o': { input: 0.0025, output: 0.01 },
  'gpt-4o-mini': { input: 0.00015, output: 0.0006 },
  'claude-3-opus': { input: 0.015, output: 0.075 },
};

async function calculateCost(handle: AgentExecutionHandle): Promise<number> {
  const rollup = await handle.getUsageRollup({ includeSubAgents: true });
  if (!rollup) return 0;

  let totalCost = 0;

  for (const [model, tokens] of Object.entries(rollup.tokensByModel)) {
    const pricing = PRICING[model];
    if (pricing) {
      totalCost += (tokens.prompt ?? 0) / 1000 * pricing.input;
      totalCost += (tokens.completion ?? 0) / 1000 * pricing.output;
    }
  }

  return totalCost;
}
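
The pricing arithmetic can also be isolated into a pure function that is easy to test. The sketch below is one possible shape, not a framework API: estimateCost, ModelPricing, and the returned field names are hypothetical, and prices are assumed to be USD per 1,000 tokens as in the table above. For example, 2,000 prompt tokens at $0.0025 per 1K plus 500 completion tokens at $0.01 per 1K comes to $0.005 + $0.005 = $0.01.

```typescript
// Local stand-ins; only the fields needed for pricing are modeled.
interface TokenCounts {
  prompt?: number;
  completion?: number;
}

interface ModelPricing {
  input: number;  // USD per 1,000 prompt tokens
  output: number; // USD per 1,000 completion tokens
}

interface CostEstimate {
  totalUsd: number;
  unpricedModels: string[]; // models that had usage but no pricing entry
}

function estimateCost(
  tokensByModel: Record<string, TokenCounts>,
  pricing: Record<string, ModelPricing>,
): CostEstimate {
  let totalUsd = 0;
  const unpricedModels: string[] = [];

  for (const [model, tokens] of Object.entries(tokensByModel)) {
    const price = pricing[model];
    if (!price) {
      // Report the gap instead of silently pricing the model at zero.
      unpricedModels.push(model);
      continue;
    }
    totalUsd += ((tokens.prompt ?? 0) / 1000) * price.input;
    totalUsd += ((tokens.completion ?? 0) / 1000) * price.output;
  }

  return { totalUsd, unpricedModels };
}
```

Returning the unpriced models alongside the total makes a missing pricing entry visible, where skipping it quietly would understate the cost.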

Token Budgets

Enforce token limits with hooks:

typescript
import { defineAgent } from '@helix-agents/sdk';

const MAX_TOKENS = 100000;

const agent = defineAgent({
  name: 'budget-agent',
  hooks: {
    afterLLMCall: async ({ usage, state }) => {
      if (!usage) return;

      // Track cumulative usage in state
      const current = state.customState?.totalTokens ?? 0;
      const newTotal = current + (usage.totalTokens ?? 0);

      if (newTotal > MAX_TOKENS) {
        throw new Error(`Token budget exceeded: ${newTotal}/${MAX_TOKENS}`);
      }

      // Persist newTotal in custom state (for example, via a state update in a tool)
      // so the next afterLLMCall check starts from the updated total
    },
  },
  // ...
});
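
The hook above leaves the accumulation step open. One way to keep the running total, sketched here under assumptions, is a small tracker object that a hook could call after each LLM response. TokenBudget is a hypothetical class, not part of the framework, and how it would be wired into agent state is not shown.

```typescript
// Hypothetical helper: tracks cumulative tokens against a fixed limit.
class TokenBudget {
  private used = 0;
  private readonly maxTokens: number;

  constructor(maxTokens: number) {
    this.maxTokens = maxTokens;
  }

  // Add the tokens from one LLM call; throws once the cumulative total passes the limit.
  add(tokens: number): number {
    // Count first: by the time a hook runs, the tokens have already been spent.
    this.used += tokens;
    if (this.used > this.maxTokens) {
      throw new Error(`Token budget exceeded: ${this.used}/${this.maxTokens}`);
    }
    return this.used;
  }

  // Tokens left before the limit, never negative.
  get remaining(): number {
    return Math.max(0, this.maxTokens - this.used);
  }
}

const budget = new TokenBudget(100_000);
budget.add(60_000);
console.log(budget.remaining); // 40000
```

A tracker like this lives in process memory, so it likely would not survive a run that is resumed in another process; for that case the total would need to be kept in persisted agent state instead.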

Analytics Integration

Export usage to analytics systems:

typescript
async function exportToAnalytics(runId: string, usageStore: UsageStore) {
  const rollup = await usageStore.getRollup(runId, { includeSubAgents: true });
  if (!rollup) return;

  const { totalCalls, successfulCalls } = rollup.toolStats;

  // Send to your analytics service
  await analytics.track('agent_run_completed', {
    runId,
    totalTokens: rollup.tokensIncludingSubAgents.total,
    // costFromRollup is a placeholder for a helper that prices rollup.tokensByModel;
    // calculateCost above takes a handle, not a rollup
    tokenCost: costFromRollup(rollup),
    toolCalls: totalCalls,
    // Avoid NaN when no tools ran
    toolSuccessRate: totalCalls > 0 ? successfulCalls / totalCalls : null,
    subAgentCalls: rollup.subAgentStats.totalCalls,
    customMetrics: rollup.custom,
    duration: (rollup.lastUpdatedAt ?? 0) - (rollup.startedAt ?? 0),
  });
}

Per-Tool Cost Tracking

Track costs at the tool level with custom metrics:

typescript
const openAITool = defineTool({
  name: 'openai_embed',
  execute: async (input, context) => {
    const result = await openai.embeddings.create({
      model: 'text-embedding-3-small',
      input: input.text,
    });

    // Record embedding tokens as custom metric
    const tokens = result.usage.total_tokens;
    await context.recordUsage?.({ type: 'embedding_tokens', name: 'openai' }, tokens);

    // Record estimated cost
    const cost = tokens * 0.00002 / 1000;
    await context.recordUsage?.({ type: 'cost', name: 'embeddings' }, cost);

    return result.data[0].embedding;
  },
});

Multi-Tenant Usage Isolation

Use separate stores or key prefixes per tenant:

typescript
// Option 1: Separate stores per tenant
const usageStores = new Map<string, UsageStore>();

function getUsageStore(tenantId: string): UsageStore {
  if (!usageStores.has(tenantId)) {
    usageStores.set(tenantId, new RedisUsageStore(redis, {
      keyPrefix: `tenant:${tenantId}`,
    }));
  }
  return usageStores.get(tenantId)!;
}

// Option 2: Include tenant in runId
const runId = `${tenantId}:${uuid()}`;

Utility Functions

The @helix-agents/core package exports helper functions:

typescript
import {
  createEmptyRollup,
  addTokenCounts,
  hasTokenCounts,
  isTokenUsageEntry,
  isToolUsageEntry,
  isSubAgentUsageEntry,
  isCustomUsageEntry,
} from '@helix-agents/core';

// Create empty rollup structure
const rollup = createEmptyRollup('run-123');

// Add token counts together
const combined = addTokenCounts(
  { prompt: 100, completion: 50 },
  { prompt: 200, completion: 100 }
);
// { prompt: 300, completion: 150 }

// Check if tokens were recorded
if (hasTokenCounts(rollup.tokens)) {
  console.log('Has token data');
}

// Type guards for entry types (entries is a UsageEntry[], e.g. from usageStore.getEntries())
for (const entry of entries) {
  if (isTokenUsageEntry(entry)) {
    console.log(entry.model); // TypeScript knows it's TokenUsageEntry
  }
}

Released under the MIT License.