Tracing & Observability

Tracing provides visibility into agent execution for debugging, performance analysis, and cost tracking. Helix Agents integrates with Langfuse for comprehensive LLM observability.

Overview

Tracing captures:

  • Agent Runs - Full execution lifecycle with timing and status
  • LLM Calls - Model, tokens, latency, prompts and responses
  • Tool Executions - Arguments, results, and timing
  • Sub-Agent Calls - Nested traces with parent-child relationships
  • Metadata - User attribution, session grouping, custom tags

Why Trace?

  1. Debugging - Understand why an agent behaved a certain way
  2. Performance - Identify slow LLM calls or inefficient tool usage
  3. Cost Tracking - Monitor token usage across users and features
  4. Quality - Evaluate agent outputs and improve prompts
  5. Compliance - Audit trail of LLM interactions

Quick Start

1. Install the Package

bash
npm install @helix-agents/tracing-langfuse langfuse

2. Set Up Langfuse

Create a Langfuse account and get your API keys:

bash
# .env
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...

3. Add Hooks to Your Agent

typescript
import { createLangfuseHooks } from '@helix-agents/tracing-langfuse';
import { defineAgent, JSAgentExecutor } from '@helix-agents/sdk';

// Create hooks (auto-reads credentials from env)
const { hooks, flush } = createLangfuseHooks();

// Use with agent
const agent = defineAgent({
  name: 'my-agent',
  hooks,
  systemPrompt: 'You are a helpful assistant.',
  llmConfig: { model: { provider: 'openai', name: 'gpt-4o' } },
});

// Create an executor and run the agent
const executor = new JSAgentExecutor({ /* ... */ });
const handle = await executor.execute(agent, 'Hello!');
const result = await handle.result;

// Flush in serverless (optional in long-running processes)
await flush();

4. View Traces in Langfuse

Open your Langfuse dashboard to see:

  • Trace timeline with all observations
  • Token usage and costs
  • Latency breakdown
  • Error details

Configuration

Basic Options

typescript
const { hooks } = createLangfuseHooks({
  // Credentials (optional if using env vars)
  publicKey: 'pk-lf-...',
  secretKey: 'sk-lf-...',
  baseUrl: 'https://cloud.langfuse.com', // or self-hosted URL

  // Version tag for filtering
  release: '1.0.0',

  // Default tags for all traces
  defaultTags: ['production', 'v2'],

  // Default metadata for all traces
  defaultMetadata: {
    service: 'chat-api',
    team: 'platform',
  },

  // Debug logging
  debug: false,
});

Data Capture Options

Control what data is sent to Langfuse:

typescript
const { hooks } = createLangfuseHooks({
  // Agent state snapshots (may be large)
  includeState: false,

  // Full conversation messages (may contain PII)
  includeMessages: false,

  // Tool arguments (default: true)
  includeToolArgs: true,

  // Tool results (may be large)
  includeToolResults: false,

  // LLM prompts (default: true)
  includeGenerationInput: true,

  // LLM responses (default: true)
  includeGenerationOutput: true,
});

Privacy

For production systems handling PII, consider disabling includeMessages, includeGenerationInput, and includeGenerationOutput to avoid logging sensitive user data.
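
As a minimal sketch, a privacy-focused setup disables exactly those options while still capturing trace structure, timings, and token counts:

typescript
const { hooks } = createLangfuseHooks({
  // Drop user-visible text; observations, metrics, and errors are still sent
  includeMessages: false,
  includeGenerationInput: false,
  includeGenerationOutput: false,
});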

Metadata & Tagging

Metadata enables filtering and attribution in Langfuse.

Passing Metadata at Execution

typescript
await executor.execute(agent, input, {
  // User attribution
  userId: 'user-123',

  // Session grouping (e.g., conversation threads)
  sessionId: 'conversation-456',

  // Tags for filtering
  tags: ['premium', 'mobile'],

  // Custom key-value metadata
  metadata: {
    environment: 'production',
    region: 'us-west-2',
    feature: 'chat',
  },
});

Using the Context Builder

For better ergonomics, use the fluent builder:

typescript
import { tracingContext } from '@helix-agents/tracing-langfuse';

const context = tracingContext()
  .user('user-123')
  .session('conversation-456')
  .tags('premium', 'mobile')
  .environment('production')
  .version('1.0.0')
  .metadata('region', 'us-west-2')
  .build();

await executor.execute(agent, input, context);

Typed Metadata

For common metadata patterns, use typed interfaces:

typescript
import { createTracingMetadata } from '@helix-agents/tracing-langfuse';

const metadata = createTracingMetadata({
  environment: 'production',
  version: '1.0.0',
  service: 'chat-api',
  region: 'us-west-2',
  tier: 'premium',
  source: 'mobile',
});

await executor.execute(agent, input, { metadata });

Trace Hierarchy

Every agent run creates a trace with nested observations:

mermaid
graph TB
    subgraph Trace ["trace: my-agent"]
        G1["generation: llm.generation<br/><i>model: gpt-4o, tokens: 1234</i>"]
        T1["span: tool:search<br/><i>args: { query: '...' }</i>"]
        T2["span: tool:calculate"]
        G2["generation: llm.generation"]

        subgraph SubAgent ["span: agent:sub-agent"]
            SG["generation: llm.generation"]
            ST["span: tool:fetch"]
        end
    end

    G1 --> T1 --> T2 --> G2 --> SubAgent
    SG --> ST

  • Trace - Root container, represents the full agent run
  • Generation - LLM call with model, tokens, timing
  • Span - Tool or sub-agent execution

Lifecycle Hooks

Customize observations with lifecycle hooks:

onAgentTraceCreated

Called when the root trace is created:

typescript
const { hooks } = createLangfuseHooks({
  onAgentTraceCreated: ({ runId, agentName, hookContext, updateTrace }) => {
    // Add environment info
    updateTrace({
      metadata: {
        nodeVersion: process.version,
        environment: process.env.NODE_ENV,
      },
    });
  },
});

onGenerationCreated

Called when an LLM generation starts:

typescript
const { hooks } = createLangfuseHooks({
  onGenerationCreated: ({ model, modelParameters, updateGeneration }) => {
    // Tag by provider
    const provider = model?.includes('gpt') ? 'openai' : 'anthropic';
    updateGeneration({
      metadata: { provider },
    });
  },
});

onToolCreated

Called when a tool span starts:

typescript
const { hooks } = createLangfuseHooks({
  onToolCreated: ({ toolName, toolCallId, updateTool }) => {
    // Categorize tools
    const category = toolName.startsWith('db_') ? 'database' : 'external';
    updateTool({
      metadata: { category },
    });
  },
});

onObservationEnding

Called before any observation ends:

typescript
const { hooks } = createLangfuseHooks({
  onObservationEnding: ({ type, observationId, durationMs, success, error }) => {
    if (!success) {
      console.error(`${type} failed after ${durationMs}ms:`, error);
    }
  },
});

Custom Attribute Extraction

Extract attributes from hook context for all observations:

typescript
const { hooks } = createLangfuseHooks({
  extractAttributes: (context) => ({
    stepCount: String(context.stepCount),
    hasParent: String(!!context.parentSessionId),
    // Access execution metadata
    region: context.metadata?.region,
  }),
});

Sub-Agent Tracing

Sub-agents automatically inherit tracing context:

typescript
const researchAgent = defineAgent({
  name: 'researcher',
  // ... config
});

const orchestrator = defineAgent({
  name: 'orchestrator',
  hooks, // Langfuse hooks
  tools: [
    createSubAgentTool({
      name: 'research',
      agent: researchAgent,
      description: 'Delegate research tasks',
    }),
  ],
});

In Langfuse, you'll see:

mermaid
graph TB
    subgraph Trace ["trace: orchestrator"]
        G1["generation: llm.generation"]
        subgraph SubAgent ["span: agent:researcher"]
            SG["generation: llm.generation"]
            ST["span: tool:search"]
        end
    end

    G1 --> SubAgent
    SG --> ST

Sub-agents inherit userId, sessionId, tags, and metadata from the parent.
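
For example, attribution set when executing the orchestrator also appears on the nested researcher observations (a sketch reusing the execution options shown earlier):

typescript
// userId, sessionId, and tags propagate to the agent:researcher span
// and its child generations and tool spans
await executor.execute(orchestrator, 'Research this topic', {
  userId: 'user-123',
  sessionId: 'conversation-456',
  tags: ['premium'],
});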

Serverless Considerations

Langfuse batches events and sends them asynchronously. In serverless environments, flush before the function returns:

typescript
// AWS Lambda / Vercel / Cloudflare Workers
export async function handler(event) {
  const { hooks, flush } = createLangfuseHooks();

  const agent = defineAgent({ hooks, /* ... */ });
  const executor = new JSAgentExecutor({ /* ... */ });

  const handle = await executor.execute(agent, event.message);
  const result = await handle.result;

  // IMPORTANT: Flush before returning
  await flush();

  return { statusCode: 200, body: JSON.stringify(result) };
}

For graceful shutdown in long-running processes:

typescript
const { hooks, shutdown } = createLangfuseHooks();

process.on('SIGTERM', async () => {
  await shutdown(); // Flushes and closes
  process.exit(0);
});

Self-Hosted Langfuse

To use a self-hosted Langfuse instance:

typescript
const { hooks } = createLangfuseHooks({
  baseUrl: 'https://langfuse.your-company.com',
  publicKey: 'pk-...',
  secretKey: 'sk-...',
});

Or via environment variables:

bash
LANGFUSE_BASEURL=https://langfuse.your-company.com
LANGFUSE_PUBLIC_KEY=pk-...
LANGFUSE_SECRET_KEY=sk-...

Troubleshooting

Traces Not Appearing

  1. Check credentials: Ensure LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY are set
  2. Enable debug mode: createLangfuseHooks({ debug: true }) (see the sketch after this list)
  3. Flush in serverless: Call await flush() before the function returns
  4. Check network: Verify connectivity to cloud.langfuse.com
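
A quick diagnostic sketch combining checks 2 and 3 (assuming debug mode logs event delivery, per the debug option under Configuration):

typescript
// Verbose logging plus an explicit flush to force event delivery
const { hooks, flush } = createLangfuseHooks({ debug: true });

// ... run the agent with these hooks ...

await flush(); // credential or network errors should now surface in the logs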

Missing Metadata

Metadata must be passed at execute() time, not in agent definition:

typescript
// WRONG: Agent definition doesn't support execution metadata
const agent = defineAgent({
  metadata: { userId: '123' }, // This won't work!
});

// CORRECT: Pass at execution time
await executor.execute(agent, input, {
  userId: '123',
  metadata: { custom: 'value' },
});

High Memory Usage

If tracing increases memory usage, reduce what is captured (the first three options are combined in the sketch after this list):

  1. Disable state capture: includeState: false
  2. Disable message capture: includeMessages: false
  3. Disable result capture: includeToolResults: false
  4. Check for stale runs (cleanup happens after 1 hour of inactivity)
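
A low-memory sketch using the capture options documented above:

typescript
const { hooks } = createLangfuseHooks({
  // Skip the largest payloads: state snapshots, messages, and tool results
  includeState: false,
  includeMessages: false,
  includeToolResults: false,
});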
