Tracing & Observability

Auto-Instrument

One call patches every importable LLM stack — OpenAI, Anthropic, Mistral, LangChain, Vercel AI SDK, LiteLLM, DSPy, and more.

auto_instrument() is the ergonomic shortcut for "I have multiple LLM providers and frameworks installed and I just want everything traced." One call detects what's importable in your process and patches each one.

It's what ks.wrap() calls under the hood when you pass aiSdk: or langchainCallbackManager:. You can also call it directly.

Why use it

Real applications mix providers:

  • An agent that uses Claude via the Anthropic SDK.
  • A summarization step that uses OpenAI via Vercel's AI SDK.
  • A tool-routing layer built on LangChain.

Wrapping each one individually means a separate import and a separate wrap() call for every stack. auto_instrument() does it in one call.

Calling

import { autoInstrument } from "@polarityinc/polarity-keystone";
 
const applied = autoInstrument({
  sandboxId: process.env.KEYSTONE_SANDBOX_ID,
  // Pass the modules you have imported — each is patched if found:
  aiSdk: aiSdk,                                    // import * as aiSdk from "ai"
  langchainCallbackManager: lc.callbackManager,    // a CallbackManager instance (see LangChain.js below)
});
// → ['openai', 'anthropic', 'ai-sdk.generateText', 'ai-sdk.streamText', 'langchain']

The function returns the labels of every layer it patched — useful for logging at startup.

What gets patched

TypeScript

The TS SDK auto-detects:

  • OpenAI (openai): chat.completions.create() (sync + streaming)
  • Anthropic (@anthropic-ai/sdk): messages.create() (sync + streaming)
  • Vercel AI SDK (ai): generateText, streamText, generateObject, streamObject
  • LangChain.js: wraps a CallbackManager instance to emit Keystone spans
  • Claude Agent SDK (@anthropic-ai/claude-agent-sdk): the agent's tool-call lifecycle hooks
  • Mistral (@mistralai/mistralai): chat() and stream()
  • Google GenAI (@google/genai): generateContent()

For modules that need explicit handles (Vercel AI SDK, LangChain), pass them in via aiSdk: / langchainCallbackManager:. The function can't import them on your behalf without making them required peer dependencies.
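For the stacks it can find on its own, detection amounts to a guarded optional require, analogous to the Python SDK's try/except guards described below. A hypothetical sketch — requireOptional is illustrative, not a Keystone export:

// Hypothetical sketch of optional-module detection, not the actual
// Keystone internals. Each stack is probed with a guarded require;
// a module that isn't installed is simply skipped.
import { createRequire } from "node:module";

const require = createRequire(import.meta.url);

function requireOptional(name: string): unknown {
  try {
    return require(name);
  } catch {
    return undefined;   // not installed — skip this stack
  }
}

const openaiMod = requireOptional("openai");
if (openaiMod) {
  // patch chat.completions.create() here
}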

Python

The Python SDK is more aggressive — it imports modules itself if they're installed:

  • openai
  • anthropic
  • mistralai
  • google.genai
  • litellm
  • langchain (installs a callback handler)
  • claude_agent_sdk
  • dspy (via a callback)

Each provider is guarded by try/except — missing modules are skipped silently.

Idempotent

auto_instrument() is safe to call multiple times. It tracks internally which modules have already been patched and skips re-patching:

auto_instrument(sandbox_id="sb-abc")     # patches everything
auto_instrument(sandbox_id="sb-abc")     # no-op, already patched
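A minimal sketch of what such a guard can look like — the patched registry and patchOnce helper here are hypothetical, not Keystone's actual internals:

// Hypothetical idempotency guard, not Keystone's real internals:
// a module-level registry remembers which stacks were already patched.
const patched = new Set<string>();

function patchOnce(label: string, apply: () => void): boolean {
  if (patched.has(label)) return false;   // already instrumented, skip
  apply();
  patched.add(label);
  return true;
}

patchOnce("openai", () => { /* monkey-patch chat.completions.create */ });
patchOnce("openai", () => { /* never runs */ });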

Sandbox id required

Unlike wrap() (which falls back to agent mode when no sandbox is set), auto_instrument() requires sandboxId — either as an arg or as KEYSTONE_SANDBOX_ID in the env. Without it, the call raises:

RuntimeError: auto_instrument requires sandbox_id (arg) or KEYSTONE_SANDBOX_ID env var

This is intentional. auto_instrument is a sandbox-context tool — for prod observability without a sandbox, use wrap() directly on each client (which routes to /v1/traces automatically).

Combining with wrap()

The "I just want everything" entry point:

import * as aiSdk from "ai";
import { Keystone } from "@polarityinc/polarity-keystone";
import OpenAI from "openai";
import Anthropic from "@anthropic-ai/sdk";
 
const ks = new Keystone();
 
ks.wrap(new OpenAI(), {
  aiSdk,                                       // also patches AI SDK
  langchainCallbackManager: lc.callbackManager, // a CallbackManager instance
});
 
ks.wrap(new Anthropic());                       // also wrapped explicitly

When wrap() runs with a sandbox id and you pass framework modules, it calls autoInstrument() for those modules in addition to wrapping the explicit client.

Or use observe() for the most ergonomic shape

ks.observe({
  clients: [new Anthropic(), new OpenAI()],
  aiSdk,
  langchainCallbackManager: lc.callbackManager,
});
// Returns: ['anthropic-client', 'openai-client', 'tracing', 'ai-sdk.generateText', 'langchain']

Wraps every named client, initializes tracing, auto-instruments everything else, all in one call. Returns the labels of what was instrumented.

This is the recommended entry point for new code: the "trace everything" line.

Per-framework details

Vercel AI SDK

import * as aiSdk from "ai";
import { openai } from "@ai-sdk/openai";   // model helper used below
ks.observe({ aiSdk });
 
// Now this is traced:
const result = await aiSdk.generateText({
  model: openai("gpt-4o"),
  prompt: "...",
  tools: { /* ... */ },
});

The patch hooks generateText, streamText, generateObject, streamObject and emits the same shape as direct OpenAI/Anthropic wrap.
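The structured-output helpers go through the same instrumented entry points. A small sketch using generateObject with a zod schema, reusing the imports above (model and prompt are placeholders):

import { z } from "zod";

// generateObject is patched too, so this structured call is traced
// exactly like the generateText call above.
const { object } = await aiSdk.generateObject({
  model: openai("gpt-4o"),
  schema: z.object({ summary: z.string() }),
  prompt: "Summarize the release notes.",
});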

LangChain.js

import { CallbackManager } from "@langchain/core/callbacks/manager";
import { ChatOpenAI } from "@langchain/openai";
 
const callbackManager = new CallbackManager();
ks.observe({ langchainCallbackManager: callbackManager });
 
const llm = new ChatOpenAI({ callbacks: callbackManager });
await llm.invoke([{ role: "user", content: "..." }]);
// → llm_call event posted to Keystone

The Keystone callback handler is installed onto the manager. Any chain or agent that uses the manager picks it up automatically.
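For example, a plain prompt-to-model pipe traced through the same manager, reusing the llm and callbackManager from the snippet above (callbacks can also be supplied per invoke):

import { ChatPromptTemplate } from "@langchain/core/prompts";

// Any runnable executed with this manager emits Keystone spans too.
const chain = ChatPromptTemplate
  .fromTemplate("Summarize in one line: {text}")
  .pipe(llm);

await chain.invoke(
  { text: "..." },
  { callbacks: callbackManager },   // per-invoke callbacks also work
);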

LangChain (Python)

from langchain_openai import ChatOpenAI
from polarity_keystone import auto_instrument
 
auto_instrument(sandbox_id="sb-abc")    # installs the global LC callback handler
 
llm = ChatOpenAI(model="gpt-4o")
llm.invoke([{"role": "user", "content": "..."}])
# → llm_call event posted

Python's auto-instrument uses LangChain's global add_global_callback — every chain in the process is traced.

LiteLLM (Python)

import litellm
from polarity_keystone import auto_instrument
 
auto_instrument(sandbox_id="sb-abc")
 
litellm.completion(model="gpt-4o", messages=[...])
# → llm_call event posted

auto_instrument registers Keystone in LiteLLM's callback hooks, so both completion() and acompletion() are traced.

DSPy (Python)

import dspy
from polarity_keystone import auto_instrument
 
auto_instrument(sandbox_id="sb-abc")
 
# DSPy modules' forward calls now emit spans, e.g. (DSPy 2.5+ API):
dspy.configure(lm=dspy.LM("openai/gpt-4o"))
qa = dspy.Predict("question -> answer")
qa(question="...")   # → llm_call event posted

Patterns

Application bootstrap

// app.ts — runs once at process startup
import * as aiSdk from "ai";
import { Keystone } from "@polarityinc/polarity-keystone";
 
export const ks = new Keystone();
ks.observe({ aiSdk });
// Now every LLM call in the process is traced.

Conditional in CI vs prod

const ks = new Keystone();
 
if (process.env.KEYSTONE_SANDBOX_ID) {
  // Inside an eval — use auto-instrument for full coverage
  ks.observe({ aiSdk, langchainCallbackManager });
} else {
  // Production — only wrap the specific clients you care about
  ks.wrap(new OpenAI());
  ks.wrap(new Anthropic());
}

In sandbox mode you want complete coverage; in agent mode you typically wrap only the clients you care about.

Audit what's instrumented

const labels = ks.observe({ /* ... */ });
console.log("[keystone] instrumented:", labels);
// → instrumented: ['openai-client', 'anthropic-client', 'tracing', 'ai-sdk.generateText']

Log at startup so you know what's covered.

Limitations

  • Cannot wrap clients constructed before auto_instrument runs. If you call new OpenAI() first and patch afterwards, that instance isn't traced. Wrap pre-existing clients with ks.wrap() to be safe (see the sketch after this list).
  • Patches are per-process. A new Node worker thread has its own copy; call again there.
  • Doesn't patch HTTP-level interceptors. If a framework uses a custom HTTP transport that bypasses the SDK, the wrap may miss it. The Go SDK's transport-level wrapping is the workaround for that.
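A minimal sketch of the explicit-wrap workaround for the first limitation, reusing the names from the examples above:

import * as aiSdk from "ai";
import OpenAI from "openai";
import { Keystone } from "@polarityinc/polarity-keystone";

const ks = new Keystone();
const preexisting = new OpenAI();   // constructed before any patching

ks.observe({ aiSdk });              // covers module-level entry points going forward
ks.wrap(preexisting);               // wraps this already-constructed instance directly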