Custom Spans
Wrap any function with traced() to capture duration, errors, and parent-child structure as a span.
ks.wrap() covers the LLM and tool-call path automatically. For the rest of your agent's work — file I/O, database queries, custom orchestration steps — use traced() to capture spans manually.
Spans nest correctly via async-context propagation. A traced() call inside another traced() call automatically links to its parent, building a tree you can inspect in the trace viewer.
Three forms
The traced() API supports three styles. Pick whichever reads cleanest in your code:
Decorator form (preferred)
Wrap a function once, reuse as a normal function:
import { traced } from "@polarityinc/polarity-keystone";
const writeConfig = traced(async (cfg: Config) => {
await fs.writeFile("config.json", JSON.stringify(cfg));
return "ok";
}, { name: "write_config" });
await writeConfig(myConfig); // emits a span named "write_config"
Callback form (legacy)
Equivalent behavior, useful when you don't want to declare a function:
const result = await traced("write_config", async () => {
await fs.writeFile("config.json", JSON.stringify(config));
return "ok";
});
Context-manager form (Python only, manual lifecycle)
from polarity_keystone import traced
with traced(name="my-step") as span:
result = do_work()
span.set_output(result)
Use this form for multi-step spans where you decide the output after the critical path completes.
TracedSpan (TypeScript, manual lifecycle)
import { TracedSpan } from "@polarityinc/polarity-keystone";
const span = new TracedSpan("my-step");
try {
const result = doWork();
span.setOutput(result);
} catch (err) {
span.fail(err);
throw err;
} finally {
span.end();
}
Use when neither the decorator nor the callback form fits.
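To make the manual lifecycle concrete, here is a minimal stand-in for a TracedSpan-like class (a sketch of the pattern, not Keystone's actual implementation; MiniSpan and its emit callback are illustrative names):

```typescript
// Minimal stand-in for a manual span: records name, start time, output,
// and status, and "emits" a single end event when end() is called.
type SpanEvent = {
  tool: string;
  phase: "end";
  duration_ms: number;
  status: "ok" | "error";
  output?: string;
};

class MiniSpan {
  private start = Date.now();
  private output?: string;
  private status: "ok" | "error" = "ok";

  constructor(private name: string, private emit: (e: SpanEvent) => void) {}

  setOutput(value: unknown): void {
    this.output = JSON.stringify(value); // last call wins, mirroring setOutput()
  }

  fail(err: unknown): void {
    this.status = "error";
    this.output = String(err);
  }

  end(): void {
    this.emit({
      tool: this.name,
      phase: "end",
      duration_ms: Date.now() - this.start,
      status: this.status,
      output: this.output,
    });
  }
}
```

The try/catch/finally shape from the example above applies unchanged: set output on success, fail() on error, end() in finally so the event is emitted exactly once.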
What gets reported
Every traced function emits two events: start and end.
{
"ts": "2026-04-28T22:00:00.000Z",
"event_type": "tool_call",
"tool": "write_config",
"phase": "start",
"span_id": "span_abc",
"parent_span_id": "span_xyz", // empty if top-level
"status": "ok"
}
{
"ts": "2026-04-28T22:00:00.123Z",
"event_type": "tool_call",
"tool": "write_config",
"phase": "end",
"span_id": "span_abc",
"parent_span_id": "span_xyz",
"duration_ms": 123,
"status": "ok",
"output": "<JSON-serialized return value, truncated to ~4KB>"
}
On error:
{
"phase": "end",
"status": "error",
"error_type": "tool_error",
"output": "TypeError: Cannot read property 'foo' of undefined"
}
Parent-child propagation
Nested traced() calls auto-link via async-context (Node's AsyncLocalStorage, Python's threading.local):
const outer = traced(async () => {
// outer span starts
await innerOne();
await innerTwo();
// outer span ends
}, { name: "outer" });
const innerOne = traced(async () => {
// automatically nested under "outer"
}, { name: "inner_one" });
const innerTwo = traced(async () => {
// also nested under "outer"
}, { name: "inner_two" });
await outer();
Trace tree:
outer (250ms)
├── inner_one (80ms)
└── inner_two (160ms)
The dashboard renders this as a flame graph.
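The linking mechanism can be sketched with Node's AsyncLocalStorage directly (a simplified model of the idea, not Keystone's actual implementation; miniTraced and SpanLink are illustrative names):

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Simplified model of parent-child linking: each traced call stores its
// span id in async context, and nested calls read it as their parent.
const currentSpan = new AsyncLocalStorage<string>();
let nextId = 0;

type SpanLink = { name: string; spanId: string; parentSpanId: string | null };
const links: SpanLink[] = [];

function miniTraced<T>(name: string, fn: () => Promise<T>): Promise<T> {
  const spanId = `span_${nextId++}`;
  // Read the enclosing span (if any) from async context.
  links.push({ name, spanId, parentSpanId: currentSpan.getStore() ?? null });
  // Run fn with this span as the active context, so children see it.
  return currentSpan.run(spanId, fn);
}
```

Because the context survives across await boundaries, an outer call with two nested miniTraced calls inside yields three links, the inner two carrying the outer span's id as parentSpanId, with no explicit plumbing.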
Combining wrap + traced
The most common pattern in real agents:
import { Keystone, traced } from "@polarityinc/polarity-keystone";
import Anthropic from "@anthropic-ai/sdk";
const ks = new Keystone();
const anthropic = ks.wrap(new Anthropic());
const writeFile = traced(async (path: string, content: string) => {
await fs.writeFile(path, content);
}, { name: "write_file" });
const runShell = traced(async (cmd: string) => {
return await exec(cmd);
}, { name: "run_shell" });
// Agent loop
async function runAgent(task: string) {
const resp = await anthropic.messages.create({
model: "claude-sonnet-4-5",
max_tokens: 2048,
messages: [{ role: "user", content: task }],
tools: [/* write_file, run_shell */],
});
// → llm_call event with tokens, cost, latency
// → one tool_use event per tool the model invoked
for (const block of resp.content) {
if (block.type === "tool_use") {
if (block.name === "write_file") await writeFile(block.input.path, block.input.content);
// → tool_call event with duration + output (parented under llm_call)
if (block.name === "run_shell") await runShell(block.input.command);
}
}
}
Trace tree per turn:
llm_call (2.3s, $0.043)
├── tool_use:write_file (logged by wrap)
│ └── tool_call:write_file (12ms — actual exec, logged by traced)
└── tool_use:run_shell (logged by wrap)
└── tool_call:run_shell (1.8s — actual exec, logged by traced)
The two-tier structure (tool_use from wrap + tool_call from traced) shows you both what the model asked for and what actually happened.
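One concrete use of the two tiers: spotting tools the model asked for that never actually ran. A sketch of that check (field names follow the event examples above; pairTiers is an illustrative helper, not a Keystone API):

```typescript
type AgentEvent = {
  event_type: "tool_use" | "tool_call";
  tool: string;
  duration_ms?: number;
};

// For each tool name seen in a turn, report whether the model requested
// it (tool_use, from wrap) and whether it executed (tool_call, from traced).
function pairTiers(
  events: AgentEvent[]
): Array<{ tool: string; requested: boolean; executed: boolean }> {
  const tools = new Set(events.map((e) => e.tool));
  return [...tools].map((tool) => ({
    tool,
    requested: events.some((e) => e.event_type === "tool_use" && e.tool === tool),
    executed: events.some((e) => e.event_type === "tool_call" && e.tool === tool),
  }));
}
```

A row with requested: true, executed: false points at a dispatch bug in your agent loop rather than a model problem.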
Output capture
The decorator form captures the function's return value as output (JSON-serialized, truncated to ~4KB). Use this to make spans self-describing:
const fetchUser = traced(async (id: string) => {
const user = await db.query("SELECT * FROM users WHERE id = $1", [id]);
return { id, name: user.name, plan: user.plan };
}, { name: "fetch_user" });
// output: {"id":"alice","name":"Alice","plan":"pro"}For the callback form, return values are captured the same way. For TracedSpan / context-manager form, call setOutput() / set_output() explicitly.
Sandbox vs agent mode
Same routing as wrap():
- Sandbox mode — KEYSTONE_SANDBOX_ID set → POST /v1/sandboxes/:id/trace
- Agent mode — only KEYSTONE_API_KEY set → POST /v1/traces
- Neither — no-op
initTracing() (or wrap(), which calls initTracing() internally) sets up the routing. After that, every traced() call uses whatever destination is current.
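The routing rule above reduces to a simple precedence check on the two environment variables. A sketch of that rule (traceDestination is an illustrative helper, not the SDK's internal function):

```typescript
// Pick a trace destination per the sandbox/agent rules: sandbox id wins,
// then API key, then no-op (events are dropped).
function traceDestination(
  env: Record<string, string | undefined>
): string | null {
  const sandboxId = env.KEYSTONE_SANDBOX_ID;
  if (sandboxId) return `/v1/sandboxes/${sandboxId}/trace`;
  if (env.KEYSTONE_API_KEY) return "/v1/traces";
  return null; // neither set: tracing is a no-op
}
```

Note that "agent mode" applies only when the sandbox id is absent: if both variables are set, sandbox routing takes precedence.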
OpenTelemetry bridge
If you have an existing OTel tracer in your process, hand it to initTracing() via otelTracer:
import { trace as otelTrace } from "@opentelemetry/api";
ks.initTracing(undefined, { otelTracer: otelTrace.getTracer("my-app") });
// Now every traced() span is also an OTel span with gen_ai.* attributes.
This means:
- Spans appear in both Keystone and your OTel backend (Honeycomb, Tempo, Jaeger, etc.).
- The OTel attributes use gen_ai.* semantic conventions so they render natively.
- Use registerOtelFlush() to ensure OTel spans are flushed when Keystone tracing tears down.
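Conceptually, the bridge mirrors each finished span into a second sink. A dependency-free sketch of that idea (the OtelLikeTracer interface below is a stand-in for illustration, not the real @opentelemetry/api surface, and mirrorSpan is not a Keystone function):

```typescript
// Stand-in for an OTel-like tracer: anything with a startSpan()/end() pair.
interface OtelLikeSpan {
  setAttribute(key: string, value: string | number): void;
  end(): void;
}
interface OtelLikeTracer {
  startSpan(name: string): OtelLikeSpan;
}

// Mirror one finished Keystone-style span into the secondary tracer,
// carrying its name, duration, and status as attributes.
function mirrorSpan(
  tracer: OtelLikeTracer,
  name: string,
  durationMs: number,
  status: string
): void {
  const span = tracer.startSpan(name);
  span.setAttribute("gen_ai.operation.name", name); // gen_ai.*-style naming
  span.setAttribute("duration_ms", durationMs);
  span.setAttribute("status", status);
  span.end();
}
```

The real bridge does this automatically for every traced() span once otelTracer is supplied; nothing like mirrorSpan needs to appear in your code.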
Performance
traced() adds under 1ms of overhead per call. Trace posting is fire-and-forget, so posting failures never bubble into your code. Don't trace tight loops (e.g. wrapping the body of a million-iteration for loop in traced()); for any user-meaningful operation, the cost is negligible.
Patterns
Trace tool functions
const writeFile = traced(async (path, content) => { ... }, { name: "write_file" });
const readFile = traced(async (path) => { ... }, { name: "read_file" });
const runShell = traced(async (cmd) => { ... }, { name: "run_shell" });
const httpRequest = traced(async (req) => { ... }, { name: "http_request" });
Wrap every tool function once, reuse everywhere. Now your trace tree is self-documenting.
Trace orchestration phases
async function runAgent(task: string) {
return await traced(async () => {
const plan = await traced(async () => makePlan(task), { name: "plan" });
const result = await traced(async () => execute(plan), { name: "execute" });
return await traced(async () => verify(result), { name: "verify" });
}, { name: "agent.run" });
}
You see your three phases as siblings under agent.run, with their durations laid out in the dashboard.
Trace specific operations only
For agents where most work is already covered by wrap(), you might only trace heavy custom operations:
const ks = new Keystone();
const anthropic = ks.wrap(new Anthropic()); // covers LLM + tools
// Only trace the slow custom thing
const reranker = traced(async (docs, query) => {
// expensive embedding similarity, not an LLM call
return rerank(docs, query);
}, { name: "rerank" });Manual span control
For long-running operations where you want to emit events at multiple checkpoints:
import { TracedSpan } from "@polarityinc/polarity-keystone";
const span = new TracedSpan("ingest-batch");
try {
await loadFromS3();
span.setOutput({ phase: "loaded" }); // not actually emitted yet — overwrites .output
await validate();
await transform();
await writeToDB();
span.setOutput({ phase: "complete", rows: 1234 });
} catch (err) {
span.fail(err);
throw err;
} finally {
span.end(); // emits the end event with whatever was last set as output
}
The end event includes whatever setOutput was last called with.