How it works
The SDK buffers events in-process and ships them to the Dunetrace ingest service in the background: a drain loop runs every 200 ms, batched at 100 events per flush.
your TS agent → POST /v1/ingest → detector → dashboard + Slack alerts
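To make the buffering model concrete, here is a minimal sketch of an in-process buffer with a periodic drain loop. It is not the SDK's actual implementation: `EventBuffer`, `Sender`, and the choice to drop a batch when sending fails are assumptions made for illustration. Only the 200 ms interval and the batch size of 100 come from the description above.

```typescript
// Hypothetical sketch of an in-process event buffer; not the SDK's real code.
type BufferedEvent = Record<string, unknown>;
type Sender = (batch: BufferedEvent[]) => Promise<void>;

class EventBuffer {
  private queue: BufferedEvent[] = [];
  private timer: ReturnType<typeof setInterval> | null = null;
  private readonly send: Sender;
  private readonly batchSize: number;
  private readonly intervalMs: number;

  constructor(send: Sender, batchSize = 100, intervalMs = 200) {
    this.send = send;
    this.batchSize = batchSize;
    this.intervalMs = intervalMs;
  }

  // Hot path: synchronous, never awaits the network.
  push(event: BufferedEvent): void {
    this.queue.push(event);
  }

  // Starts the background drain loop (idempotent).
  start(): void {
    if (this.timer !== null) return;
    this.timer = setInterval(() => void this.drainOnce(), this.intervalMs);
  }

  // Ships at most one batch. A failed send is logged and the batch is dropped
  // (an assumption of this sketch), so this method never rejects.
  async drainOnce(): Promise<number> {
    const batch = this.queue.splice(0, this.batchSize);
    if (batch.length === 0) return 0;
    try {
      await this.send(batch);
    } catch (err) {
      console.warn("[sketch] dropped batch:", err);
    }
    return batch.length;
  }

  // Stops the loop and drains whatever is left, one batch at a time.
  async shutdown(): Promise<void> {
    if (this.timer !== null) clearInterval(this.timer);
    this.timer = null;
    while (this.queue.length > 0) await this.drainOnce();
  }
}
```

Because `push` only appends to an array, agent code never waits on telemetry. Awaiting a final shutdown gives the remaining events a chance to be sent before the process exits.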
Prerequisites
- Dunetrace backend running (docker compose up -d)
- Node 18+ (built-in fetch and AsyncLocalStorage required)
With AUTH_MODE=dev, no API key is needed. API keys are only required for production.
1. Install
npm install dunetrace
Set environment variables:
DUNETRACE_ENDPOINT=http://localhost:8001 # ingest service (port 8001, not 8002)
DUNETRACE_API_KEY= # empty for local dev
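If you read these variables yourself (for example in a manual client), a small resolver keeps the defaults in one place. This is a sketch only: `resolveConfig` and `DunetraceConfig` are hypothetical names, not part of the dunetrace package. The default endpoint and the empty local-dev key are taken from the variables above.

```typescript
// Hypothetical helper; not part of the dunetrace package.
interface DunetraceConfig {
  endpoint: string;
  apiKey: string;
}

function resolveConfig(env: Record<string, string | undefined>): DunetraceConfig {
  // Fall back to the local ingest service when the variable is unset or blank.
  const raw = env.DUNETRACE_ENDPOINT?.trim() || "http://localhost:8001";
  const endpoint = raw.replace(/\/+$/, ""); // drop trailing slashes
  const apiKey = env.DUNETRACE_API_KEY?.trim() ?? ""; // empty is fine for local dev
  // new URL throws a TypeError on a malformed value, so misconfiguration fails fast.
  new URL(endpoint);
  return { endpoint, apiKey };
}
```

In an application this would typically be called once at startup as `resolveConfig(process.env)`.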
2. Auto-instrument your LLM client (recommended)
Call dt.wrapOpenAI() or dt.wrapAnthropic() once and every LLM call inside a dt.run() context is tracked automatically. Use dt.tool() to wrap tool functions.
import { Dunetrace } from "dunetrace";
import OpenAI from "openai";
const dt = new Dunetrace();
const openai = dt.wrapOpenAI(new OpenAI()); // patch once at startup
const search = dt.tool(webSearch); // wrap tools once at startup
await dt.run("my-ts-agent", { model: "gpt-4o", tools: ["web_search"] }, async (run) => {
  const response = await openai.chat.completions.create({ model: "gpt-4o", messages });
  // ↑ llm.called + llm.responded emitted automatically (tokens, latency, finish_reason)
  const results = await search(query);
  // ↑ tool.called + tool.responded emitted automatically
  run.finalAnswer();
});
await dt.shutdown();
Anthropic:
import Anthropic from "@anthropic-ai/sdk";
const anthropic = dt.wrapAnthropic(new Anthropic());
await dt.run("my-agent", { model: "claude-3-5-haiku-20241022" }, async (run) => {
  const response = await anthropic.messages.create({ model: "claude-3-5-haiku-20241022", max_tokens: 1024, messages });
  run.finalAnswer();
});
wrapOpenAI / wrapAnthropic skip calls with stream: true. For streamed calls, use run.llmCalled / run.llmResponded manually.
3. dt.tool() and dt.trace()
dt.tool(fn, name?) wraps any sync or async function to auto-emit tool.called / tool.responded. No-op when called outside a dt.run() context.
dt.trace(fn, agentId?, opts?) wraps an async function so it automatically opens and closes a run each time it is called. The first argument is used as userInput. Calls run.finalAnswer() on clean return.
const agent = dt.trace(myAgent, "my-agent", { model: "gpt-4o", tools: ["web_search"] });
const answer = await agent("What is the capital of France?");
await dt.shutdown();
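The "no-op outside a run" behaviour of a tool wrapper can be sketched with AsyncLocalStorage. This is an illustration, not the package's source: `wrapTool`, `ToolRecorder`, and `activeRun` are hypothetical names. Unlike dt.tool() as described above, this simplified wrapper always returns a Promise, even for sync functions.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical minimal recorder standing in for a run object.
interface ToolRecorder {
  toolCalled(name: string): void;
  toolResponded(name: string, success: boolean, latencyMs: number): void;
}

// Holds the recorder for the current async context, if any.
const activeRun = new AsyncLocalStorage<ToolRecorder>();

function wrapTool<A extends unknown[], R>(
  fn: (...args: A) => R | Promise<R>,
  name: string = fn.name || "anonymous_tool",
): (...args: A) => Promise<R> {
  return async (...args: A): Promise<R> => {
    const run = activeRun.getStore();
    if (!run) return fn(...args); // outside a run: behave like the plain function
    run.toolCalled(name);
    const t0 = Date.now();
    try {
      const result = await fn(...args);
      run.toolResponded(name, true, Date.now() - t0);
      return result;
    } catch (err) {
      run.toolResponded(name, false, Date.now() - t0);
      throw err; // the caller still sees the original error
    }
  };
}
```

Wrapping once at startup works because the recorder is looked up at call time, not at wrap time.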
4. getCurrentRun()
Returns the active DunetraceRun for the current async context (via AsyncLocalStorage), or null. Use inside helpers to access the run without prop drilling.
import { getCurrentRun } from "dunetrace";
async function dbQuery(sql: string) {
  const run = getCurrentRun();
  if (run) run.toolCalled("db_query", { sql });
  const result = await db.query(sql);
  if (run) run.toolResponded("db_query", true, result.length);
  return result;
}
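The reason a context-based lookup is safe under concurrency is that AsyncLocalStorage keeps a separate store per async call chain. The sketch below demonstrates that property with plain Node APIs. `RunInfo`, `currentRun`, and `helper` are illustrative names, not SDK exports.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

interface RunInfo {
  runId: string;
}

const storage = new AsyncLocalStorage<RunInfo>();

// Mirrors the "run or null" contract described above.
function currentRun(): RunInfo | null {
  return storage.getStore() ?? null;
}

// A helper that has no run parameter but can still find the active run,
// even after awaiting a timer.
async function helper(delayMs: number): Promise<string | null> {
  await new Promise((resolve) => setTimeout(resolve, delayMs));
  return currentRun()?.runId ?? null;
}

async function demo(): Promise<(string | null)[]> {
  // Two runs interleave; each helper still sees only its own run.
  const [a, b] = await Promise.all([
    storage.run({ runId: "run-a" }, () => helper(20)),
    storage.run({ runId: "run-b" }, () => helper(5)),
  ]);
  const outside = await helper(0); // no active run here
  return [a, b, outside];
}
```

Calling `demo()` resolves to `["run-a", "run-b", null]`: the interleaved runs do not leak into each other, and code outside any run sees null.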
5. Manual tracking (full control)
Use run.llmCalled / run.llmResponded directly for fine-grained control:
import { createHash, randomUUID } from "crypto";
const ENDPOINT = process.env.DUNETRACE_ENDPOINT ?? "http://localhost:8001";
const API_KEY = process.env.DUNETRACE_API_KEY ?? "";
// One-way SHA-256 digest. Base64 is reversible and would leak the raw value.
const sha256 = (text: string): string => createHash("sha256").update(text).digest("hex");
type EventType =
  | "run.started" | "run.completed" | "run.errored"
  | "llm.called" | "llm.responded"
  | "tool.called" | "tool.responded"
  | "retrieval.called" | "retrieval.responded"
  | "external.signal";
interface AgentEvent {
  event_type: EventType;
  run_id: string;
  agent_id: string;
  agent_version: string;
  step_index: number;
  timestamp: number; // seconds since the Unix epoch
  payload: Record<string, unknown>;
  parent_run_id?: string | null;
}
export class DunetraceRun {
  readonly runId: string = randomUUID();
  private step = 0;
  private events: AgentEvent[] = [];
  constructor(private readonly agentId: string, private readonly version: string) {}
  // Appends one event and advances the step counter.
  private emit(type: EventType, payload: Record<string, unknown>): void {
    this.step++;
    this.events.push({
      event_type: type, run_id: this.runId, agent_id: this.agentId,
      agent_version: this.version, step_index: this.step,
      timestamp: Date.now() / 1000, payload,
    });
  }
  llmCalled(model: string, promptTokens = 0): void {
    this.emit("llm.called", { model, prompt_tokens: promptTokens });
  }
  llmResponded(opts: { completionTokens?: number; latencyMs?: number; finishReason?: string; outputLength?: number }): void {
    this.emit("llm.responded", {
      completion_tokens: opts.completionTokens ?? 0,
      latency_ms: opts.latencyMs ?? 0,
      finish_reason: opts.finishReason ?? "stop",
      output_length: opts.outputLength ?? 0,
    });
  }
  toolCalled(toolName: string, args: Record<string, unknown> = {}): void {
    this.emit("tool.called", {
      tool_name: toolName,
      args_hash: sha256(JSON.stringify(args)), // only the digest is transmitted
    });
  }
  toolResponded(toolName: string, success: boolean, outputLength = 0, latencyMs = 0, error?: string): void {
    const payload: Record<string, unknown> = { tool_name: toolName, success, output_length: outputLength, latency_ms: latencyMs };
    if (error) payload["error_hash"] = sha256(error);
    this.emit("tool.responded", payload);
  }
  retrievalCalled(indexName: string, queryHash = ""): void {
    this.emit("retrieval.called", { index_name: indexName, query_hash: queryHash });
  }
  retrievalResponded(indexName: string, resultCount: number, topScore?: number, latencyMs = 0): void {
    this.emit("retrieval.responded", { index_name: indexName, result_count: resultCount, top_score: topScore ?? null, latency_ms: latencyMs });
  }
  externalSignal(signalName: string, source = "", meta: Record<string, unknown> = {}): void {
    // source is included only when non-empty
    this.emit("external.signal", { signal_name: signalName, ...(source ? { source } : {}), ...meta });
  }
  finalAnswer(): void {
    this.emit("run.completed", { exit_reason: "final_answer", total_steps: this.step });
  }
  getEvents(): AgentEvent[] { return this.events; }
}
export class Dunetrace {
  // Runs fn inside a tracked run and sends all events once it settles.
  async run(
    agentId: string,
    opts: { model?: string; tools?: string[] },
    fn: (run: DunetraceRun) => Promise<void>,
  ): Promise<void> {
    const version = opts.model ?? "unknown";
    const run = new DunetraceRun(agentId, version);
    const startEvent: AgentEvent = {
      event_type: "run.started", run_id: run.runId, agent_id: agentId,
      agent_version: version, step_index: 0, timestamp: Date.now() / 1000,
      payload: { model: opts.model ?? "unknown", tools: opts.tools ?? [] },
    };
    try {
      await fn(run);
    } catch (err) {
      await this._flush(agentId, [startEvent, ...run.getEvents(), {
        event_type: "run.errored", run_id: run.runId, agent_id: agentId,
        agent_version: version, step_index: run.getEvents().length + 1,
        timestamp: Date.now() / 1000,
        // Thrown values are not always Error instances (or even objects).
        payload: { error_type: err instanceof Error ? err.name : "Error" },
      }]);
      throw err;
    }
    await this._flush(agentId, [startEvent, ...run.getEvents()]);
  }
  // Telemetry must never break the agent: failures are logged, not rethrown.
  private async _flush(agentId: string, events: AgentEvent[]): Promise<void> {
    try {
      const res = await fetch(`${ENDPOINT}/v1/ingest`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ api_key: API_KEY, agent_id: agentId, events }),
      });
      // fetch rejects only on network errors, so surface HTTP failures too.
      if (!res.ok) console.warn(`[dunetrace] Failed to flush events: HTTP ${res.status}`);
    } catch (err) {
      console.warn("[dunetrace] Failed to flush events:", err);
    }
  }
}
Basic example
import { Dunetrace } from "./dunetrace";
const dt = new Dunetrace();
await dt.run("my-ts-agent", { model: "gpt-4o", tools: ["web_search"] }, async (run) => {
  run.llmCalled("gpt-4o", 150);
  const t0 = Date.now();
  const response = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: query }],
  });
  run.llmResponded({
    completionTokens: response.usage?.completion_tokens,
    latencyMs: Date.now() - t0,
    finishReason: response.choices[0].finish_reason ?? "stop",
    outputLength: response.choices[0].message.content?.length,
  });
  run.toolCalled("web_search", { query });
  const t1 = Date.now();
  const results = await webSearch(query);
  run.toolResponded("web_search", true, results.length, Date.now() - t1);
  run.finalAnswer();
});
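Streamed LLM calls are skipped by the wrappers, so they need the same manual pattern. The sketch below emits llm.called before consuming the stream and a single llm.responded after the last chunk. `trackStream`, `LlmRecorder`, and the `StreamChunk` shape are assumptions for illustration; adapt the chunk fields to whatever your streaming client actually yields.

```typescript
// Hypothetical subset of the run methods used here.
interface LlmRecorder {
  llmCalled(model: string, promptTokens?: number): void;
  llmResponded(opts: { latencyMs?: number; finishReason?: string; outputLength?: number }): void;
}

// Assumed chunk shape; real clients differ.
interface StreamChunk {
  text: string;
  finishReason?: string;
}

async function trackStream(
  run: LlmRecorder,
  model: string,
  stream: AsyncIterable<StreamChunk>,
): Promise<string> {
  run.llmCalled(model);
  const t0 = Date.now();
  let output = "";
  let finishReason = "stop";
  for await (const chunk of stream) {
    output += chunk.text;
    if (chunk.finishReason) finishReason = chunk.finishReason;
  }
  // Emit once, after the stream is fully consumed, so latency covers the whole call.
  run.llmResponded({ latencyMs: Date.now() - t0, finishReason, outputLength: output.length });
  return output;
}
```

Only the output length is passed to the recorder. The accumulated text itself is returned to the caller and never recorded.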
RAG agent
await dt.run("rag-agent", { model: "gpt-4o" }, async (run) => {
  run.llmCalled("gpt-4o", 200);
  run.llmResponded({ finishReason: "tool_calls" });
  run.retrievalCalled("product-docs");
  const t0 = Date.now();
  const docs = await vectorStore.search(query);
  run.retrievalResponded("product-docs", docs.length, docs[0]?.score, Date.now() - t0);
  run.llmCalled("gpt-4o", 600);
  run.llmResponded({ finishReason: "stop", completionTokens: 120 });
  run.finalAnswer();
});
Infrastructure signals
await dt.run("my-ts-agent", { model: "gpt-4o" }, async (run) => {
  try {
    run.toolCalled("external_api");
    const result = await callExternalApi();
    run.toolResponded("external_api", true, result.length);
  } catch (err) {
    if (isRateLimitError(err)) {
      run.externalSignal("rate_limit", "external_api", { http_status: 429 });
    }
    run.toolResponded("external_api", false, 0, 0, String(err));
  }
  run.finalAnswer();
});
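The example above calls isRateLimitError without defining it. One possible implementation is sketched below. It assumes the thrown error carries an HTTP status in a `status` or `statusCode` property, which is common among HTTP clients but not guaranteed for yours.

```typescript
// Hypothetical helper: the guide does not define isRateLimitError.
// Assumes the error object exposes the HTTP status as `status` or `statusCode`.
function isRateLimitError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as { status?: unknown; statusCode?: unknown };
  return e.status === 429 || e.statusCode === 429;
}
```

Taking `unknown` and narrowing keeps the helper safe for non-Error values such as thrown strings.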
6. Verify the integration
Run your agent once, then check localhost:3000 — the run should appear within 15 seconds under my-ts-agent.
To confirm detectors fire, trigger a tool loop:
await dt.run("my-ts-agent", { model: "gpt-4o", tools: ["web_search"] }, async (run) => {
  for (let i = 0; i < 5; i++) {
    run.llmCalled("gpt-4o", 200 + i * 50);
    run.llmResponded({ finishReason: "tool_calls" });
    run.toolCalled("web_search", { query: "same query every time" });
    run.toolResponded("web_search", true, 256);
  }
  run.finalAnswer();
});
This triggers TOOL_LOOP (same tool ≥3 times in a 5-call window). The signal should appear in the dashboard within ~15 seconds.
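To see why the loop above qualifies, the stated rule (same tool at least 3 times within a 5-call window) can be written as a sliding-window count. This is only an illustration of the rule as worded here. The real detector runs server-side and may key on more than the tool name, for example the argument hash. `hasToolLoop` is a hypothetical name.

```typescript
// Returns true if any key occurs at least `threshold` times within some
// run of `windowSize` consecutive calls. Keys are plain strings here; whether
// the real detector uses the tool name alone is an assumption.
function hasToolLoop(calls: string[], threshold = 3, windowSize = 5): boolean {
  const counts = new Map<string, number>();
  for (let i = 0; i < calls.length; i++) {
    const entering = calls[i];
    counts.set(entering, (counts.get(entering) ?? 0) + 1);
    if (i >= windowSize) {
      // The call that just slid out of the window no longer counts.
      const leaving = calls[i - windowSize];
      counts.set(leaving, (counts.get(leaving) ?? 1) - 1);
    }
    if ((counts.get(entering) ?? 0) >= threshold) return true;
  }
  return false;
}
```

Five consecutive `web_search` calls reach the threshold on the third call, while three calls to the same tool spread across more than five calls do not.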
7. Deploy markers
Fire-and-forget deploy markers let the dashboard overlay release boundaries on detector rate charts.
dt.markDeploy("my-ts-agent", "v1.4.2", { commit: "abc123f", environment: "production" });
8. Manual client (no npm)
If you prefer not to use npm, a self-contained client can be copied into your project. The npm package is recommended — it adds background buffering, dt.tool(), dt.trace(), getCurrentRun(), and dt.markDeploy(). The manual client sends all events synchronously at run completion.
See the full manual client source in the GitHub docs.
RunContext API reference
| Method | When to call |
|---|---|
| run.llmCalled(model, promptTokens?) | Before each LLM API call |
| run.llmResponded({ completionTokens?, latencyMs?, finishReason?, outputLength? }) | After LLM responds |
| run.toolCalled(toolName, args?) | Before each tool execution |
| run.toolResponded(toolName, success, outputLength?, latencyMs?, error?) | After tool returns |
| run.retrievalCalled(indexName, queryHash?) | Before vector search |
| run.retrievalResponded(indexName, resultCount, topScore?, latencyMs?) | After retrieval returns |
| run.externalSignal(signalName, source?, meta?) | Rate limits, cache misses, upstream errors |
| run.finalAnswer() | When agent produces its final output |
What is and isn't captured
Transmitted (safe metadata only): model names, token counts, latencies, finish reasons, tool names, success/failure, output lengths, retrieval index names, result counts, top scores.
Never transmitted (hashed in-process): user input text, LLM prompts and completions, tool arguments and outputs (SHA-256 hashed; raw values never leave your process), error messages.
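In-process hashing of the kind described above can be sketched with Node's built-in crypto module. The helper names and the key-sorting canonicalization are assumptions of this sketch, not necessarily what the SDK does. Sorting keys simply makes `{a, b}` and `{b, a}` hash to the same value.

```typescript
import { createHash } from "node:crypto";

// Serializes with sorted object keys so equal arguments hash identically.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalize(obj[k])}`)
      .join(",");
    return `{${body}}`;
  }
  // JSON.stringify yields undefined for undefined and functions.
  return JSON.stringify(value) ?? "null";
}

// Hex-encoded SHA-256 digest of a UTF-8 string.
function sha256Hex(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Only this digest would be transmitted; the raw arguments stay local.
function hashArgs(args: Record<string, unknown>): string {
  return sha256Hex(canonicalize(args));
}
```

Note that an encoding such as base64 is not a hash: it can be decoded back to the original text, so it would not satisfy the "never transmitted" guarantee.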
Troubleshooting
No runs appear in the dashboard
- Check DUNETRACE_ENDPOINT points to the ingest service (port 8001, not 8002).
- Confirm the backend is healthy: curl http://localhost:8001/health
- Check the Node console for [dunetrace] Failed to flush events warnings.
Token counts missing
Pass completionTokens and promptTokens if your LLM client exposes them — they are optional but improve CONTEXT_BLOAT and LLM_TRUNCATION_LOOP detection accuracy.
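A small adapter can pull token counts out of a provider response before passing them to run.llmCalled and run.llmResponded. `extractTokens` is a hypothetical helper. The field names it reads are those of the usage objects returned by the OpenAI chat completions API (prompt_tokens, completion_tokens) and the Anthropic messages API (input_tokens, output_tokens).

```typescript
interface TokenCounts {
  promptTokens: number;
  completionTokens: number;
}

// Hypothetical helper: accepts an OpenAI-style or Anthropic-style usage object
// and falls back to 0 when a field is missing or not a finite number.
function extractTokens(usage: object | null | undefined): TokenCounts {
  const u = (usage ?? {}) as Record<string, unknown>;
  const num = (v: unknown): number => (typeof v === "number" && Number.isFinite(v) ? v : 0);
  return {
    promptTokens: num(u.prompt_tokens ?? u.input_tokens),
    completionTokens: num(u.completion_tokens ?? u.output_tokens),
  };
}
```

With the manual API this would be used as `const t = extractTokens(response.usage)`, then `t.completionTokens` passed to run.llmResponded. The prompt count is only known after the response, so pass an estimate to run.llmCalled if you have one.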
Detectors fire too aggressively
Tune thresholds in detectors.yml on the server — see the detector reference.