Detectors — Dunetrace docs

Seventeen structural detectors run automatically on every completed run. All thresholds configurable. Shadow mode lets you evaluate new detectors against real traffic before they page anyone.

What each detector catches

Seventeen structural detectors run automatically on every completed run — 16 Tier 1 detectors plus prompt injection signal detection. All thresholds are configurable via detectors.yml — no code changes, no rebuild.

Detector	Trigger	Severity
`PROMPT_INJECTION_SIGNAL`	Input matches known injection / jailbreak patterns	CRIT
`TOOL_LOOP`	Same tool called ≥3× in a 5-tool-call window	HIGH
`TOOL_THRASHING`	Agent alternates between exactly two tools	HIGH
`LLM_TRUNCATION_LOOP`	`finish_reason=length` fires ≥2 times	HIGH
`RETRY_STORM`	Same tool fails 3+ times in a row without recovery	HIGH
`EMPTY_LLM_RESPONSE`	Zero-length output with `finish_reason=stop`	HIGH
`CASCADING_TOOL_FAILURE`	3+ consecutive failures across 2+ distinct tools	HIGH
`SLOW_STEP`	Step duration exceeds 2× P75 baseline ¹ or static fallback (tool >15s, LLM >30s)	MED/HIGH
`CONTEXT_BLOAT`	Prompt tokens grow beyond 2× P75 baseline ¹ or static fallback (3× from first to last call)	MED
`GOAL_ABANDONMENT`	Tool use stops, then ≥4 consecutive LLM calls with no exit	MED
`REASONING_STALL`	LLM:tool-call ratio exceeds 2× P75 baseline ¹ or static fallback (≥4×)	MED/HIGH
`STEP_COUNT_INFLATION`	Run used >2× P75 step count for this agent ¹	MED
`FIRST_STEP_FAILURE`	Error or empty output at step ≤2	MED
`RAG_EMPTY_RETRIEVAL`	Retrieval returned 0 results or relevance <0.3, agent answered anyway	MED
`TOOL_AVOIDANCE`	Final answer without calling available tools	MED
`COST_SPIKE`	Total token consumption exceeds 3× P75 baseline ¹ or static fallback (>50,000 tokens)	MED
`SESSION_LATENCY`	Total wall-clock run duration exceeds 3× P75 baseline ¹ or static fallback (>5 min)	MED

ℹ

¹ Six detectors use per-agent learned baselines: STEP_COUNT_INFLATION, SLOW_STEP, CONTEXT_BLOAT, REASONING_STALL, COST_SPIKE, and SESSION_LATENCY. P75 is computed from the last 50 successfully completed runs for the same agent_id + agent_version. Each detector falls back to its static threshold until at least 20 such runs exist, then switches to the adaptive baseline automatically. Tune the multiplier per agent with inflation_factor in detectors.yml.

Tuning thresholds

Edit detectors.yml in the repo root.

default:
  tool_loop:
    threshold: 2        # fire if same tool called ≥N times in window
  context_bloat:
    growth_factor: 4.0  # last/first prompt token ratio to trigger

web-research:
  tool_loop:
    threshold: 5        # search agents legitimately repeat queries

Named sections match agent_id and inherit from default, overriding only what you specify. Restart the detector to apply:

docker compose restart detector

Shadow mode

Every signal is stored with a shadow flag. The alerts worker only delivers signals where shadow = false.

All 17 built-in detectors ship live. Custom detectors start in shadow mode — signals stored and visible in the dashboard, but no Slack or webhook alert fires — until you add them to LIVE_DETECTORS:

# services/detector/detector_svc/db.py
LIVE_DETECTORS: set[str] = {
    "TOOL_LOOP",
    "YOUR_NEW_DETECTOR",   # promote once precision > 80%
    …
}

This lets you validate a new detector against real traffic before it pages anyone.

Shadow signals in the dashboard

The Alerts page surfaces shadow signals in a dedicated section below the live alert groups. Dashed border, reduced opacity, SHADOW badge. The section only appears when at least one shadow signal exists.

curl "http://localhost:8002/v1/agents/my-agent/signals?include_shadow=true" \
  -H "Authorization: Bearer dt_dev_test"

How detection works

Detector worker polls Postgres every 5 seconds for completed or stalled runs
Fetches all events for that run from events
Replays them into a RunState — tool calls, LLM calls, retrievals, durations
Runs all 16 Tier 1 detectors against the state (PROMPT_INJECTION_SIGNAL is extracted from the run.started payload — detected by the SDK on raw input before hashing)
Writes any triggered FailureSignal rows
Marks the run processed in processed_runs

Detection adds zero latency to the agent — it runs entirely after the run completes.