find-traces
Analyze OpenTelemetry distributed traces from Axiom. Use when investigating a trace ID, finding traces by criteria (errors, latency, service), or debugging distributed system issues.
| Field | Value |
|---|---|
| name | find-traces |
| description | Analyze OpenTelemetry distributed traces from Axiom. Use when investigating a trace ID, finding traces by criteria (errors, latency, service), or debugging distributed system issues. |
| compatibility | Requires authenticated Axiom CLI (axiom) |
| user-invocable | true |
| context | fork |
| allowed-tools | Bash(axiom query *), Bash(axiom dataset list), Bash(axiom dataset list *), Bash(axiom config get *), Read, Grep, Glob |
Analyze OpenTelemetry distributed traces to identify errors, latency issues, and root causes.
When invoked with a trace ID (e.g., /find-traces abc123...), the ID is available as $ARGUMENTS.
First, find trace datasets:
axiom dataset list -f json
Look for datasets containing trace data (often named *traces*, *spans*, or otel-*).
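The naming hint above can be applied mechanically once the dataset names are in hand. The following is a minimal sketch, assuming the names have already been extracted from the CLI's JSON output into a plain list; the function name `likely_trace_datasets` and the sample names are hypothetical.

```python
from fnmatch import fnmatch

# Name patterns taken from the hint above; real dataset names may differ.
TRACE_PATTERNS = ("*traces*", "*spans*", "otel-*")


def likely_trace_datasets(names):
    """Return dataset names that match any common trace naming pattern."""
    return [
        name
        for name in names
        if any(fnmatch(name.lower(), pattern) for pattern in TRACE_PATTERNS)
    ]


# Hypothetical dataset names, for illustration only.
print(likely_trace_datasets(["prod-traces", "nginx-logs", "otel-demo", "app_spans"]))
```

A name match is only a hint. Confirm the candidate with getschema before querying it.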
Always verify field names first:
axiom query "['<trace-dataset>'] | getschema" --start-time -1h
axiom query "['<dataset>']
| where trace_id == '<TRACE_ID>'
| sort by _time asc
| limit 100" --start-time -1h -f json
axiom query "['<dataset>']
| where _time >= ago(1h)
| extend error = coalesce(ensure_field(\"error\", typeof(bool)), false)
| where error == true
| summarize
start_time = min(_time),
total_duration = max(duration),
span_count = count(),
error_count = countif(error),
services = make_set(['service.name']),
root_operation = arg_min(_time, name)
by trace_id
| sort by start_time desc
| limit 20" --start-time -1h -f json
axiom query "['<dataset>']
| where _time >= ago(1h)
| where duration >= 1000000000
| summarize
start_time = min(_time),
total_duration = max(duration),
span_count = count(),
services = make_set(['service.name'])
by trace_id
| sort by total_duration desc
| limit 20" --start-time -1h -f json
axiom query "['<dataset>']
| where _time >= ago(1h)
| where ['service.name'] == '<SERVICE>'
| summarize
start_time = min(_time),
total_duration = max(duration),
span_count = count(),
error_count = countif(error == true)
by trace_id
| sort by start_time desc
| limit 20" --start-time -1h -f json
axiom query "['<dataset>']
| where trace_id == '<TRACE_ID>'
| where error == true
| project _time, ['service.name'], name, duration, ['status.message']" --start-time -1h -f json
axiom query "['<dataset>']
| where trace_id == '<TRACE_ID>'
| project span_id, parent_span_id, ['service.name'], name, duration, error
| sort by duration desc" --start-time -1h -f json
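The span_id and parent_span_id columns from the query above are enough to rebuild the call hierarchy client-side. This is a rough sketch, assuming each result row has been loaded as a flat dict keyed by the projected field names; the exact JSON shape returned by the CLI is not shown in this document, and `render_span_tree` and the sample spans are hypothetical.

```python
from collections import defaultdict


def render_span_tree(spans):
    """Return indented lines showing the parent/child nesting of spans."""
    known_ids = {s["span_id"] for s in spans}
    children = defaultdict(list)
    roots = []
    for s in spans:
        parent = s.get("parent_span_id")
        # Treat a span as a root when its parent is empty or missing from the rows.
        if parent and parent in known_ids:
            children[parent].append(s)
        else:
            roots.append(s)

    lines = []

    def walk(span, depth):
        flag = " ERROR" if span.get("error") else ""
        ms = span["duration"] / 1_000_000  # nanoseconds to milliseconds
        indent = "  " * depth
        lines.append(f"{indent}{span['service.name']} {span['name']} ({ms:.1f} ms){flag}")
        for child in children[span["span_id"]]:
            walk(child, depth + 1)

    for root in roots:
        walk(root, 0)
    return lines


# Hypothetical spans, for illustration only.
spans = [
    {"span_id": "a", "parent_span_id": "", "service.name": "gateway",
     "name": "GET /checkout", "duration": 250_000_000, "error": False},
    {"span_id": "b", "parent_span_id": "a", "service.name": "orders",
     "name": "CreateOrder", "duration": 200_000_000, "error": True},
    {"span_id": "c", "parent_span_id": "b", "service.name": "db",
     "name": "INSERT orders", "duration": 150_000_000, "error": True},
]
print("\n".join(render_span_tree(spans)))
```

The deepest span flagged ERROR in the rendered tree is usually the first candidate for the root cause.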
| Field | Bracket? | Description |
|---|---|---|
| trace_id | No | 32-char trace identifier |
| span_id | No | 16-char span identifier |
| parent_span_id | No | Parent span (empty for root) |
| name | No | Operation name |
| duration | No | Duration in nanoseconds |
| kind | No | CLIENT, SERVER, INTERNAL, PRODUCER, CONSUMER |
| error | No | Boolean error flag |
| ['service.name'] | Yes | Service identifier |
| ['status.code'] | Yes | OK, ERROR, or nil |
| ['status.message'] | Yes | Error description |
| ['scope.name'] | Yes | Instrumentation library |
OTel durations are in nanoseconds:
| Human | Nanoseconds | Filter |
|---|---|---|
| 1 ms | 1,000,000 | duration >= 1000000 |
| 100 ms | 100,000,000 | duration >= 100000000 |
| 1 s | 1,000,000,000 | duration >= 1000000000 |
Convert for display:
| extend duration_ms = duration / 1000000.0
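When reporting results outside a query, the same conversion can be done client-side. The helper below is a small sketch that picks a readable unit for a nanosecond value; `format_duration` is a hypothetical name, not part of any Axiom tooling.

```python
def format_duration(ns):
    """Format a nanosecond duration using the largest unit that keeps the value >= 1."""
    if ns >= 1_000_000_000:
        return f"{ns / 1_000_000_000:.2f} s"
    if ns >= 1_000_000:
        return f"{ns / 1_000_000:.2f} ms"
    if ns >= 1_000:
        return f"{ns / 1_000:.2f} us"
    return f"{ns} ns"


print(format_duration(1_500_000_000))  # 1.50 s
print(format_duration(100_000_000))    # 100.00 ms
```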
Non-standard span attributes are stored in the ['attributes.custom'] map:
// Filter by custom attribute
| where ['attributes.custom']['user_id'] == "123"
// Aggregation requires explicit cast
| summarize count() by tostring(['attributes.custom']['tenant'])
Without tostring(), aggregations fail with "grouping by field of type unknown".
When working in a repository that matches the traced service, correlate trace data with source code to identify root causes.
1. Extract package/module path from ['scope.name']: github.com/org/repo/pkg/auth → pkg/auth
2. Find code from operation name: the name field often contains function names or HTTP routes
3. Trace the call chain
Note: Codebase correlation is optional. Proceed with trace-only analysis if code is unavailable or doesn't match the traced services.
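The scope.name mapping shown above (github.com/org/repo/pkg/auth → pkg/auth) can be sketched as a small helper. This assumes Go-style import paths of the form host/org/repo/...; other instrumentation libraries name their scopes differently, so anything that does not fit is returned unchanged. The name `package_path` is hypothetical.

```python
def package_path(scope_name):
    """Strip a leading host/org/repo prefix from a Go-style import path."""
    parts = scope_name.split("/")
    # Assumption: a first segment containing a dot is a host (e.g. github.com),
    # followed by one org segment and one repo segment.
    if len(parts) > 3 and "." in parts[0]:
        return "/".join(parts[3:])
    return scope_name


print(package_path("github.com/org/repo/pkg/auth"))  # pkg/auth
```

The result is a repository-relative directory that can be passed to Glob or Grep when the working repository matches the traced service.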
When analyzing a trace, provide:
## Trace Summary
- **Trace ID:** <id>
- **Duration:** <human-readable>
- **Services:** <list>
- **Outcome:** success/failure
## Sequence of Events
1. <Service> - <operation> (<duration>)
2. <Service> - <operation> (<duration>) ⚠️ ERROR
...
## Error Analysis
<What failed, when, why>
## Root Cause
<Deepest error and explanation>
## Codebase Locations (if applicable)
- **Service:** <service.name>
- **Package:** <scope.name>
- **Files:** <specific files to investigate>
## Recommended Actions
1. <Specific action>
2. <What to investigate next>
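The Trace Summary fields in the template above can be derived from the span rows of a single trace. This is a minimal sketch under the same assumption the queries in this document make, namely that the longest span duration approximates the total trace duration; `summarize_trace` and the flat-dict row shape are hypothetical.

```python
def summarize_trace(spans):
    """Compute the Trace Summary fields from one trace's span rows."""
    # Mirrors total_duration = max(duration) in the queries above.
    total_ns = max(s["duration"] for s in spans)
    services = sorted({s["service.name"] for s in spans})
    failed = any(s.get("error") for s in spans)
    return {
        "duration_ms": total_ns / 1_000_000,
        "services": services,
        "outcome": "failure" if failed else "success",
    }


# Hypothetical spans, for illustration only.
print(summarize_trace([
    {"service.name": "gateway", "duration": 250_000_000, "error": False},
    {"service.name": "orders", "duration": 200_000_000, "error": True},
]))
```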
For query syntax, invoke the axiom-apl skill which provides trace analysis patterns and duration unit guidance.