with one click
detect-anomalies
// Detect anomalies in Axiom datasets using statistical analysis. Use when looking for unusual patterns, volume spikes, outliers, or new error types in observability data.
// Detect anomalies in Axiom datasets using statistical analysis. Use when looking for unusual patterns, volume spikes, outliers, or new error types in observability data.
APL query language reference for Axiom. Provides operators, functions, patterns, and CLI usage. Auto-invoked by specialized Axiom skills when writing or debugging APL queries.
Explore an Axiom dataset to understand its schema, fields, volume, and patterns. Use when discovering a new dataset, investigating data structure, or understanding what data is available.
Analyze OpenTelemetry distributed traces from Axiom. Use when investigating a trace ID, finding traces by criteria (errors, latency, service), or debugging distributed system issues.
| name | detect-anomalies |
| description | Detect anomalies in Axiom datasets using statistical analysis. Use when looking for unusual patterns, volume spikes, outliers, or new error types in observability data. |
| compatibility | Requires authenticated Axiom CLI (axiom) |
| user-invocable | true |
| context | fork |
| allowed-tools | Bash(axiom query *), Bash(axiom dataset list), Bash(axiom dataset list *), Bash(axiom config get *), Read, Grep, Glob |
Detect anomalies in Axiom datasets by comparing recent patterns to historical baselines using statistical analysis.
When invoked with a dataset name (e.g., /detect-anomalies logs), it's available as $ARGUMENTS.
Statistical anomaly detection requires sufficient data:
If these aren't met, results may be misleading. Consider using simpler threshold-based alerting instead.
Always verify field names first:
axiom query "['<dataset>'] | getschema" --start-time -1h
Compare recent volume to baseline:
Calculate baseline (past 24h excluding last hour):
axiom query "['<dataset>']
| where _time between (ago(25h) .. ago(1h))
| summarize count() by bin(_time, 1h)
| summarize
avg_hourly = avg(count_),
stdev_hourly = stdev(count_)" --start-time -25h -f json
Check recent volume:
axiom query "['<dataset>']
| where _time >= ago(1h)
| summarize
current_count = count(),
current_hour = min(_time)" --start-time -1h -f json
Z-score calculation:
z_score = (current - avg) / stdev|z_score| > 2 indicates anomalyFind values that appeared recently but weren't seen before:
axiom query "['<dataset>']
| where _time >= ago(1h)
| summarize by error_code
| join kind=leftanti (
['<dataset>']
| where _time between (ago(25h) .. ago(1h))
| summarize by error_code
) on error_code" --start-time -25h -f json
Replace error_code with any categorical field (service, endpoint, status).
Find values outside normal distribution:
Calculate bounds:
axiom query "['<dataset>']
| where _time between (ago(25h) .. ago(1h))
| summarize
avg_val = avg(duration),
stdev_val = stdev(duration)
| extend
lower_bound = avg_val - 3 * stdev_val,
upper_bound = avg_val + 3 * stdev_val" --start-time -25h -f json
Find outliers:
axiom query "['<dataset>']
| where _time >= ago(1h)
| where duration < <lower_bound> or duration > <upper_bound>
| limit 100" --start-time -1h -f json
Find infrequent occurrences:
axiom query "['<dataset>']
| where _time >= ago(1h)
| summarize count() by error_message
| where count_ == 1" --start-time -1h -f json
Compare error rate to baseline:
axiom query "['<dataset>']
| where _time >= ago(6h)
| summarize
total = count(),
errors = countif(status >= 500)
by bin(_time, 15m)
| extend error_rate = errors * 100.0 / total
| sort by _time asc" --start-time -6h -f json
Track percentile changes:
axiom query "['<dataset>']
| where _time >= ago(6h)
| summarize
p50 = percentile(duration, 50),
p95 = percentile(duration, 95),
p99 = percentile(duration, 99)
by bin(_time, 15m)
| sort by _time asc" --start-time -6h -f json
| Type | Detection Method | Indicates |
|---|---|---|
| Volume Spike | Z-score on count | Traffic surge, attack, incident |
| Volume Drop | Z-score on count | Outage, data collection issue |
| New Values | Left anti-join | New errors, new services |
| Statistical Outlier | 3-sigma rule | Extreme performance issue |
| Rare Events | Count = 1 | Unusual conditions |
| Error Spike | Error rate increase | Service degradation |
| Latency Spike | Percentile increase | Performance issue |
## Anomaly Report: <dataset>
### Summary
- Analysis period: <timeframe>
- Anomalies found: <count>
### Volume Anomalies
| Time | Count | Expected | Z-Score |
|------|-------|----------|---------|
| ... | ... | ... | ... |
### New Values
- Field: `error_code`
- New values: `TIMEOUT_ERROR`, `CONNECTION_REFUSED`
### Statistical Outliers
- Field: `duration`
- Outliers: <count> events above <threshold>
### Error Rate
- Baseline: X%
- Current: Y%
- Change: +Z%
### Recommendations
1. <Investigation action>
2. <Monitoring suggestion>
For query syntax, invoke the axiom-apl skill which provides anomaly detection patterns and function documentation.