detect-anomalies
// Detect anomalies in Axiom datasets using statistical analysis. Use when looking for unusual patterns, volume spikes, outliers, or new error types in observability data.
| Field | Value |
|---|---|
| name | detect-anomalies |
| description | Detect anomalies in Axiom datasets using statistical analysis. Use when looking for unusual patterns, volume spikes, outliers, or new error types in observability data. |
| compatibility | Requires authenticated Axiom CLI (axiom) |
| user-invocable | true |
| context | fork |
| allowed-tools | Bash(axiom query *), Bash(axiom dataset list), Bash(axiom dataset list *), Bash(axiom config get *), Read, Grep, Glob |
Detect anomalies in Axiom datasets by comparing recent patterns to historical baselines using statistical analysis.
When the skill is invoked with a dataset name (e.g., /detect-anomalies logs), that name is available as $ARGUMENTS.
Statistical anomaly detection requires sufficient data:
If these aren't met, results may be misleading. Consider using simpler threshold-based alerting instead.
Always verify field names first:
axiom query "['<dataset>'] | getschema" --start-time -1h
Compare recent volume to baseline:
Calculate baseline (past 24h excluding last hour):
axiom query "['<dataset>']
| where _time between (ago(25h) .. ago(1h))
| summarize count() by bin(_time, 1h)
| summarize
avg_hourly = avg(count_),
stdev_hourly = stdev(count_)" --start-time -25h -f json
Check recent volume:
axiom query "['<dataset>']
| where _time >= ago(1h)
| summarize
current_count = count(),
current_hour = min(_time)" --start-time -1h -f json
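The two results above can be combined into a z-score outside of APL. The following is a minimal sketch in Python, assuming the values of `avg_hourly`, `stdev_hourly`, and `current_count` have been read out of the query output by hand; the function names and the sample numbers are hypothetical, and the handling of a zero standard deviation is one possible choice rather than part of this skill.

```python
def z_score(current: float, avg: float, stdev: float) -> float:
    """Return how many standard deviations `current` lies from `avg`."""
    if stdev == 0:
        # Assumption: a perfectly flat baseline has no spread, so any
        # deviation at all is treated as infinitely unusual.
        if current == avg:
            return 0.0
        return float("inf") if current > avg else float("-inf")
    return (current - avg) / stdev


def is_volume_anomaly(current: float, avg: float, stdev: float,
                      threshold: float = 2.0) -> bool:
    """Flag the hour when |z| exceeds the threshold (2 in this skill)."""
    return abs(z_score(current, avg, stdev)) > threshold


# Hypothetical numbers standing in for current_count, avg_hourly, stdev_hourly.
print(z_score(1500, 1000, 200))            # 2.5
print(is_volume_anomaly(1500, 1000, 200))  # True
```

A positive score points to a volume spike and a negative one to a volume drop, so the sign is worth keeping in the report alongside the absolute value.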
Z-score calculation:
z_score = (current - avg) / stdev
|z_score| > 2 indicates anomaly
Find values that appeared recently but weren't seen before:
axiom query "['<dataset>']
| where _time >= ago(1h)
| summarize by error_code
| join kind=leftanti (
['<dataset>']
| where _time between (ago(25h) .. ago(1h))
| summarize by error_code
) on error_code" --start-time -25h -f json
Replace error_code with any categorical field (service, endpoint, status).
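To make the left anti-join easier to reason about, here is a small Python sketch of the same idea as a set difference: keep what appears in the recent window and drop anything the baseline window already contains. The function name and the sample values are hypothetical and only illustrate the semantics; the query above remains the way to run this against a dataset.

```python
def new_values(recent: set[str], baseline: set[str]) -> set[str]:
    """Values present in the recent window but absent from the baseline."""
    return recent - baseline


# Hypothetical distinct error codes from the last hour and the prior 24 hours.
recent_codes = {"TIMEOUT_ERROR", "NOT_FOUND", "CONNECTION_REFUSED"}
baseline_codes = {"NOT_FOUND", "RATE_LIMITED"}

print(sorted(new_values(recent_codes, baseline_codes)))
# ['CONNECTION_REFUSED', 'TIMEOUT_ERROR']
```

Values that exist only in the baseline (here `RATE_LIMITED`) are ignored, which matches the direction of the anti-join: it reports what is new, not what has disappeared.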
Find values outside normal distribution:
Calculate bounds:
axiom query "['<dataset>']
| where _time between (ago(25h) .. ago(1h))
| summarize
avg_val = avg(duration),
stdev_val = stdev(duration)
| extend
lower_bound = avg_val - 3 * stdev_val,
upper_bound = avg_val + 3 * stdev_val" --start-time -25h -f json
Find outliers:
axiom query "['<dataset>']
| where _time >= ago(1h)
| where duration < <lower_bound> or duration > <upper_bound>
| limit 100" --start-time -1h -f json
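The `<lower_bound>` and `<upper_bound>` placeholders have to be filled in from the first query's result. A minimal Python sketch of that step follows, assuming `avg_val` and `stdev_val` have been copied from the bounds query; the helper names and the sample numbers are hypothetical.

```python
def three_sigma_bounds(avg_val: float, stdev_val: float,
                       k: float = 3.0) -> tuple[float, float]:
    """Return (lower, upper) bounds at k standard deviations from the mean."""
    return avg_val - k * stdev_val, avg_val + k * stdev_val


def outlier_filter(field: str, lower: float, upper: float) -> str:
    """Build the APL `where` clause with the placeholders filled in."""
    return f"| where {field} < {lower} or {field} > {upper}"


# Hypothetical baseline: mean duration 120 with a standard deviation of 30.
lower, upper = three_sigma_bounds(avg_val=120.0, stdev_val=30.0)
print(lower, upper)
# 30.0 210.0
print(outlier_filter("duration", lower, upper))
# | where duration < 30.0 or duration > 210.0
```

For a field that cannot be negative, such as a duration, a wide spread can push the lower bound below zero; in that case only the upper bound can ever match.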
Find infrequent occurrences:
axiom query "['<dataset>']
| where _time >= ago(1h)
| summarize count() by error_message
| where count_ == 1" --start-time -1h -f json
Compare error rate to baseline:
axiom query "['<dataset>']
| where _time >= ago(6h)
| summarize
total = count(),
errors = countif(status >= 500)
by bin(_time, 15m)
| extend error_rate = errors * 100.0 / total
| sort by _time asc" --start-time -6h -f json
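The query returns one error rate per 15-minute bin but leaves the comparison itself to the reader. One way to do it, sketched in Python under stated assumptions: treat the most recent bin as "current" and the mean of the earlier bins as the baseline. That split, the function names, and the sample counts are all hypothetical choices, not something this skill prescribes.

```python
def error_rate(errors: int, total: int) -> float:
    """Error percentage for one bin; an empty bin counts as 0%."""
    return errors * 100.0 / total if total else 0.0


def compare_to_baseline(bins: list[tuple[int, int]]) -> dict[str, float]:
    """Compare the last bin's error rate to the mean rate of earlier bins.

    `bins` holds (total, errors) per time bin, oldest first.
    """
    if len(bins) < 2:
        raise ValueError("need at least one baseline bin and one current bin")
    rates = [error_rate(errors, total) for total, errors in bins]
    baseline = sum(rates[:-1]) / len(rates[:-1])
    current = rates[-1]
    return {"baseline": baseline, "current": current,
            "change": current - baseline}


# Hypothetical (total, errors) pairs for four consecutive bins.
bins = [(1000, 10), (1200, 12), (800, 8), (1000, 50)]
print(compare_to_baseline(bins))
# {'baseline': 1.0, 'current': 5.0, 'change': 4.0}
```

The three values map onto the Baseline, Current, and Change lines of the report. Averaging per-bin rates weights every bin equally regardless of traffic; pooling all baseline errors and totals first would weight by volume instead.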
Track percentile changes:
axiom query "['<dataset>']
| where _time >= ago(6h)
| summarize
p50 = percentile(duration, 50),
p95 = percentile(duration, 95),
p99 = percentile(duration, 99)
by bin(_time, 15m)
| sort by _time asc" --start-time -6h -f json
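A percentile series like this one still needs a rule for what counts as a change. As a rough sketch in Python, the latest bin can be compared with the median of the earlier bins, which is less sensitive to a single earlier spike than the mean. The 1.5x factor, the function names, and the sample values are hypothetical and would need tuning per dataset.

```python
from statistics import median


def latency_ratio(series: list[float]) -> float:
    """Ratio of the latest value to the median of all earlier values."""
    if len(series) < 2:
        raise ValueError("need at least one earlier bin and one current bin")
    baseline = median(series[:-1])
    if baseline == 0:
        raise ValueError("baseline median is zero; ratio is undefined")
    return series[-1] / baseline


def is_latency_spike(series: list[float], factor: float = 1.5) -> bool:
    """Flag the latest bin when it exceeds the baseline by `factor`.

    Assumption: 1.5 is an illustrative threshold, not a recommended default.
    """
    return latency_ratio(series) > factor


# Hypothetical p95 values per 15-minute bin, oldest first.
p95_by_bin = [190.0, 200.0, 210.0, 400.0]
print(latency_ratio(p95_by_bin))     # 2.0
print(is_latency_spike(p95_by_bin))  # True
```

Running the same check on the p50, p95, and p99 columns separately helps distinguish a broad slowdown (all three move) from a tail-only regression (only p99 moves).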
| Type | Detection Method | Indicates |
|---|---|---|
| Volume Spike | Z-score on count | Traffic surge, attack, incident |
| Volume Drop | Z-score on count | Outage, data collection issue |
| New Values | Left anti-join | New errors, new services |
| Statistical Outlier | 3-sigma rule | Extreme performance issue |
| Rare Events | Count = 1 | Unusual conditions |
| Error Spike | Error rate increase | Service degradation |
| Latency Spike | Percentile increase | Performance issue |
## Anomaly Report: <dataset>
### Summary
- Analysis period: <timeframe>
- Anomalies found: <count>
### Volume Anomalies
| Time | Count | Expected | Z-Score |
|------|-------|----------|---------|
| ... | ... | ... | ... |
### New Values
- Field: `error_code`
- New values: `TIMEOUT_ERROR`, `CONNECTION_REFUSED`
### Statistical Outliers
- Field: `duration`
- Outliers: <count> events above <threshold>
### Error Rate
- Baseline: X%
- Current: Y%
- Change: +Z%
### Recommendations
1. <Investigation action>
2. <Monitoring suggestion>
For query syntax, invoke the axiom-apl skill, which provides anomaly detection patterns and function documentation.