splunk
// Query Splunk for GHES support bundle logs, Rails production logs, and security events. USE WHEN investigating GHES issues, searching Rails logs, or analyzing security audit trails.
| name | splunk |
| description | Query Splunk for GHES support bundle logs, Rails production logs, and security events. USE WHEN investigating GHES issues, searching Rails logs, or analyzing security audit trails. |
| metadata | {"triggers":["splunk","ghes","support bundle","support-bundle","bundle_id","enterprise-bundles","esbtools","esb-tools","use splunk","rails logs","enterprise"],"provides":["log-search","bundle-analysis","request-trace"],"requires":["splunk-tools"],"auto_load":false} |
Query Splunk for GHES bundles, Rails logs, pod logs, and security events via Tailscale.
When you have a gh.request_id, trace it across ALL indexes:
index=* "gh.request_id"="XXXX:YYYY:ZZZZ:AAAA:TIMESTAMP" earliest=-1h
| sort _time
| table _time, index, kube_pod, code.namespace, code.function, Body, status
Why? A single request may touch multiple services (rails, internal-api, babeld, etc.) indexed separately.
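To run that trace through the search tool rather than pasting SPL into the Splunk UI, a minimal sketch (the request ID is a placeholder):
# Trace a request ID across all indexes via the search tool (placeholder ID)
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
  --query 'index=* "gh.request_id"="XXXX:YYYY:ZZZZ:AAAA:TIMESTAMP" earliest=-1h | sort _time | table _time, index, kube_pod, code.namespace, code.function, Body, status'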
Splunk indexes Kubernetes pod logs with rich metadata:
index=rails kube_pod="unicorn-*" earliest=-15m
| stats count by kube_pod, kube_cluster, status
Key fields for pod investigation:
- kube_pod — Pod name
- kube_container — Container name
- kube_cluster — Cluster (e.g., dotcom-3-ash1-iad)
- kube_namespace — Namespace (e.g., github-production)
- kube_host — Node hostname
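These fields can be combined to keep a pod search narrow before widening it; a sketch, assuming unicorn pods in the github-production namespace:
# Narrow a pod-log search using the metadata fields above (example values)
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
  --query 'index=rails kube_namespace="github-production" kube_pod="unicorn-*" status>=500 earliest=-15m | stats count by kube_pod, kube_container, status'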
GHES support bundles are ingested via ESB-Tools into Splunk:
- Index: prod-esbtools
- Key fields: bundle_id, splunk_ingest_id
- Example: index=prod-esbtools bundle_id=185712
- Get bundle_id from the ESB-Tools URL: enterprise-bundles.github.com/bundles/<bundle_id>
By default, all searches write chunked output to /tmp/splunk-search/<hash>/ and print a compact manifest to stdout. Use --output for single-file mode.
# Default: chunked output (manifest to stdout)
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
--query "index=prod-esbtools bundle_id=185712"
# Output: /tmp/splunk-search/<hash>/manifest.json + chunk-NNN.json + errors.json
Use --output results.json for single-file mode (scripts, one-off use).
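A minimal single-file sketch, reusing the bundle query above:
# Single-file mode: write all results to one JSON file instead of chunks
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
  --query "index=prod-esbtools bundle_id=185712" --output results.json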
Reading order:
1. Read the manifest printed to stdout first.
2. Read errors.json next if errors.count > 0 — it is pre-filtered and has the highest signal.
3. Never view chunk files — they are large and will flood context.

Chunk files can be 500+ lines each. Never cat or view them directly. Use Python one-liners to extract what you need:
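A sketch for step 2; the exact errors.json schema is an assumption here (either a list of events or an object with an events array):
# Inspect errors.json first (schema assumed: a list, or an object with 'events')
python3 -c "
import json
data = json.load(open('/tmp/splunk-search/<hash>/errors.json'))
events = data if isinstance(data, list) else data.get('events', [])
for e in events[:10]:
    print(json.dumps(e)[:200])
"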
# Find events matching a pattern across all chunks
python3 -c "
import json, glob
for f in sorted(glob.glob('/tmp/splunk-search/<hash>/chunk-*.json')):
data = json.load(open(f))
for e in data['events']:
if int(e.get('status', 0)) >= 500:
print(f\"{e['_time']} {e.get('sourcetype')} {e.get('status')} {e.get('_raw','')[:120]}\")
"
# Count events by sourcetype across chunks
python3 -c "
import json, glob
from collections import Counter
c = Counter()
for f in sorted(glob.glob('/tmp/splunk-search/<hash>/chunk-*.json')):
for e in json.load(open(f))['events']:
c[e.get('sourcetype', '?')] += 1
for st, n in c.most_common(): print(f'{n:>5} {st}')
"
# Extract events in a time window
python3 -c "
import json, glob
for f in sorted(glob.glob('/tmp/splunk-search/<hash>/chunk-*.json')):
data = json.load(open(f))
for e in data['events']:
if '2026-02-13T17:5' in e.get('_time', ''):
print(json.dumps(e, indent=2))
"
| Manifest Field | What It Answers |
|---|---|
| fields.common | What fields exist in every event |
| fields.by_sourcetype.<st>.fields | What fields are unique to a sourcetype |
| fields.by_sourcetype.<st>.count | How many events per sourcetype |
| errors.count + errors.sample | Are there errors? What do they look like? |
| chunks.files[].earliest/latest | Which chunk covers which time window |
| pagination.total_available vs total_fetched | Were results capped? |
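A sketch that reads the manifest and answers the first-pass questions, assuming the field paths in the table map directly onto JSON keys:
# Read the manifest: error count, sourcetypes, and whether results were capped
python3 -c "
import json
m = json.load(open('/tmp/splunk-search/<hash>/manifest.json'))
print('errors:', m['errors']['count'])
print('fetched/available:', m['pagination']['total_fetched'], '/', m['pagination']['total_available'])
print('sourcetypes:', list(m['fields']['by_sourcetype']))
"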
Splunk access uses Python scripts plus an environment variable ($SPLUNK_TOKEN), NOT MCP tools.
| ❌ WRONG | ✅ CORRECT |
|---|---|
| "Splunk MCP unavailable" | Check $SPLUNK_TOKEN, run scripts |
| "Skipping Splunk (no tool)" | Execute Python directly |
| "Requires manual access" | Run the command yourself |
# Step 1: Verify token exists
[ -n "$SPLUNK_TOKEN" ] && echo "✅ Token set" || echo "❌ Token missing"
# Step 2: Verify connectivity via Tailscale
tailscale ping splunkazure-api-azure-eastus.octoca.ts.net
# Step 3: Execute search (DO NOT SKIP)
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py --query "index=rails earliest=-15m | head 10"
Default endpoint: azure-eastus via Tailscale.
Common indexes:
- rails — Dotcom application logs
- prod-esbtools — GHES support bundles
- sec-prod-audit — Security audit logs

See workflows/reference.md for the full index list and endpoints.
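For instance, a quick, narrow look at recent security audit events (a sketch; no assumptions are made about field names in that index):
# Peek at recent security audit events with a narrow window
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
  --query 'index=sec-prod-audit earliest=-15m | head 20'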
| Tool | Purpose |
|---|---|
| tools/splunk-search.py | Execute SPL searches (including generating commands like \| eventcount) |
| tools/splunk-indexes.py | List accessible indexes (REST API + search-based discovery) |
Note: splunk-indexes.py uses two methods: REST API (/services/data/indexes) and search-based (| eventcount). Some indexes (e.g., orca) are only visible via search. Indexes marked (search-only) in the output are accessible for queries but not listed in the REST metadata API.
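A minimal invocation sketch, assuming the script needs no arguments beyond the environment token:
# List accessible indexes (REST API + search-based discovery)
python $HOME/.pi/agent/skills/splunk/tools/splunk-indexes.py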
# Basic search
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
--query "index=rails status>=500 earliest=-1h | stats count by route"
# GHES bundle
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
--query 'index=prod-esbtools bundle_id=185712 | stats count by sourcetype'
# Generating commands (queries starting with |) — passed through without 'search' prefix
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
--query '| eventcount summarize=false index=* | stats sum(count) by index'
# Negative time values — use = syntax to avoid argparse misinterpreting as flags
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
--query 'index=orca | head 5' --earliest=-24h
# Alternative: quoted value also works
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
--query 'index=orca | head 5' -e '-24h'
Use --earliest=-24h (= syntax) or -e '-24h' (quoted). Do NOT use -e -24h without quotes — argparse may misinterpret -24h as a flag.
Queries starting with | (e.g., | eventcount, | tstats, | rest) are generating commands — they produce data without searching an index first. The tool auto-detects these and does NOT prepend search.
See workflows/investigation-patterns.md for 8 common SPL patterns.
| Workflow | Purpose |
|---|---|
| /splunk:search | Execute ad-hoc SPL query |
| /splunk:request-trace | Trace request ID across all indexes |
| /splunk:pod-logs | Search pod/container logs |
| /splunk:ghes-bundle | Search GHES support bundle logs |
| /splunk:security-audit | Search security audit events |
GHES bundles are in index prod-esbtools. Key fields: bundle_id, host, sourcetype.
index=prod-esbtools bundle_id=<ID> | stats count by sourcetype, host
See workflows/ghes-bundle.md and workflows/investigation-patterns.md for detailed patterns.
[ -n "$SPLUNK_TOKEN" ] && echo "Set" || echo "Missing"exp claim in JWTtailscale statustailscale ping splunkazure-api-azure-eastus.octoca.ts.netearliest=-1h not last 1 hour)NEVER start with broad searches. Start narrow, expand only if needed:
| Step | Window | When to expand |
|---|---|---|
| 1 | -15m | No results or need more context |
| 2 | -1h | Still insufficient |
| 3 | -4h | Expand if pattern requires |
| 4 | -24h | Only for baseline comparison |
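For the token-expiry check in the list above, a decode sketch, assuming $SPLUNK_TOKEN is a standard three-part JWT:
# Decode the JWT payload and compare the exp claim to the current time
python3 -c "
import base64, json, os, time
payload = os.environ['SPLUNK_TOKEN'].split('.')[1]
payload += '=' * (-len(payload) % 4)  # restore base64 padding
exp = json.loads(base64.urlsafe_b64decode(payload)).get('exp', 0)
print('exp:', exp, '-', 'EXPIRED' if exp < time.time() else 'valid')
"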
# NEVER DO THIS - causes timeout, quota issues
index=* earliest=-30d error
index=rails | head 1000000
index=* status>=500 | stats count
# Specific index + short window + filters
index=rails kube_pod="actions-*" status>=500 earliest=-15m
| stats count by route, status
# Expand only when needed
index=rails service.name="actions-runner-admin" earliest=-1h
| timechart span=5m count by status
index=* is acceptable ONLY for cross-service request tracing with a SPECIFIC identifier:
# OK - specific request ID, limited window
index=* "gh.request_id"="XXXX:YYYY:ZZZZ:AAAA:TS" earliest=-1h latest=+1h
| sort _time | table _time, index, kube_pod, status
# OK - specific trace ID, limited window
index=* TraceId="abc123def456" earliest=-1h
| sort _time