splunk
// Query Splunk for GHES support bundle logs, Rails production logs, and security events. USE WHEN investigating GHES issues, searching Rails logs, or analyzing security audit trails.
| name | splunk |
| description | Query Splunk for GHES support bundle logs, Rails production logs, and security events. USE WHEN investigating GHES issues, searching Rails logs, or analyzing security audit trails. |
| metadata | {"triggers":["splunk","ghes","support bundle","support-bundle","bundle_id","enterprise-bundles","esbtools","esb-tools","use splunk","rails logs","enterprise"],"provides":["log-search","bundle-analysis","request-trace"],"requires":["splunk-tools"],"auto_load":false} |
Query Splunk for GHES bundles, Rails logs, pod logs, and security events via Tailscale.
When you have a gh.request_id, trace it across ALL indexes:
index=* "gh.request_id"="XXXX:YYYY:ZZZZ:AAAA:TIMESTAMP" earliest=-1h
| sort _time
| table _time, index, kube_pod, code.namespace, code.function, Body, status
Why? A single request may touch multiple services (rails, internal-api, babeld, etc.) indexed separately.
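To run that trace through the search tool rather than pasting SPL into the Splunk UI, a minimal sketch (the request ID is a placeholder):
# Trace a request ID across all indexes via the search tool (placeholder ID)
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
  --query 'index=* "gh.request_id"="XXXX:YYYY:ZZZZ:AAAA:TIMESTAMP" earliest=-1h | sort _time | table _time, index, kube_pod, code.namespace, code.function, Body, status'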
Splunk indexes Kubernetes pod logs with rich metadata:
index=rails kube_pod="unicorn-*" earliest=-15m
| stats count by kube_pod, kube_cluster, status
Key fields for pod investigation:
- kube_pod — Pod name
- kube_container — Container name
- kube_cluster — Cluster (e.g., dotcom-3-ash1-iad)
- kube_namespace — Namespace (e.g., github-production)
- kube_host — Node hostname
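These fields can be combined to keep a pod search narrow before widening it; a sketch, assuming unicorn pods in the github-production namespace:
# Narrow a pod-log search using the metadata fields above (example values)
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
  --query 'index=rails kube_namespace="github-production" kube_pod="unicorn-*" status>=500 earliest=-15m | stats count by kube_pod, kube_container, status'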
GHES support bundles are ingested via ESB-Tools into Splunk:
- Index: prod-esbtools
- Key fields: bundle_id, splunk_ingest_id
- Example: index=prod-esbtools bundle_id=185712
- Get bundle_id from the ESB-Tools URL: enterprise-bundles.github.com/bundles/<bundle_id>
By default, all searches write chunked output to /tmp/splunk-search/<hash>/ and print a compact manifest to stdout. Use --output for single-file mode.
# Default: chunked output (manifest to stdout)
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
--query "index=prod-esbtools bundle_id=185712"
# Output: /tmp/splunk-search/<hash>/manifest.json + chunk-NNN.json + errors.json
Use --output results.json for single-file mode (scripts, one-off use).
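A minimal single-file sketch, reusing the bundle query above:
# Single-file mode: write all results to one JSON file instead of chunks
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
  --query "index=prod-esbtools bundle_id=185712" --output results.json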
Reading order:
1. Read the manifest printed to stdout first.
2. Read errors.json next if errors.count > 0 — it is pre-filtered and has the highest signal.
3. Never view chunk files — they are large and will flood context.

Chunk files can be 500+ lines each. Never cat or view them directly. Use Python one-liners to extract what you need:
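A sketch for step 2; the exact errors.json schema is an assumption here (either a list of events or an object with an events array):
# Inspect errors.json first (schema assumed: a list, or an object with 'events')
python3 -c "
import json
data = json.load(open('/tmp/splunk-search/<hash>/errors.json'))
events = data if isinstance(data, list) else data.get('events', [])
for e in events[:10]:
    print(json.dumps(e)[:200])
"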
# Find events matching a pattern across all chunks
python3 -c "
import json, glob
for f in sorted(glob.glob('/tmp/splunk-search/<hash>/chunk-*.json')):
data = json.load(open(f))
for e in data['events']:
if int(e.get('status', 0)) >= 500:
print(f\"{e['_time']} {e.get('sourcetype')} {e.get('status')} {e.get('_raw','')[:120]}\")
"
# Count events by sourcetype across chunks
python3 -c "
import json, glob
from collections import Counter
c = Counter()
for f in sorted(glob.glob('/tmp/splunk-search/<hash>/chunk-*.json')):
for e in json.load(open(f))['events']:
c[e.get('sourcetype', '?')] += 1
for st, n in c.most_common(): print(f'{n:>5} {st}')
"
# Extract events in a time window
python3 -c "
import json, glob
for f in sorted(glob.glob('/tmp/splunk-search/<hash>/chunk-*.json')):
data = json.load(open(f))
for e in data['events']:
if '2026-02-13T17:5' in e.get('_time', ''):
print(json.dumps(e, indent=2))
"
| Manifest Field | What It Answers |
|---|---|
| fields.common | What fields exist in every event |
| fields.by_sourcetype.<st>.fields | What fields are unique to a sourcetype |
| fields.by_sourcetype.<st>.count | How many events per sourcetype |
| errors.count + errors.sample | Are there errors? What do they look like? |
| chunks.files[].earliest/latest | Which chunk covers which time window |
| pagination.total_available vs total_fetched | Were results capped? |
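A sketch that reads the manifest and answers the first-pass questions, assuming the field paths in the table map directly onto JSON keys:
# Read the manifest: error count, sourcetypes, and whether results were capped
python3 -c "
import json
m = json.load(open('/tmp/splunk-search/<hash>/manifest.json'))
print('errors:', m['errors']['count'])
print('fetched/available:', m['pagination']['total_fetched'], '/', m['pagination']['total_available'])
print('sourcetypes:', list(m['fields']['by_sourcetype']))
"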
Splunk access uses Python scripts plus an environment variable ($SPLUNK_TOKEN), NOT MCP tools.
| ❌ WRONG | ✅ CORRECT |
|---|---|
| "Splunk MCP unavailable" | Check $SPLUNK_TOKEN, run scripts |
| "Skipping Splunk (no tool)" | Execute Python directly |
| "Requires manual access" | Run the command yourself |
# Step 1: Verify token exists
[ -n "$SPLUNK_TOKEN" ] && echo "✅ Token set" || echo "❌ Token missing"
# Step 2: Verify connectivity via Tailscale
tailscale ping splunkazure-api-azure-eastus.octoca.ts.net
# Step 3: Execute search (DO NOT SKIP)
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py --query "index=rails earliest=-15m | head 10"
Default endpoint: azure-eastus via Tailscale.
Common indexes:
- rails — Dotcom application logs
- prod-esbtools — GHES support bundles
- sec-prod-audit — Security audit logs

See workflows/reference.md for the full index list and endpoints.
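For instance, a quick, narrow look at recent security audit events (a sketch; no assumptions are made about field names in that index):
# Peek at recent security audit events with a narrow window
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
  --query 'index=sec-prod-audit earliest=-15m | head 20'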
| Tool | Purpose |
|---|---|
| tools/splunk-search.py | Execute SPL searches (including generating commands like \| eventcount) |
| tools/splunk-indexes.py | List accessible indexes (REST API + search-based discovery) |
Note: splunk-indexes.py uses two methods: REST API (/services/data/indexes) and search-based (| eventcount). Some indexes (e.g., orca) are only visible via search. Indexes marked (search-only) in the output are accessible for queries but not listed in the REST metadata API.
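A minimal invocation sketch, assuming the script needs no arguments beyond the environment token:
# List accessible indexes (REST API + search-based discovery)
python $HOME/.pi/agent/skills/splunk/tools/splunk-indexes.py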
# Basic search
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
--query "index=rails status>=500 earliest=-1h | stats count by route"
# GHES bundle
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
--query 'index=prod-esbtools bundle_id=185712 | stats count by sourcetype'
# Generating commands (queries starting with |) — passed through without 'search' prefix
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
--query '| eventcount summarize=false index=* | stats sum(count) by index'
# Negative time values — use = syntax to avoid argparse misinterpreting as flags
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
--query 'index=orca | head 5' --earliest=-24h
# Alternative: quoted value also works
python $HOME/.pi/agent/skills/splunk/tools/splunk-search.py \
--query 'index=orca | head 5' -e '-24h'
Use --earliest=-24h (= syntax) or -e '-24h' (quoted). Do NOT use -e -24h without quotes — argparse may misinterpret -24h as a flag.
Queries starting with | (e.g., | eventcount, | tstats, | rest) are generating commands — they produce data without searching an index first. The tool auto-detects these and does NOT prepend search.
See workflows/investigation-patterns.md for 8 common SPL patterns.
| Workflow | Purpose |
|---|---|
| /splunk:search | Execute ad-hoc SPL query |
| /splunk:request-trace | Trace request ID across all indexes |
| /splunk:pod-logs | Search pod/container logs |
| /splunk:ghes-bundle | Search GHES support bundle logs |
| /splunk:security-audit | Search security audit events |
GHES bundles are in index prod-esbtools. Key fields: bundle_id, host, sourcetype.
index=prod-esbtools bundle_id=<ID> | stats count by sourcetype, host
See workflows/ghes-bundle.md and workflows/investigation-patterns.md for detailed patterns.
[ -n "$SPLUNK_TOKEN" ] && echo "Set" || echo "Missing"exp claim in JWTtailscale statustailscale ping splunkazure-api-azure-eastus.octoca.ts.netearliest=-1h not last 1 hour)NEVER start with broad searches. Start narrow, expand only if needed:
| Step | Window | When to expand |
|---|---|---|
| 1 | -15m | No results or need more context |
| 2 | -1h | Still insufficient |
| 3 | -4h | Expand if pattern requires |
| 4 | -24h | Only for baseline comparison |
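For the token-expiry check in the list above, a decode sketch, assuming $SPLUNK_TOKEN is a standard three-part JWT:
# Decode the JWT payload and compare the exp claim to the current time
python3 -c "
import base64, json, os, time
payload = os.environ['SPLUNK_TOKEN'].split('.')[1]
payload += '=' * (-len(payload) % 4)  # restore base64 padding
exp = json.loads(base64.urlsafe_b64decode(payload)).get('exp', 0)
print('exp:', exp, '-', 'EXPIRED' if exp < time.time() else 'valid')
"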
# NEVER DO THIS - causes timeout, quota issues
index=* earliest=-30d error
index=rails | head 1000000
index=* status>=500 | stats count
# Specific index + short window + filters
index=rails kube_pod="actions-*" status>=500 earliest=-15m
| stats count by route, status
# Expand only when needed
index=rails service.name="actions-runner-admin" earliest=-1h
| timechart span=5m count by status
index=* is acceptable ONLY for cross-service request tracing with a SPECIFIC identifier:
# OK - specific request ID, limited window
index=* "gh.request_id"="XXXX:YYYY:ZZZZ:AAAA:TS" earliest=-1h latest=+1h
| sort _time | table _time, index, kube_pod, status
# OK - specific trace ID, limited window
index=* TraceId="abc123def456" earliest=-1h
| sort _time