Name: Dba
Author: tma

name	dba
description	Multi-cluster MySQL/Vitess database health check specialist. Scans ALL clusters via dba-investigate.py orchestrator, classifies per-cluster health against 7-day baselines, runs causal analysis, and produces structured verdicts. USE WHEN investigating database issues, MySQL/Vitess health, replication lag, freno throttling, lock contention, query performance problems, or table traffic anomalies.
metadata	{"triggers":["dba","database","mysql","vitess","replication lag","freno","deadlock","lock contention","slow queries","proxysql","cluster health"],"provides":["database-health-check","mysql-analysis","structured-verdict"],"requires":["pup-cli","DD_API_KEY","DD_APP_KEY"]}

DBA Skill

This skill wraps the DBA agent. Full instructions live in the agent definition:

Agent file: $HOME/.pi/agent/agents/dba.md

Read that file and follow its execution procedure exactly.

Quick Summary

The DBA agent uses dba-investigate.py to:

Plan — generate Datadog + Kusto query plans for all clusters
Execute — run all planned queries via pup CLI and Kusto REST API
Analyze — classify per-cluster health (5-tier: normal → critical)
Deep-dive — symptom-driven diagnostic queries on problem clusters
Causal analysis — apply 25 causal rules to identify root causes
Report — markdown or JSON output

Orchestrator

$HOME/.pi/agent/skills/datadog/tools/dba-investigate.py

All metric definitions, classification thresholds, causal rules, and I/O logic live in the orchestrator. The agent invokes it, and the orchestrator's execute and collect-kusto phases run the queries it plans.

Primary Command

Use the run subcommand — it chains all phases (plan → execute → collect-kusto → analyze → deep-dive-plan → execute deep-dive → causal analysis → report) in a single invocation:

python3 $HOME/.pi/agent/skills/datadog/tools/dba-investigate.py run \
  --start "$START" --end "$END" --id "$ID" \
  --service "$SERVICE" \
  --concurrency 6 \
  --format markdown

Subagent mode (orchestrated): Use --deep-dive-top 30 --format subagent when running as a subagent, because the default of 200 clusters is too slow for multi-agent orchestration. In subagent mode the total number of deep-dive queries is also auto-capped at 200 (see --max-deep-dive-queries).

Key flags:

--service — auto-resolves owned + dependency MySQL clusters via services-context
--clusters — comma-separated explicit cluster filter (overrides --service)
--deep-dive-top N — max clusters for deep-dive (default: 200, use 30 for subagent mode)
--max-deep-dive-queries N — cap total deep-dive queries (auto: 200 in subagent mode)
--concurrency N — parallel pup queries (default: 6, max safe: 6)
--all-tiers — include all tiers (default: tier 0+1 only)
--format — report format: markdown (default), json, subagent
--html — generate HTML report (separate flag)
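
As a rough illustration of how the flags above combine for subagent mode, the sketch below only prints the command rather than running it. The helper name dba_subagent_cmd is hypothetical and not part of the orchestrator; every flag and value comes from the list above.

```shell
# Hypothetical helper: prints (does not execute) a subagent-mode invocation.
# Usage: dba_subagent_cmd START END ID SERVICE
dba_subagent_cmd() {
  orch="$HOME/.pi/agent/skills/datadog/tools/dba-investigate.py"
  printf 'python3 "%s" run --start "%s" --end "%s" --id "%s" --service "%s" --concurrency 6 --deep-dive-top 30 --format subagent\n' \
    "$orch" "$1" "$2" "$3" "$4"
}

# Print the command for the current inputs so it can be reviewed before use.
dba_subagent_cmd "$START" "$END" "$ID" "$SERVICE"
```

Printing first makes it easy to confirm that --deep-dive-top 30 and --format subagent are both present before the run is handed to another agent.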

Individual Phase Commands (Reference)

For debugging or partial re-runs, individual subcommands are available:

# Plan (add --service "$SERVICE" to scope to a service's MySQL clusters)
python3 $HOME/.pi/agent/skills/datadog/tools/dba-investigate.py plan --start "$START" --end "$END" --id "$ID"

# Execute Datadog queries (handles all pup calls automatically)
python3 $HOME/.pi/agent/skills/datadog/tools/dba-investigate.py execute --id "$ID"

# Execute Kusto queries
python3 $HOME/.pi/agent/skills/datadog/tools/dba-investigate.py collect-kusto --id "$ID"

# Analyze (defaults to tier 0+1 clusters)
python3 $HOME/.pi/agent/skills/datadog/tools/dba-investigate.py analyze --id "$ID"
# Override tier filter:
#   --tiers 0,1,2       # specific tiers
#   --all-tiers          # include ALL tiers
#   --clusters a,b       # specific clusters (skips tier filter)

# Deep-dive plan
python3 $HOME/.pi/agent/skills/datadog/tools/dba-investigate.py deep-dive-plan --id "$ID"

# Execute deep-dive Datadog queries
python3 $HOME/.pi/agent/skills/datadog/tools/dba-investigate.py execute --id "$ID" --plan deep-dive

# Execute deep-dive Kusto queries
python3 $HOME/.pi/agent/skills/datadog/tools/dba-investigate.py collect-kusto --id "$ID" --deep-dive

# Verify completion (MANDATORY)
python3 $HOME/.pi/agent/skills/datadog/tools/dba-investigate.py deep-dive-status --id "$ID"

# Causal analysis
python3 $HOME/.pi/agent/skills/datadog/tools/dba-investigate.py analyze-deep-dive --id "$ID"

# Report
python3 $HOME/.pi/agent/skills/datadog/tools/dba-investigate.py report --id "$ID" --format markdown
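
For a partial re-run, the later phases can be chained so that a failure stops the sequence. The sketch below is one possible shape, not the agent's prescribed procedure: the wrapper name dba_phase, the DRY_RUN switch, and the example-run id are hypothetical, and it assumes each subcommand exits non-zero on failure, which this document does not state. The phase order follows the run subcommand's chain.

```shell
# Hypothetical wrapper: echo each orchestrator call; execute it only when DRY_RUN=0.
dba_phase() {
  orch="$HOME/.pi/agent/skills/datadog/tools/dba-investigate.py"
  echo "dba-investigate.py $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    python3 "$orch" "$@"
  fi
}

# Placeholder id, for illustration only.
ID="${ID:-example-run}"

# Re-run everything after the initial execute/collect-kusto phases.
# The && chain stops at the first phase that reports failure.
dba_phase analyze --id "$ID" &&
dba_phase deep-dive-plan --id "$ID" &&
dba_phase execute --id "$ID" --plan deep-dive &&
dba_phase collect-kusto --id "$ID" --deep-dive &&
dba_phase deep-dive-status --id "$ID" &&
dba_phase analyze-deep-dive --id "$ID" &&
dba_phase report --id "$ID" --format markdown
```

With DRY_RUN unset, the block only lists the seven calls in order, which is a cheap way to check the sequence before setting DRY_RUN=0.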

Rules

Never ask for inputs — compute defaults and proceed immediately
Use run for full investigations — ALWAYS prefer dba-investigate.py run which chains all phases. Use individual subcommands only for partial re-runs or debugging. NEVER write custom scripts, shell loops, or inline code to run pup queries.
Read $HOME/.pi/agent/agents/dba.md for the complete execution procedure
Use the report skill for publishing. See $HOME/.pi/agent/skills/report/SKILL.md, if available, for the Pages, Slack, and Discussion targets.
For plan posting, use src.lib.plan_engine for live Slack/GitHub progress tracking. Phase definitions live in phase-steps.json.
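
The first rule above (compute defaults rather than ask) could be sketched as follows. Everything specific here is an assumption: the one-hour lookback, the ISO-8601 UTC timestamp format, and the "dba-" id scheme are illustrative only, and the actual defaults are defined in dba.md and the orchestrator.

```shell
# Assumption: a 1-hour lookback window. Not confirmed by this document.
LOOKBACK_SECONDS=3600

# Format an epoch timestamp as ISO-8601 UTC (tries GNU date, then BSD date).
# Assumption: the orchestrator accepts this timestamp format.
iso_utc() {
  date -u -d "@$1" +%Y-%m-%dT%H:%M:%SZ 2>/dev/null ||
    date -u -r "$1" +%Y-%m-%dT%H:%M:%SZ
}

now=$(date -u +%s)
END=$(iso_utc "$now")
START=$(iso_utc "$((now - LOOKBACK_SECONDS))")

# Hypothetical id scheme: "dba-" plus a compact UTC timestamp.
ID="dba-$(date -u +%Y%m%dT%H%M%SZ)"

echo "START=$START END=$END ID=$ID"
```

Deriving all three values from the clock lets the agent proceed immediately, and the printed line records which window was actually investigated.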

Plan Posting (Slack)

Post live progress as a Slack plan block:

echo '{"phase": 3, "content": "Classifying per-cluster health..."}' | \
  python3 $HOME/.pi/agent/skills/report/tools/post-plan.py \
    --slack "$PERMALINK" --state-file /tmp/dba-state.json
