debug
// Systematic debugging via logs, health checks, hypothesis-driven investigation. Triggers: debug, error, trace root cause, fix bug, reproduce symptom, investigation.
| name | debug |
| description | Systematic debugging via logs, health checks, hypothesis-driven investigation. Triggers: debug, error, trace root cause, fix bug, reproduce symptom, investigation. |
| user-invocable | true |
| effort | medium |
| argument-hint | [symptom] |
| agent | debugger |
| context | fork |
| allowed-tools | Bash, Read, Grep |
$ARGUMENTS
Systematic debugging for application issues.
docker compose logs --tail 20 2>/dev/null || tail -20 logs/*.log 2>/dev/null || echo "no-logs-found"

Pipe error output through the error parser for structured diagnosis:
# Pipe from failing command
your_command 2>&1 | python3 "$(dirname "$0")/scripts/error-parser.py"
# Or from a log file
cat /var/log/app/error.log | python3 scripts/error-parser.py
The script outputs structured JSON. Use the parsed output to focus investigation on the right files and hypotheses.
NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
Random fixes waste time and create new bugs. Quick patches mask underlying issues. Complete each phase before proceeding to the next.
Read error messages and stack traces completely. Reproduce reliably (or gather more data — don't guess). Check recent changes (git diff, new deps, config). For multi-component systems: log inputs and outputs at each layer boundary, and identify WHERE it breaks before asking WHY.
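The recent-changes sweep can be sketched as plain shell. This assumes you run it at the project root; the non-git fallback and its two-hour window are arbitrary choices, not part of any required workflow.

```shell
# Survey recent changes before forming any hypothesis.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  git log --oneline -10                           # last 10 commits
  git diff --stat HEAD~5..HEAD 2>/dev/null || true  # files touched recently (skipped if <5 commits)
  git status --short                              # uncommitted local changes
  repo_state="git"
else
  # Not a git repo: fall back to files modified in the last two hours.
  find . -maxdepth 2 -type f -mmin -120 2>/dev/null | head -20
  repo_state="no-git"
fi
echo "repo_state=$repo_state"
```

Either branch is safe to run read-only; nothing here modifies the working tree.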
Find similar working code in the same codebase. Compare against references completely, not skimming. List every difference, however small.
Form a single hypothesis ("X is the root cause because Y"). Test minimally — smallest possible change, one variable at a time. Verify before continuing — if it didn't work, form a NEW hypothesis. Don't stack fixes on top of fixes.
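The one-variable-at-a-time loop can be sketched in shell. Here `repro`, `apply_fix`, and `revert_fix` are stand-ins for your real reproduction command and the actual edit/revert steps (e.g. `git checkout -- .`), not part of this skill:

```shell
# One hypothesis at a time: apply a minimal change, re-run the repro,
# and revert completely before trying the next idea (never stack fixes).
repro()      { return 1; }   # stand-in: 1 = bug still present in this sketch
apply_fix()  { echo "applying minimal change for $1"; }
revert_fix() { echo "reverting $1"; reverts=$((reverts + 1)); }

reverts=0
for hyp in "stale-cache" "race-condition"; do
  apply_fix "$hyp"
  if repro; then
    echo "$hyp confirmed: pursue this root cause"
    break
  fi
  revert_fix "$hyp"          # full revert, so the next test changes one variable
done
echo "rejected hypotheses: $reverts"
```

The point of the explicit revert is that every iteration starts from the same baseline, so a passing repro can only be attributed to the one change just made.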
Write a failing test case FIRST (use /tdd). Implement single fix at root cause. No "while I'm here" improvements.
Ask "Why?" at least 5 times to find the real issue. Stopping at the first plausible answer is symptom fixing. Example: crash → null pointer → user object null → API 404 → invalid user ID → frontend allowed negative IDs (root cause).
If three hypotheses failed and each fix reveals new shared state in different places, the architecture is wrong, not your hypothesis. STOP. Discuss with user before more attempts.
# Application logs (auto-detect environment)
# Docker:
docker compose logs --tail 100 {service} 2>&1 | grep -i error
# Bare metal / systemd:
journalctl -u {service} --since "1 hour ago" | grep -i error
# Log files:
tail -100 logs/app.log | grep -i error
# Docker environment
docker compose ps
# Process check
ps aux | grep -E "(node|python|java|php)" | grep -v grep
# HTTP health endpoints
curl -sf http://localhost:{port}/health
# Python
python3 -c "import module; print(module.function('test'))"
# Node.js
node -e "const m = require('./module'); console.log(m.fn('test'))"
# PHP
php -r "require 'vendor/autoload.php'; echo MyClass::method('test');"
# PostgreSQL
psql -U postgres -c "SELECT version();"
# MySQL
mysql -e "SELECT VERSION();"
# Redis
redis-cli ping && redis-cli info memory
# MongoDB
mongosh --eval "db.runCommand({ping:1})"
# Check server logs for stack traces
grep -A5 "Traceback\|Error\|Exception" logs/app.log
# Resource usage
top -bn1 | head -20 # CPU/memory
iostat -x 1 3 # Disk I/O
ss -tlnp # Open connections
# Test connectivity
curl -I http://localhost:{port}
nc -zv {host} {port}
For complex bugs (open >1h, unclear root cause), spawn teammates to investigate competing hypotheses:
Create an agent team to debug this issue:
- Teammate 1 (debugger): "Investigate if [bug] is caused by [hypothesis A: database issue].
Check logs, connection pools, timeouts, query performance."
Use Opus.
- Teammate 2 (debugger): "Investigate if [bug] is caused by [hypothesis B: race condition].
Look for async issues, locking, concurrency, shared state."
Use Opus.
- Teammate 3 (debugger): "Investigate if [bug] is caused by [hypothesis C: configuration drift].
Compare env vars, config files, recent changes, dependency versions."
Use Opus.
Have them talk to each other to challenge each other's theories.
Report consensus when done.
| Excuse | Why It's Wrong |
|---|---|
| "It works on my machine" | Environment differences are the #1 cause of production bugs — reproduce in prod-like env |
| "It must be a library bug" | 95% of the time it's your code — exhaust local hypotheses first |
| "I'll just add more logging and wait" | Passive debugging wastes hours — form a hypothesis and test it actively |
| "The error message says X, so it must be X" | Error messages often describe symptoms, not root causes — trace the full chain |
| "It only happens sometimes, probably a fluke" | Intermittent bugs are race conditions or state leaks — they get worse, not better |
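The last row is testable rather than arguable: loop the reproduction and count failures instead of shrugging. `flaky_check` below is a deterministic stand-in for your real repro command (here it fails on every 4th run purely to illustrate the counting):

```shell
# Pressure-test "it only happens sometimes": run the repro in a loop and count.
flaky_check() { [ $(( $1 % 4 )) -ne 0 ]; }   # stand-in: fails on every 4th run

fails=0
for i in $(seq 1 50); do
  flaky_check "$i" || fails=$((fails + 1))
done
echo "failures: $fails/50"   # anything above zero is a bug, not a fluke
```

A nonzero count turns "probably a fluke" into a measured failure rate you can bisect against timing, load, or shared state.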
docker compose logs with no --since shows logs from the current container lifecycle plus anything buffered. After a restart you may read stale logs that look like the current error. Always filter: docker compose logs --since 5m.

tail -f stops emitting after a log rotation unless you pass -F (GNU) or --follow=name — the file descriptor points at the renamed inode. On rotated logs, always use tail -F.

curl -f exits non-zero on 4xx/5xx but discards the response body — you lose the exact error. Debug with curl -s -o /tmp/body -w 'HTTP %{http_code}\n' and then inspect /tmp/body.

Alternatives:
- /triage-issue instead
- /fix
- /performance-profiling or /analyze --type=complexity
- /workflow incident-response (coordinated response)

Follow-ups:
- /review to verify the fix quality
- /tdd to write it test-first
- /analyze --type=complexity for hotspot analysis
- /workflow incident-response for full response