incident-response
// Structured incident workflow: severity classification, triage, diagnosis, mitigation, postmortem, and prevention. Template-driven with blameless review.
| name | incident-response |
| description | Structured incident workflow: severity classification, triage, diagnosis, mitigation, postmortem, and prevention. Template-driven with blameless review. |
Structured framework for handling production incidents with blameless postmortems.
| Severity | Impact | Response Time | Examples |
|---|---|---|---|
| P0 | Complete outage, data loss risk | Immediate (< 15 min) | Service down, data corruption |
| P1 | Major degradation, many users affected | < 30 min | Core feature broken, severe performance degradation |
| P2 | Partial degradation, some users affected | < 2 hours | Non-critical feature broken, slow queries |
| P3 | Minor issue, workaround available | < 1 business day | UI glitch, minor performance regression |
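The severity table above can be encoded as data so that tooling can check response targets. The following is a minimal sketch, not part of the skill itself: the names `Severity`, `RESPONSE_TARGETS`, `response_deadline`, and `is_breached` are hypothetical, and "1 business day" is approximated as 24 hours because the skill does not define a business calendar.

```python
from datetime import datetime, timedelta
from enum import Enum


class Severity(Enum):
    """Severity levels from the table above (hypothetical encoding)."""
    P0 = 0
    P1 = 1
    P2 = 2
    P3 = 3


# Response-time targets taken from the severity table.
# Assumption: "< 1 business day" is simplified to 24 hours.
RESPONSE_TARGETS = {
    Severity.P0: timedelta(minutes=15),
    Severity.P1: timedelta(minutes=30),
    Severity.P2: timedelta(hours=2),
    Severity.P3: timedelta(days=1),
}


def response_deadline(severity: Severity, detected_at: datetime) -> datetime:
    """Return the time by which a response is expected."""
    return detected_at + RESPONSE_TARGETS[severity]


def is_breached(severity: Severity, detected_at: datetime, now: datetime) -> bool:
    """True once the elapsed time reaches or exceeds the target."""
    return now >= response_deadline(severity, detected_at)
```

For example, `is_breached(Severity.P1, detected_at, now)` returns `True` once 30 minutes have passed since detection without a response being recorded.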
(See the debugging-protocol skill.)
# Incident Postmortem: {title}
Date: {date}
Severity: P{0-3}
Duration: {start} → {resolved}
Author: {name}
## Summary
{1-2 sentence impact summary}
## Timeline
| Time | Event |
|---|---|
| HH:MM | {event description} |
## Root Cause
{description with evidence}
## Contributing Factors
- {factor with context}
## What Went Well
- {positive observation}
## What Could Be Improved
- {improvement area}
## Action Items
| Action | Owner | Due Date | Status |
|---|---|---|---|
| {specific action} | @{person} | YYYY-MM-DD | Open |
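Because the postmortem is template-driven, it can be filled in from structured data. The sketch below is illustrative only: the `Postmortem` and `ActionItem` classes are hypothetical, the `Date` field is assumed to be the incident start date, and only the header, Summary, Timeline, and Action Items sections are rendered to keep the example short.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ActionItem:
    action: str
    owner: str
    due: str  # YYYY-MM-DD
    status: str = "Open"


@dataclass
class Postmortem:
    title: str
    severity: int  # 0-3, rendered as P0-P3
    start: datetime
    resolved: datetime
    author: str
    summary: str
    timeline: list = field(default_factory=list)      # (HH:MM, event) pairs
    action_items: list = field(default_factory=list)  # ActionItem objects

    def render(self) -> str:
        """Render a subset of the postmortem template as markdown."""
        fmt = "%Y-%m-%d %H:%M"
        lines = [
            f"# Incident Postmortem: {self.title}",
            f"Date: {self.start:%Y-%m-%d}",  # assumption: date = start date
            f"Severity: P{self.severity}",
            f"Duration: {self.start.strftime(fmt)} → {self.resolved.strftime(fmt)}",
            f"Author: {self.author}",
            "## Summary",
            self.summary,
            "## Timeline",
            "| Time | Event |",
            "|---|---|",
        ]
        lines += [f"| {time} | {event} |" for time, event in self.timeline]
        lines += [
            "## Action Items",
            "| Action | Owner | Due Date | Status |",
            "|---|---|---|---|",
        ]
        lines += [
            f"| {a.action} | @{a.owner} | {a.due} | {a.status} |"
            for a in self.action_items
        ]
        return "\n".join(lines)
```

Keeping action items as typed records rather than free text makes it straightforward to track owners and due dates after the review.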
Proactive failure analysis: imagine the feature has already failed, then work backward to identify why.
The feature has launched and failed catastrophically. Work backward:
For each component in the DESIGN output, enumerate:
Each specialist brings a unique failure lens:
| Agent | Risk Domain | Example Findings |
|---|---|---|
| incident-responder | Operational failures, cascade risks, recovery gaps | "No rollback path for schema migration" |
| security-engineer | Threat model, attack vectors, auth bypass | "JWT validation missing on webhook endpoint" |
| performance-engineer | Scalability limits, resource exhaustion | "Unbounded query on user list with no pagination" |
| database-expert | Data integrity, migration reversibility, concurrency | "Non-idempotent migration with no down script" |
Include agents based on feature risk profile:
# Pre-Mortem: {feature name}
Date: {date}
Analysts: {agent list}
## Risk Assessment Summary
| Risk | Severity | Likelihood | Mitigation |
|------|----------|-----------|------------|
| {failure scenario} | Critical/High/Medium/Low | High/Medium/Low | {recommendation} |
## Failure Modes
### {failure scenario title}
- **Trigger**: {what causes this failure}
- **Blast Radius**: {what else breaks}
- **Detection**: {how would we know — monitoring, alerts, logs}
- **Recovery**: {rollback path, data recovery, feature flags}
- **Mitigation**: {what BUILD agents should do to prevent this}
## Monitoring Gaps
- {gap}: {recommendation for devops-engineer}
## Design Revision Recommendations
- {if any DESIGN changes are needed before BUILD proceeds}
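The Risk Assessment Summary rows can be ordered so the most pressing failure scenarios appear first. The sketch below is one possible approach, not something the skill prescribes: the `Risk` class, the numeric ranks, and the severity-times-likelihood score are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed numeric ranks for the labels used in the summary table.
SEVERITY_RANK = {"Critical": 4, "High": 3, "Medium": 2, "Low": 1}
LIKELIHOOD_RANK = {"High": 3, "Medium": 2, "Low": 1}


@dataclass
class Risk:
    scenario: str
    severity: str    # Critical / High / Medium / Low
    likelihood: str  # High / Medium / Low
    mitigation: str


def risk_score(risk: Risk) -> int:
    """Assumed scoring rule: severity rank multiplied by likelihood rank."""
    return SEVERITY_RANK[risk.severity] * LIKELIHOOD_RANK[risk.likelihood]


def rank_risks(risks: list) -> list:
    """Return risks sorted from highest to lowest score (stable for ties)."""
    return sorted(risks, key=risk_score, reverse=True)


def summary_table(risks: list) -> str:
    """Render ranked risks as the Risk Assessment Summary table."""
    rows = [
        "| Risk | Severity | Likelihood | Mitigation |",
        "|------|----------|-----------|------------|",
    ]
    rows += [
        f"| {r.scenario} | {r.severity} | {r.likelihood} | {r.mitigation} |"
        for r in rank_risks(risks)
    ]
    return "\n".join(rows)
```

A multiplicative score is a common convention for risk matrices, but teams may prefer to sort by severity alone or to use a custom matrix.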