chaos-testing
// Controlled failure injection: hypothesis design, blast radius control, safety mechanisms, game day planning, and resilience verification.
| name | chaos-testing |
| description | Controlled failure injection: hypothesis design, blast radius control, safety mechanisms, game day planning, and resilience verification. |
Controlled failure injection to build confidence in system resilience.
Identify measurable indicators of normal system behavior (the steady state) before injecting any failure.
"When [failure condition], the system will [expected behavior] because [mechanism]."
Example: "When the database primary fails, the system will fail over to a replica within 30s because automatic failover is configured."
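The hypothesis template and the idea of measurable steady-state indicators can be captured as small data structures. The following is a minimal sketch, not a prescribed implementation: the class names `Hypothesis` and `SteadyStateIndicator`, the function `steady_state_violations`, and the idea of expressing each indicator as a numeric range are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """One experiment hypothesis, following the three-slot template (hypothetical type)."""
    failure_condition: str
    expected_behavior: str
    mechanism: str

    def statement(self) -> str:
        # Renders: "When [failure condition], the system will
        # [expected behavior] because [mechanism]."
        return (
            f"When {self.failure_condition}, the system will "
            f"{self.expected_behavior} because {self.mechanism}."
        )


@dataclass(frozen=True)
class SteadyStateIndicator:
    """A measurable indicator with an acceptable range (assumed representation)."""
    name: str
    lower: float
    upper: float

    def holds(self, observed: float) -> bool:
        return self.lower <= observed <= self.upper


def steady_state_violations(indicators, observations):
    """Return the names of indicators that are missing or out of range."""
    violations = []
    for indicator in indicators:
        value = observations.get(indicator.name)
        # A missing measurement is treated as a violation: the steady
        # state cannot be confirmed without it.
        if value is None or not indicator.holds(value):
            violations.append(indicator.name)
    return violations
```

Checking the indicators once before injection and again afterwards gives a simple pass/fail signal for the hypothesis. Treating a missing measurement as a violation is a deliberately conservative choice in this sketch.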
| Element | Description |
|---|---|
| Target | Which component to perturb |
| Failure mode | What kind of failure (latency, crash, partition) |
| Blast radius | Scope of impact (single instance, AZ, region) |
| Duration | How long the failure persists |
| Abort criteria | When to immediately stop the experiment |
| Rollback plan | How to restore normal operation |
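The elements in this table map naturally onto an experiment record plus a small runner. The sketch below is illustrative only: `Experiment`, `run_experiment`, the callable fields, and the polling approach are hypothetical names and design choices, not part of any particular chaos tool.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Experiment:
    """One experiment, with a field per design element (hypothetical type)."""
    target: str                    # which component to perturb
    failure_mode: str              # e.g. "latency", "crash", "partition"
    blast_radius: str              # e.g. "single instance", "AZ", "region"
    duration_s: float              # how long the failure persists
    abort_if: Callable[[], bool]   # abort criteria, polled during the run
    inject: Callable[[], None]     # starts the failure
    rollback: Callable[[], None]   # restores normal operation


def run_experiment(exp: Experiment, poll_interval_s: float = 1.0,
                   sleep: Callable[[float], None] = time.sleep) -> str:
    """Inject the failure, poll the abort criteria, and always roll back."""
    try:
        exp.inject()
        elapsed = 0.0
        while elapsed < exp.duration_s:
            if exp.abort_if():
                return "aborted"
            sleep(poll_interval_s)
            elapsed += poll_interval_s
        return "completed"
    finally:
        # Runs on completion, on abort, and if injection or polling
        # raises, so the rollback plan is never skipped.
        exp.rollback()
```

Placing the rollback in a `finally` block is the key design choice here: restoration does not depend on the experiment ending cleanly. The `sleep` parameter is injectable so the runner can be exercised in tests without real waiting.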
| Type | Examples |
|---|---|
| Process | Kill process, OOM, CPU spike |
| Network | Latency injection, packet loss, partition |
| Infrastructure | Instance termination, AZ failure, disk full |
| Application | Exception injection, slow dependency, config error |
| Data | Corrupt cache, stale data, schema mismatch |
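As one concrete instance of the application-level row (exception injection and a slow dependency), a wrapper around a dependency call can add latency and raise errors at a configured rate. This is a minimal sketch under stated assumptions: `inject_faults`, `InjectedFault`, and their parameters are hypothetical names, and real experiments would normally gate this behind a flag scoped to the chosen blast radius.

```python
import random
import time
from functools import wraps


class InjectedFault(Exception):
    """Raised in place of the real call when a fault is injected (hypothetical)."""


def inject_faults(latency_s=0.0, error_rate=0.0, rng=None, sleep=time.sleep):
    """Decorator that delays each call and fails a fraction of them.

    latency_s:  extra delay added before every call (slow dependency).
    error_rate: probability in [0, 1] of raising InjectedFault.
    """
    if rng is None:
        rng = random.Random()

    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            if latency_s > 0:
                sleep(latency_s)
            # rng.random() is in [0.0, 1.0), so 0.0 never fails
            # and 1.0 always fails.
            if rng.random() < error_rate:
                raise InjectedFault(f"injected failure in {fn.__name__}")
            return fn(*args, **kwargs)
        return wrapper

    return decorator
```

Wrapping a client call with, for example, `@inject_faults(latency_s=0.2, error_rate=0.1)` would slow every call by 200 ms and fail roughly one in ten. Passing a seeded `random.Random` as `rng` makes a run reproducible.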