| name | subagent-testing |
| description | Test skills via RED/GREEN/REFACTOR TDD in fresh subagents. Use when validating skill behavior or preventing priming bias. |
| alwaysApply | false |
| category | testing |
| tags | ["testing","validation","TDD","subagents","fresh-instances"] |
| token_budget | 30 |
| progressive_loading | true |
| modules | ["modules/testing-patterns.md"] |
| model_hint | standard |
Subagent Testing - TDD for Skills
Test skills with fresh subagent instances to prevent priming bias and validate effectiveness.
Table of Contents
- Overview
- Why Fresh Instances Matter
- Testing Methodology
- Quick Start
- Detailed Testing Guide
- Success Criteria
Overview
Fresh instances prevent priming: Each test uses a new Claude conversation to verify
the skill's impact is measured, not conversation history effects.
Why Fresh Instances Matter
The Priming Problem
Running tests in the same conversation creates bias:
- Prior context influences responses
- Skill effects get mixed with conversation history
- Can't isolate skill's true impact
Fresh Instance Benefits
- Isolation: Each test starts clean
- Reproducibility: Consistent baseline state
- Measurement: Clear before/after comparison
- Validation: Proves skill effectiveness, not priming
Testing Methodology
Three-phase TDD-style approach:
Phase 1: Baseline Testing (RED)
Test without skill to establish baseline behavior.
Phase 2: With-Skill Testing (GREEN)
Test with skill loaded to measure improvements.
Phase 3: Rationalization Testing (REFACTOR)
Test skill's anti-rationalization guardrails.
Quick Start
Detailed Testing Guide
For complete testing patterns, examples, and templates:
Success Criteria
- Baseline: Document 5+ diverse baseline scenarios
- Improvement: ≥50% improvement in skill-related metrics
- Consistency: Results reproducible across fresh instances
- Rationalization Defense: Guardrails prevent ≥80% of rationalization attempts
See Also
- skill-authoring: Creating effective skills
- bulletproof-skill: Anti-rationalization patterns
- test-skill: Automated skill testing command