원클릭으로
subagent-testing
Test skills via TDD in fresh subagents. Use when validating behavior or preventing bias.
Codex 또는 Claude로 설치 이 Prompt를 복사해 Codex, Claude 또는 다른 어시스턴트에 붙여 넣으면 Skill 페이지를 검토하고 설치를 진행할 수 있습니다.
메뉴
Test skills via TDD in fresh subagents. Use when validating behavior or preventing bias.
Codex 또는 Claude로 설치 이 Prompt를 복사해 Codex, Claude 또는 다른 어시스턴트에 붙여 넣으면 Skill 페이지를 검토하고 설치를 진행할 수 있습니다.
SOC 직업 분류 기준
Detects AI-generated writing patterns in prose. Use when reviewing docs for slop, vague language, or identity leaks before publishing.
Audits Rust code for unsafe blocks, ownership issues, and Cargo dependency risks. Use when reviewing Rust code or before merging Rust changes.
Recommends context compression strategies for bloated or quota-heavy sessions. Use when context feels sluggish or quota burns faster than expected.
Guide minimal code via a decision ladder with full safety, edge, and negative-case coverage. Use when adding code, choosing a dependency, or auditing a diff.
Optimizes context window via MECW principles and memory tiering. Use when context exceeds 30% or before long multi-step tasks.
Generates or remediates documentation with human-quality writing. Use when creating new docs, rewriting AI-generated content, or applying style profiles.
| name | subagent-testing |
| description | Test skills via TDD in fresh subagents. Use when validating behavior or preventing bias. |
| alwaysApply | false |
| category | testing |
| tags | ["testing","validation","TDD","subagents","fresh-instances"] |
| token_budget | 30 |
| progressive_loading | true |
| modules | ["modules/testing-patterns.md"] |
| model_hint | standard |
Test skills with fresh subagent instances to prevent priming bias and validate effectiveness.
Fresh instances prevent priming: Each test uses a new Claude conversation to verify the skill's impact is measured, not conversation history effects.
Running tests in the same conversation creates bias:
Three-phase TDD-style approach:
Test without skill to establish baseline behavior.
Test with skill loaded to measure improvements.
Test skill's anti-rationalization guardrails.
# 1. Create baseline tests (without skill)
# Use 5 diverse scenarios
# Document full responses
# 2. Create with-skill tests (fresh instances)
# Load skill explicitly
# Use identical prompts
# Compare to baseline
# 3. Create rationalization tests
# Test anti-rationalization patterns
# Verify guardrails work
For complete testing patterns, examples, and templates: