一键在 Manus 中运行任何 Skill

开始使用

subagent-testing

星标314

分支27

更新时间2026年6月6日 23:30

Test skills via TDD in fresh subagents. Use when validating behavior or preventing bias.

安装

用 Codex 或 Claude 帮你安装复制这段 Prompt，粘贴到 Codex、Claude 或其他助手里，让它检查 Skill 页面并帮你完成安装。

在 Manus 中运行

来源

athola

athola/claude-night-market

打开 GitHub 仓库查看创作者相关仓库

下载

在 Manus 中运行

Subagent Testing - TDD for Skills

Test skills with fresh subagent instances to prevent priming bias and validate effectiveness.

Overview
Why Fresh Instances Matter
Testing Methodology
Quick Start
Detailed Testing Guide
Success Criteria

Overview

Fresh instances prevent priming: Each test uses a new Claude conversation to verify the skill's impact is measured, not conversation history effects.

Why Fresh Instances Matter

The Priming Problem

Running tests in the same conversation creates bias:

Prior context influences responses
Skill effects get mixed with conversation history
Can't isolate skill's true impact

Fresh Instance Benefits

Isolation: Each test starts clean
Reproducibility: Consistent baseline state
Measurement: Clear before/after comparison
Validation: Proves skill effectiveness, not priming

Testing Methodology

Three-phase TDD-style approach:

Phase 1: Baseline Testing (RED)

Test without skill to establish baseline behavior.

Phase 2: With-Skill Testing (GREEN)

Test with skill loaded to measure improvements.

Phase 3: Rationalization Testing (REFACTOR)

Test skill's anti-rationalization guardrails.

Quick Start

# 1. Create baseline tests (without skill)
# Use 5 diverse scenarios
# Document full responses

# 2. Create with-skill tests (fresh instances)
# Load skill explicitly
# Use identical prompts
# Compare to baseline

# 3. Create rationalization tests
# Test anti-rationalization patterns
# Verify guardrails work

Detailed Testing Guide

For complete testing patterns, examples, and templates:

Testing Patterns - Full TDD methodology
Test Examples - Baseline, with-skill, rationalization tests
Analysis Templates - Scoring and comparison frameworks

Success Criteria

Baseline: Document 5+ diverse baseline scenarios
Improvement: ≥50% improvement in skill-related metrics
Consistency: Results reproducible across fresh instances
Rationalization Defense: Guardrails prevent ≥80% of rationalization attempts

name	subagent-testing
description	Test skills via TDD in fresh subagents. Use when validating behavior or preventing bias.
alwaysApply	false
category	testing
tags	["testing","validation","TDD","subagents","fresh-instances"]
token_budget	30
progressive_loading	true
modules	["modules/testing-patterns.md"]
model_hint	standard

Subagent Testing - TDD for Skills

Test skills with fresh subagent instances to prevent priming bias and validate effectiveness.

Overview
Why Fresh Instances Matter
Testing Methodology
Quick Start
Detailed Testing Guide
Success Criteria

Overview

Fresh instances prevent priming: Each test uses a new Claude conversation to verify the skill's impact is measured, not conversation history effects.

Why Fresh Instances Matter

The Priming Problem

Running tests in the same conversation creates bias:

Prior context influences responses
Skill effects get mixed with conversation history
Can't isolate skill's true impact

Fresh Instance Benefits

Isolation: Each test starts clean
Reproducibility: Consistent baseline state
Measurement: Clear before/after comparison
Validation: Proves skill effectiveness, not priming

Testing Methodology

Three-phase TDD-style approach:

Phase 1: Baseline Testing (RED)

Test without skill to establish baseline behavior.

Phase 2: With-Skill Testing (GREEN)

Test with skill loaded to measure improvements.

Phase 3: Rationalization Testing (REFACTOR)

Test skill's anti-rationalization guardrails.

Quick Start

# 1. Create baseline tests (without skill)
# Use 5 diverse scenarios
# Document full responses

# 2. Create with-skill tests (fresh instances)
# Load skill explicitly
# Use identical prompts
# Compare to baseline

# 3. Create rationalization tests
# Test anti-rationalization patterns
# Verify guardrails work

Detailed Testing Guide

For complete testing patterns, examples, and templates:

Testing Patterns - Full TDD methodology
Test Examples - Baseline, with-skill, rationalization tests
Analysis Templates - Scoring and comparison frameworks

Success Criteria

Baseline: Document 5+ diverse baseline scenarios
Improvement: ≥50% improvement in skill-related metrics
Consistency: Results reproducible across fresh instances
Rationalization Defense: Guardrails prevent ≥80% of rationalization attempts

subagent-testing

Subagent Testing - TDD for Skills

Table of Contents

Overview

Why Fresh Instances Matter

The Priming Problem

Fresh Instance Benefits

Testing Methodology

Phase 1: Baseline Testing (RED)

Phase 2: With-Skill Testing (GREEN)

Phase 3: Rationalization Testing (REFACTOR)

Quick Start

Detailed Testing Guide

Success Criteria

See Also

Subagent Testing - TDD for Skills

Table of Contents

Overview

Why Fresh Instances Matter

The Priming Problem

Fresh Instance Benefits

Testing Methodology

Phase 1: Baseline Testing (RED)

Phase 2: With-Skill Testing (GREEN)

Phase 3: Rationalization Testing (REFACTOR)

Quick Start

Detailed Testing Guide

Success Criteria

See Also

subagent-testing

Subagent Testing - TDD for Skills

Table of Contents

Overview

Why Fresh Instances Matter

The Priming Problem

Fresh Instance Benefits

Testing Methodology

Phase 1: Baseline Testing (RED)

Phase 2: With-Skill Testing (GREEN)

Phase 3: Rationalization Testing (REFACTOR)

Quick Start

Detailed Testing Guide

Success Criteria

See Also

同仓库更多 Skills

Subagent Testing - TDD for Skills

Table of Contents

Overview

Why Fresh Instances Matter

The Priming Problem

Fresh Instance Benefits

Testing Methodology

Phase 1: Baseline Testing (RED)

Phase 2: With-Skill Testing (GREEN)

Phase 3: Rationalization Testing (REFACTOR)

Quick Start

Detailed Testing Guide

Success Criteria

See Also

同仓库更多 Skills