Name: Harness
Author: junnv93

updated:May 6, 2026 at 04:12


"author": "junnv93"

"repository": "junnv93/equipment_management_system"

name	harness
description	3-Agent harness orchestrator (Planner → Generator → Evaluator loop). Auto-select execution mode based on task complexity. Reuse existing verify-*/review-* skills as Evaluator infrastructure. Trigger on "하네스", "/harness", "harness mode", or when starting non-trivial multi-file implementation tasks.
argument-hint	[request text] or [mode0|mode1|mode2|load-bearing|entropy]

Harness — Generator-Evaluator Orchestrator

Applies OpenAI's "Harness Engineering" pattern. Existing verify-* and review-* skills are reused as the Evaluator, and an automatic iteration loop enforces quality.

Core Principles

Generate and evaluate separately: self-evaluation bias is universal; always run the Evaluator as an independent Agent
Evaluate against contract, not intuition: judge PASS/FAIL only against the explicit MUST/SHOULD criteria in contract.md
Be a skeptical evaluator: identify legitimate issues, then do NOT talk yourself into approving them. If it fails a criterion, it fails. Period.
Constrain deliverables, not implementation: the Planner decides "what", the Generator decides "how". Over-specifying implementation details causes cascading errors
Reuse existing infrastructure: do not write new verification logic. Orchestrate the verify-* and review-* skills
Repository is the record system: execution plans, completed work, and technical debt are all version-controlled under .claude/exec-plans/. Agents cannot see context that lives only in external chats or documents
Repair cost < waiting cost: SHOULD-criterion failures do not block the loop; handle them in a follow-up PR
Advisor pattern (spend on strategy, save on execution): the Planner (strategic judgment) uses model: "opus"; the Evaluator (mechanical verification) uses model: "sonnet". The Generator runs directly in the main context (opus). This reduces cost while preserving quality

References

Handoff formats: references/handoff-formats.md (contract/evaluation-report/exec-plan schemas)
Example prompts: references/example-prompts.md (real-world Mode 0/1/2 prompt examples by domain)

Handoff Files

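Agents communicate through files. Format details: references/handoff-formats.md

Slug rule: when a task starts, choose a kebab-case slug (e.g. loading-tsx, monitoring-cache-stats). Use this slug in the contract and evaluation-report filenames to prevent collisions when multiple sessions run concurrently.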

File	Path	Producer → Consumer
exec-plan	`.claude/exec-plans/active/YYYY-MM-DD-{slug}.md`	Planner → Generator
contract	`.claude/contracts/{slug}.md`	Planner/Harness → Evaluator
evaluation-report	`.claude/evaluations/{slug}.md`	Evaluator → Generator/User

Completed plans: move from active/ to completed/ (Step 7).
Completed contracts: move .claude/contracts/{slug}.md to .claude/contracts/completed/{slug}.md (Step 7).
Index: .claude/contracts/REGISTRY.md lists active and backlog contracts. After creating a new contract, add it to the Active section of REGISTRY.md.
Technical debt tracking: .claude/exec-plans/tech-debt-tracker.md (maintained cumulatively).
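
As a rough illustration of the slug rule and the path table, the sketch below derives the three handoff paths from a task title. The helper names `to_slug` and `handoff_paths` are hypothetical and not part of the skill; the directory layout is taken from the table above.

```python
import re
from datetime import date
from pathlib import Path

CLAUDE_DIR = Path(".claude")


def to_slug(title: str) -> str:
    """Lower-case the title and join its alphanumeric runs with hyphens (kebab-case)."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def handoff_paths(slug: str, day: date) -> dict:
    """Return the exec-plan, contract, and evaluation-report paths for one slug."""
    return {
        "exec-plan": CLAUDE_DIR / "exec-plans" / "active" / f"{day.isoformat()}-{slug}.md",
        "contract": CLAUDE_DIR / "contracts" / f"{slug}.md",
        "evaluation-report": CLAUDE_DIR / "evaluations" / f"{slug}.md",
    }


paths = handoff_paths(to_slug("Monitoring cache stats"), date(2026, 5, 6))
print(paths["exec-plan"].as_posix())
```

Because every filename carries the slug, two sessions working on different tasks write to different contract and evaluation files.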

Step 1: Determine Mode

Use explicit mode if user specifies (mode0, mode1, mode2). Otherwise, analyze request and auto-select.

Mode	Condition	Execution
0 (Direct)	≤3 files, no logic change (i18n, config, typo, docs, adding a Step to SKILL.md)	Bypass harness
1 (Lightweight)	4-15 files, single domain, existing patterns	Generator → Evaluator loop
2 (Full)	15+ files AND (DB schema change OR new module OR multi-domain)	Planner → Generator → Evaluator

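Mode 2 entry gate (AND condition): even when the file count is high, choose Mode 1 if the work stays within a single domain and extends existing patterns. The Planner (opus) Agent is expensive, so use Mode 2 only when both sides of the AND condition hold.

Mode 1 first: when it is hard to decide between Mode 1 and Mode 2, choose Mode 1. If the Generator can implement the change directly from CLAUDE.md, without a Planner or an exec-plan, Mode 1 is enough.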
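
The mode table and the entry gate can be read as a small decision function. The sketch below is one possible reading, not the skill's actual logic: the `Task` fields are hypothetical, "15+" is treated as 15 or more, and a change of 3 or fewer files that does alter logic (a case the table does not cover) is assumed to fall back to Mode 1.

```python
from dataclasses import dataclass


@dataclass
class Task:
    file_count: int
    logic_change: bool = True
    schema_change: bool = False
    new_module: bool = False
    multi_domain: bool = False


def select_mode(task: Task) -> int:
    """Pick harness mode 0, 1, or 2 from the task's size and shape."""
    structural = task.schema_change or task.new_module or task.multi_domain
    if task.file_count <= 3 and not task.logic_change:
        return 0  # Direct: bypass the harness
    if task.file_count >= 15 and structural:
        return 2  # Full: both sides of the AND gate hold
    return 1  # Lightweight: the default when in doubt


print(select_mode(Task(file_count=40)))                      # many files, single domain
print(select_mode(Task(file_count=20, schema_change=True)))  # many files plus schema change
```

Returning 1 as the fall-through mirrors the Mode 1 first rule: a large but single-domain change never reaches the Planner.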

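Session context check (preventing context bloat)

Before starting the harness, check whether the current session is already heavy:

The harness has already completed at least once in this session, or
a long conversation has been kept going with "continue" / "계속"

→ Run /clear and start the harness in a fresh context. The same applies after an idle gap of 5 minutes or more: once the cache TTL (5 minutes) expires, the whole context is recomputed as a cold hit and the cache benefit is lost.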

Report determination to user in one line and confirm before proceeding. Mode 0 exits this skill immediately.

Step 2: Run Planner (Mode 2 only)

Mode 1 → skip to Step 3.

Launch the Planner Agent with model: "opus" (strategic design judgment; advisor role).

Directives:

Read CLAUDE.md for project rules and architecture
Explore related existing code (similar modules, established patterns)
Constrain deliverables without over-specifying implementation: define WHAT each file should achieve, NOT HOW to code it. Detailed technical specs cause cascading errors when assumptions break.
Context management: put the detail in the execution plan, but structure it so agents can reach the information they need step by step. Do not put everything in one file; lead from a table of contents into deeper documents.
Generate .claude/exec-plans/active/YYYY-MM-DD-{slug}.md: a phase-based plan with files and verification commands. Create the .claude/exec-plans/ directory automatically if it does not exist.
Generate .claude/contracts/{slug}.md: MUST/SHOULD criteria with domain-specific success criteria
Present both to user for approval before proceeding

Step 3: Prepare contract.md

Mode 2: Already created by Planner in Step 2. Skip.
Mode 1: Auto-generate lightweight contract from changed file analysis.

Read references/handoff-formats.md for contract.md schema.

Mode 1 default MUST criteria: tsc --noEmit + build + verify-implementation PASS + backend test PASS.

Save to .claude/contracts/{slug}.md. Create .claude/contracts/ directory if it doesn't exist.
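
To make the Mode 1 auto-generation concrete, here is a minimal sketch that drafts a contract from a list of changed files and the default MUST criteria. The real schema lives in references/handoff-formats.md and is not reproduced here, so the markdown layout, the `M-n`/`S-n` numbering, and the function name `draft_contract` are all assumptions made for illustration.

```python
from typing import Sequence

# Default MUST criteria for Mode 1, as listed in this step.
DEFAULT_MUST = [
    "tsc --noEmit",
    "build",
    "verify-implementation PASS",
    "backend test PASS",
]


def draft_contract(slug: str, changed_files: Sequence[str], should: Sequence[str] = ()) -> str:
    """Build a lightweight contract as markdown text (hypothetical layout)."""
    lines = [f"# Contract: {slug}", "", "## Changed files"]
    lines += [f"- {path}" for path in changed_files]
    lines += ["", "## MUST"]
    lines += [f"- [ ] M-{i}: {c}" for i, c in enumerate(DEFAULT_MUST, 1)]
    lines += ["", "## SHOULD"]
    lines += [f"- [ ] S-{i}: {c}" for i, c in enumerate(should, 1)]
    return "\n".join(lines) + "\n"


print(draft_contract("loading-tsx", ["app/loading.tsx"]))
```

The returned text would then be written to .claude/contracts/{slug}.md after the directory has been created.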

Step 4: Run Generator

Implement code per exec-plan (Mode 2) or user request (Mode 1).

Generator constraints:

Follow CLAUDE.md Behavioral Guidelines (minimal code, surgical changes)
Mode 2: do NOT modify files not listed in exec-plan
Do NOT "improve" adjacent code — implement exactly what is asked
Run self-check after implementation: tsc --noEmit, basic build

Proceed to Step 5 when implementation is complete.

Step 5: Run Evaluator

CRITICAL: Launch as independent Agent with model: "sonnet". Do NOT self-evaluate.

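The Evaluator's main work is mechanical verification (grep patterns, tsc, builds, checklist comparison), so sonnet is a good fit. The verify-* skills are Grep-based pattern matching and MUST/SHOULD verdicts are binary, so opus-level judgment is unnecessary.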

Evaluator Agent prompt must include these directives:

You are a skeptical QA agent. Your job is to find problems, not to approve work.

IMPORTANT CALIBRATION:
- When you identify a legitimate issue, do NOT rationalize it away
- Do NOT say "this is minor" or "this is acceptable" for genuine failures
- If a contract criterion fails, mark it FAIL regardless of surrounding quality
- Grade against hard thresholds — partial credit does not exist for MUST criteria

Steps:
1. Read .claude/contracts/{slug}.md for success criteria
2. Run build verification (tsc, build, test as applicable)
3. Run verify-implementation workflow (existing 13 verify-* skills)
4. Run review-architecture (Mode 2 only)
5. [Frontend changes] Run playwright-e2e for runtime browser verification
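   - Check rendering, interaction, and error states of the changed routes directly in the browser
   - Verify runtime behavior that static checks (tsc) cannot catch
6. Compare each MUST criterion against results → PASS/FAIL
7. Track "change since previous iteration" to detect repeated failures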
8. Write .claude/evaluations/{slug}.md per handoff format

Do NOT modify any code. Report only.

Read references/handoff-formats.md for evaluations/{slug}.md schema.

Branch on result

Result	Action
All MUST PASS	→ Step 7 (final report)
FAIL + iteration < 3	→ Step 6 (fix loop)
FAIL + same issue 2x consecutive	→ Step 7 with "design-level issue, manual intervention needed"
FAIL + iteration ≥ 3	→ Step 7 with "max iterations reached, manual intervention needed"

SHOULD-criterion failures do not block the loop. Record them in evaluations/{slug}.md and classify them as follow-up work in Step 7.
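
The branch table can be sketched as a function over the MUST verdicts only, since SHOULD failures never block the loop. The function name and signature are hypothetical. One assumption is made explicit: when a failure is both a repeat and still under the iteration limit, the repeat rule is checked first, because the table does not say which row wins.

```python
from typing import Dict, Tuple

MAX_ITERATIONS = 3


def next_step(must_results: Dict[str, bool], iteration: int, repeated_issue: bool) -> Tuple[int, str]:
    """Return (step number, note) for the state after an Evaluator run."""
    if all(must_results.values()):
        return 7, "final report"
    if repeated_issue:  # same issue failed twice in a row
        return 7, "design-level issue, manual intervention needed"
    if iteration >= MAX_ITERATIONS:
        return 7, "max iterations reached, manual intervention needed"
    return 6, "fix loop"


print(next_step({"M-1": True, "M-2": False}, iteration=1, repeated_issue=False))
```

Only MUST verdicts are passed in, so a run with failing SHOULD criteria and passing MUST criteria still goes straight to the final report.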

Step 6: Fix Loop

Extract FAIL issues from evaluations/{slug}.md
Apply fixes per Evaluator's specific repair instructions
Return to Step 5

Track iteration count. Detect same-issue recurrence by comparing current FAIL issues against previous evaluations/{slug}.md entries (match by file path + issue description).
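
Recurrence detection by file path plus issue description could look like the sketch below. Issues are modeled as `(file_path, description)` tuples, which is an assumption about how the evaluation report is parsed; normalizing case and whitespace before comparing is also an assumption, added so that trivially reworded entries still match.

```python
from typing import List, Tuple

Issue = Tuple[str, str]  # (file path, issue description)


def issue_key(file_path: str, description: str) -> Issue:
    """Normalize an issue so that case and spacing differences do not hide a repeat."""
    return file_path.strip(), " ".join(description.lower().split())


def recurring_issues(previous: List[Issue], current: List[Issue]) -> List[Issue]:
    """Return the current FAIL issues that also appeared in the previous evaluation."""
    previous_keys = {issue_key(*issue) for issue in previous}
    return [issue for issue in current if issue_key(*issue) in previous_keys]


previous = [("src/a.ts", "Missing  null check"), ("src/b.ts", "Hardcoded API path")]
current = [("src/a.ts", "missing null check"), ("src/c.ts", "Unused import")]
print(recurring_issues(previous, current))
```

A non-empty result on two consecutive evaluations is the signal for the "design-level issue" exit rather than another fix attempt.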

Step 7: Final Report

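7a: Complete the execution plan lifecycle

# Make sure the destination directories exist before moving
mkdir -p .claude/exec-plans/completed .claude/contracts/completed

# exec-plan: move active → completed
mv .claude/exec-plans/active/YYYY-MM-DD-{slug}.md \
   .claude/exec-plans/completed/YYYY-MM-DD-{slug}.md

# contract: move root → completed/
mv .claude/contracts/{slug}.md \
   .claude/contracts/completed/{slug}.md

# Delete the matching row from the Active section of REGISTRY.md
# If any SHOULD criteria failed, add them to tech-debt-tracker.md
# Format: - [ ] {issue} — {file:line} — {date}

# Prune [x] entries from tech-debt-tracker.md
# Entries marked [x] are preserved in git history, so no archive move is needed
# (sed -i as written is GNU sed; BSD/macOS sed needs: sed -i '')
grep -q '^- \[x\]' .claude/exec-plans/tech-debt-tracker.md 2>/dev/null && \
  sed -i '/^- \[x\]/d' .claude/exec-plans/tech-debt-tracker.md || true

Mode 1 has no plan file, so skip the exec-plan move. The contract move applies to both Mode 1 and Mode 2.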
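
The tracker bookkeeping can also be sketched in Python: formatting one entry in the tracker's line format and dropping completed `[x]` lines. The function names are hypothetical, the ISO date format is an assumption (the source only says `{date}`), and `src/example.ts:42` is a made-up location.

```python
from datetime import date

SEP = " \u2014 "  # the dash separator used in the tracker line format


def debt_entry(issue: str, location: str, day: date) -> str:
    """Format one open tech-debt item: - [ ] {issue} {sep} {file:line} {sep} {date}."""
    return f"- [ ] {issue}{SEP}{location}{SEP}{day.isoformat()}"


def prune_completed(tracker_text: str) -> str:
    """Drop lines that start with '- [x]'; keep everything else in order."""
    kept = [line for line in tracker_text.splitlines() if not line.startswith("- [x]")]
    return "\n".join(kept) + ("\n" if kept else "")


print(debt_entry("Inline query key", "src/example.ts:42", date(2026, 5, 6)))
print(prune_completed("# Tech debt\n- [x] fixed item\n- [ ] open item\n"))
```

Pruning only matches lines that begin with `- [x]`, the same anchor the sed expression uses, so indented or quoted mentions of `[x]` are left alone.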

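Moving example-prompts entries to the archive

If example-prompts.md contains an entry for this sprint, it must be moved to the archive after completion.

1. Delete the sprint's block (the entire ### section) from example-prompts.md
2. Add it in completed form at the top of the matching domain archive-*.md file (directly below the ---):
   ## ~~{date} — {sprint-slug}~~ ✅ Done ({date})
   > Summary of the key points (what, why, result)
   > Verification: M-N/N PASS
   > Commit: {hash} {message}
3. Add one row to the section index table in archive-index.md (newest round at the top)
4. Update the "마지막 정리일" (last cleanup date) at the top of example-prompts.md

Domain mapping:

Content	File
Scripts / tooling / pre-push gate / E2E infrastructure / CI	`archive-infra.md`
Domain features (equipment, inspections, teams, SW, etc.)	`archive-domain.md`
Data migration	`archive-migration.md`
DOCX export / forms	`archive-export.md`
UI/UX design review	`archive-design.md`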

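7b: PR lifecycle (on PASS)

PASS → /git-commit → create PR → assigning agent reviewers is recommended

When agent review is available after the PR is created (review-architecture, review-design, etc.): do not re-enter the /harness loop; only address review feedback until every agent reviewer passes. Escalate to the user only when a judgment call is needed.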

7c: Final Summary

## Harness Result

| Item | Result |
|------|--------|
| Mode | Mode {N} |
| Iterations | {N} |
| Verdict | PASS / FAIL (manual intervention) |
| Changed | {N} files, +{X}/-{Y} lines |

### Contract Status
| Criterion | Verdict |
|-----------|---------|

### Post-merge Actions
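- PASS → run /git-commit, then create the PR
- SHOULD failures → check tech-debt-tracker.md
- FAIL → review evaluations/{slug}.md, then fix manually
- **⚠️ example-prompts.md cleanup (required)**: if example-prompts.md has an entry for this sprint, delete it and move it to the matching archive-*.md. See the four archive steps for example-prompts in Step 7a. If skipped, the completed entry stays in example-prompts.md permanently.
- Entropy check recommended: `/harness entropy` (after 3 or more runs)
- **manage-skills integration**: if this harness run introduced a new pattern or rule, run `/manage-skills` at the end of Step 7 (absorb it in the current context rather than in a separate session)
- **Next sprint prep**: if another harness run follows immediately, run `/clear` and start in a fresh context (cache TTL is 5 minutes; any gap causes a cold hit)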

Entropy Management (`/harness entropy`)

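Codebase entropy grows in proportion to agent throughput. As with garbage collection, cleaning up a little and often is far better than paying it all down at once.

Golden-rule checklist

Periodically check for patterns that agents repeatedly violate when generating code:

SSOT drift: are there locally redefined types or constants? (verify-ssot)
Hardcoding sprawl: are API paths, query keys, or error codes hardcoded inline? (verify-hardcoding)
Layer violations: have direct imports appeared that cross architectural boundaries? (review-architecture)
Context bloat: is CLAUDE.md or AGENTS.md turning into a single giant manual?
- Over 300 lines: splitting into topic-specific references/ files is recommended
Dead exec-plans: are there plans in active/ that have stayed incomplete for 30 days or more?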
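
Two of the checklist items have numeric thresholds and lend themselves to a mechanical check. The sketch below is illustrative only: it uses the `YYYY-MM-DD-` filename prefix as a proxy for a plan's age (an assumption, since a plan may have been updated since), and both function names are hypothetical.

```python
import re
from datetime import date
from typing import List

PLAN_NAME = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-.+\.md$")


def stale_plans(plan_names: List[str], today: date, max_age_days: int = 30) -> List[str]:
    """Return active plan filenames whose date prefix is at least max_age_days old."""
    stale = []
    for name in plan_names:
        match = PLAN_NAME.match(name)
        if not match:
            continue  # not a dated plan file, e.g. tech-debt-tracker.md
        created = date(*(int(part) for part in match.groups()))
        if (today - created).days >= max_age_days:
            stale.append(name)
    return stale


def is_bloated(text: str, max_lines: int = 300) -> bool:
    """True when a context file such as CLAUDE.md exceeds the line budget."""
    return len(text.splitlines()) > max_lines


names = ["2026-03-01-old-plan.md", "2026-05-01-new-plan.md", "tech-debt-tracker.md"]
print(stale_plans(names, today=date(2026, 5, 6)))
```

The remaining three items (SSOT drift, hardcoding, layer violations) are left to the existing verify-* and review-* skills named in the checklist.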

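Execution

When /harness entropy is invoked:

Check the 5 items above in order
Record violations in tech-debt-tracker.md
Handle anything that can be fixed automatically right away with a Mode 1 harness
Present anything that needs a design decision to the user

Recurring runs

Once the harness has run 3 or more times, periodic entropy checks are recommended. This can be automated with the /schedule skill, e.g. /harness entropy every Monday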

Load-Bearing Analysis (`/harness load-bearing`)

Quarterly recommended.

List all harness components (Planner, contract, evaluation-report, each verify-*)
For each: tally issue detection rate over last 5 harness runs
Components with 0% detection rate → "removal candidate"
Report each component's encoded assumption and whether it remains valid
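
The tally described above might be sketched as follows. Each run is modeled as a mapping from component name to the number of issues it detected, which is an assumed data shape; "detection rate" is taken to mean the fraction of recent runs in which the component found at least one issue, and the source does not define it more precisely.

```python
from typing import Dict, List


def detection_rates(runs: List[Dict[str, int]], window: int = 5) -> Dict[str, float]:
    """Fraction of the last `window` runs in which each component caught an issue."""
    recent = runs[-window:]
    components = sorted({name for run in recent for name in run})
    return {
        name: sum(1 for run in recent if run.get(name, 0) > 0) / len(recent)
        for name in components
    }


def removal_candidates(runs: List[Dict[str, int]], window: int = 5) -> List[str]:
    """Components with a 0% detection rate over the window."""
    return [name for name, rate in detection_rates(runs, window).items() if rate == 0]


runs = [
    {"verify-ssot": 2, "verify-hardcoding": 0, "Planner": 0},
    {"verify-ssot": 0, "verify-hardcoding": 0, "Planner": 1},
]
print(detection_rates(runs))
print(removal_candidates(runs))
```

A 0% rate only marks a candidate; the last step of the analysis still asks whether the assumption the component encodes remains valid before anything is removed.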

Skill Dependency Map

harness (this skill: orchestrator, runs in main context as opus)
  │
  ├── [Step 2] Planner Agent (model: opus): strategic design judgment
  ├── [Step 4] Generator: runs directly in the main context (opus)
  ├── [Step 5] Evaluator Agent (model: sonnet): mechanical verification
  │     ├── verify-implementation → 13 verify-* skills
  │     ├── review-architecture (Mode 2)
  │     ├── review-design (frontend changes)
  │     └── playwright-e2e (frontend runtime verification)
  └── [Step 7] git-commit (post-success)

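Model Selection Rationale (Advisor pattern)

Role	Model	Rationale
Planner	`opus`	Architecture decisions, file-structure design, trade-off judgment; needs the highest capability
Generator	`opus` (main)	Code quality determines the number of loop iterations; getting it right the first time is cheaper
Evaluator	`sonnet`	Grep pattern matching, tsc/build runs, checklist comparison; binary verdicts do not need opus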

Exceptions

Do NOT force harness on Mode 0 tasks: overkill for typos and config
Evaluator never modifies code: report only, Generator fixes
SHOULD criteria failures are NOT loop triggers: record in tech-debt-tracker, handle in follow-up PR
Test file pattern differences are not issues: inherit verify-* skill exceptions
Frontend-only changes: add playwright-e2e as the Evaluator's last step (skip if no browser is available)

