game-verify
// Human playtest verification with structured feedback and fix loop. Use with /game-verify after /game-build for manual testing of game features.
| name | game-verify |
| description | Human playtest verification with structured feedback and fix loop. Use with /game-verify after /game-build for manual testing of game features. |
| reads | ["feature.requirements","feature.build","backlog.stage"] |
| writes | ["feature.tests","backlog.stage"] |
| metadata | {"author":"mileszeilstra","version":"2.3.0","category":"game"} |
This is FASE 3 of the gamedev workflow: plan -> define -> build -> verify -> refactor
The verify phase handles human playtest verification of implemented game features through structured feedback, intelligent issue categorization, and iterative fix loops until all items pass.
Trigger: /game-verify or /game-verify {feature-name} or /game-verify {feature-name} {feedback}
This skill activates in these scenarios:
Primary use:
- /game-build completes implementation
- .project/features/{name}/feature.json exists with tests.checklist[]

Context indicators:
- tests.checklist[]

NOT for:
/game-verify water-ability
1:PASS
2:PASS
3:FAIL puddle too small
4:FAIL no sound
/game-verify water-ability
/game-verify water-ability
Everything works except puddle is too small and there's no sound
Code quality rules:
../shared/RULES.md (R009)
| Type | Example | Action |
|---|---|---|
| TESTABLE | "puddle radius=50, should be 100" | TDD fix loop |
| MEASURABLE | "animation too slow" | Direct fix + re-test |
| SUBJECTIVE | "doesn't feel right" | Ask for details |
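The triage in the table above can be sketched as a simple keyword heuristic. This is an illustrative assumption, not part of the skill spec; the real classification is done by reading the note in context:

```python
import re

def categorize_feedback(notes: str) -> str:
    """Triage a FAIL note into a fix strategy.
    Keyword lists here are illustrative assumptions, not the skill's rules."""
    if re.search(r"\d", notes):
        return "TESTABLE"      # concrete value present -> TDD fix loop
    if re.search(r"\b(too|no|missing|wrong)\b", notes):
        return "MEASURABLE"    # clear direction, no target value -> direct fix
    return "SUBJECTIVE"        # vague -> ask for details first
```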
Feedback: "puddle radius 50, should be 100"
|
Generate test: test_puddle_radius_is_100()
|
Run test -> FAIL
|
Fix code
|
Run test -> PASS
|
"Fixed. Re-test item 3."
Feedback: "animation too slow"
|
Adjust animation_speed from 1.0 to 1.5
|
"Fixed. Re-test item 4."
(No automated test possible)
Feedback: "doesn't feel right"
|
"Can you be more specific?
- Too fast/slow?
- Too strong/weak?
- Wrong timing?
- Something else?"
|
User: "damage feels too low, expected 30 not 20"
|
Now TESTABLE -> TDD fix loop
Fase tracking — first action of the skill: call TaskCreate with these 10 items (status pending), then use TaskUpdate to set each phase to in_progress at its start and completed at its end. The task list survives context compaction, so there is no risk of skipped phases.
Todo: call TaskCreate with the 10 phase items (see above). Mark FASE 0 → in_progress via TaskUpdate.
Goal: Load playtest checklist from build phase and prepare for feedback.
Steps:
Check backlog for built features (if no feature name provided):
Read(".project/backlog.html")
Parse the <script id="backlog-data"> block (see shared/BACKLOG.md): data.features.filter(f => f.status === "DOING" && f.stage === "built"). Use AskUserQuestion if built features are found:
Parse user input:
tests.checklist
Worktree switch: run the procedure in shared/WORKTREE.md with the feature name. It switches automatically to worktree-{feature-name} if that exists. On FAIL (currently in a different feature's worktree): stop with the message from WORKTREE.md.
Locate playtest checklist:
.project/features/{feature-name}/feature.json → tests.checklist[]
Validate feature.json exists with tests.checklist:
If not found:
NOT FOUND: feature.json with tests.checklist
Feature: {feature-name}
Searched: .project/features/{feature-name}/
This feature needs to be built first.
Run /game-build {feature-name} to implement and generate playtest checklist.
-> Exit skill
Read playtest checklist + classify items:
Read tests.checklist[] from feature.json (classify via the title field). COVERED: GUT tests from /game-build already verify this requirement. Check tests/test_{feature}.gd for matching test functions. COVERED items are already verified, so skip them in the playtest.
CHECKLIST CLASSIFICATION:
COVERED ({N} items, skip: already verified by GUT tests):
- Item {X}: {description} → test_{function}()
MANUAL ({M} items, playtest needed):
- Item {Y}: {description}
If all items are COVERED → skip the playtest, go to FASE 6 completion.
Goal-backward verification: map tests back to the acceptance criteria:
Filter: skip requirements with deltaOp === "REMOVED"; do not include them in the mapping.
Build the mapping from feature.json requirements[].acceptance ([{ when, then }] objects) and the classified items. Each { when, then } scenario = one row:
| REQ | When | Then (expected) | Test Items | Coverage |
|---|---|---|---|---|
| REQ-1 | enemy hit by attack | enemy takes damage | Item 1, 3 | ✓ |
| REQ-2 | critical hit registered | knockback is applied | — | GAP |
GAP: a requirement without test items (COVERED or MANUAL). MISMATCH: test items that verify implementation details instead of observable gameplay (the test title references internal functions rather than player-visible behavior).
No gaps, no mismatches → show Acceptance mapping: {n}/{n} REQs covered and continue.
Gaps or mismatches → AskUserQuestion:
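The mapping step can be sketched as follows. The requirementId link between checklist items and requirements is assumed from this skill's feature.json conventions:

```python
def build_acceptance_mapping(requirements, checklist):
    """One row per {when, then} scenario; rows without test items are GAPs.
    REMOVED requirements are filtered out, as specified above."""
    rows = []
    for req in requirements:
        if req.get("deltaOp") == "REMOVED":
            continue  # removed requirements are excluded from the mapping
        items = [c["item"] for c in checklist
                 if c.get("requirementId") == req["id"]]
        for sc in req.get("acceptance", []):
            rows.append({"req": req["id"], "when": sc["when"],
                         "then": sc["then"], "items": items,
                         "coverage": "OK" if items else "GAP"})
    return rows
```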
Verify playtest scene exists:
Scene path: .project/features/{feature-name}/playtest_scene.tscn
If scene exists:
PLAYTEST SCENE FOUND
Path: .project/features/{feature-name}/playtest_scene.tscn
Debug listener: Active
If scene NOT found:
ERROR: Playtest scene not found
Expected: .project/features/{feature-name}/playtest_scene.tscn
The /game-build phase should have created this scene.
Run /game-build {feature-name} first, or check if build completed successfully.
-> Exit skill
6b. Post-Build Baseline Check (if a build section exists in feature.json)
Two checks before the playtest:
Check 1: Full GUT Regression Suite
Run the full GUT test suite to verify that all features still work:
"{godot_executable}" --headless --path . -s addons/gut/gut_cmdln.gd -gexit
Parse output (same rules as game-build: PASS 1 line, FAIL max 10 lines).
BASELINE: full suite → {passed}/{total} PASS
On failures:
Distinguish failures of the CURRENT feature vs OTHER features
Current feature fails → warn, continue with the playtest (this is what we are about to test)
Other feature fails → warn:
⚠ REGRESSION: {N} tests from other features are failing
- test_{other}.test_xxx: {reason}
Use AskUserQuestion:
If GUT is unavailable or there are no test files → skip with: BASELINE: skipped (no GUT tests found)
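The current-vs-other split can be sketched as below, assuming the test_{feature}.gd file naming convention from /game-build holds:

```python
def triage_failures(failed_tests, feature):
    """Bucket failing GUT tests: current feature (expected to fail until the
    playtest fixes land) vs other features (regressions worth a warning)."""
    current, regressions = [], []
    for name in failed_tests:  # e.g. "test_water_ability.gd::test_damage"
        (current if f"test_{feature}" in name else regressions).append(name)
    return current, regressions
```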
Check 2: Integration Test Scene
Re-run the integration test scene as an additional check:
"{godot_executable}" --headless --path . -s res://tests/scenes/test_{feature}_runtime.tscn
Parse output voor FINAL:PASS of FINAL:FAIL.
Display: BASELINE: integration tests → {PASS|FAIL}
On FAIL: warn ("Integration tests failing: possible regression since build"), show the failed tests, continue with the playtest.
If the integration test scene does not exist → skip, no output.
6c. Cross-Requirement Gameplay Scenarios (if a build section exists and there are 2+ requirements)
Analyze requirements[] from feature.json. Identify combinations where gameplay interactions touch multiple requirements.
Generate at most 3 gameplay scenarios:
GAMEPLAY SCENARIOS: {feature}
| # | Scenario | Requirements |
|---|-----------------------------------------------|------------------|
| G1| Use ability → observe effect → check cooldown | REQ-001 + REQ-003 |
| G2| Ability on multiple enemies → do effects stack? | REQ-001 + REQ-002 |
| G3| Ability while moving → position correct? | REQ-001 + REQ-004 |
Add these to the checklist as extra items with "integration": true, "type": "MANUAL".
They are included in the playtest instructions.
If there are no logical cross-requirement combinations → skip, no output.
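Candidate combinations can be enumerated mechanically; deciding which pairs represent real gameplay interactions remains a judgement call. A sketch (the G-ids and item fields follow the table and step above):

```python
from itertools import combinations

def gameplay_scenarios(requirement_ids, limit=3):
    """Pairwise requirement combos, capped at 3, shaped as extra checklist items."""
    pairs = list(combinations(requirement_ids, 2))[:limit]
    return [{"id": f"G{i + 1}", "requirements": list(pair),
             "integration": True, "type": "MANUAL"}
            for i, pair in enumerate(pairs)]
```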
Launch game with test scenario:
mcp__godot-mcp__run_project(
projectPath=".",
scene=".project/features/{feature-name}/playtest_scene.tscn"
)
Display test scenario from feature.json tests.checklist:
GAME LAUNCHED - {feature-name}
==============================
TEST SCENARIO:
Step 1: {first action from checklist}
Step 2: {second action}
Step 3: {third action}
...
EXPECTED BEHAVIOR:
- {expected result 1}
- {expected result 2}
- {expected result 3}
Debug tracking is active - your actions are being logged.
Perform these actions in the game.
Close the game when you are done.
Note: Game runs in background. DebugListener captures all debug_* signals.
Wait for user completion:
When user closes game or indicates ready, use AskUserQuestion tool:
Response handling:
If "Everything works": -> Skip to FASE 6 (Completion), mark all items PASS
If "There are problems": -> Proceed to FASE 1b (Debug Analysis + User Details)
If "Game crashed":
crash_output = mcp__godot-mcp__get_debug_output()
-> Analyze crash, show error, offer to fix via TDD loop
Display checklist (if no feedback provided):
PLAYTEST CHECKLIST: {feature-name}
| # | Test | Expected |
|---|------|----------|
| 1 | {test description} | {expected behavior} |
| 2 | {test description} | {expected behavior} |
| 3 | {test description} | {expected behavior} |
| 4 | {test description} | {expected behavior} |
Play the game and report results.
Feedback formats:
- Quick: "1:PASS 2:PASS 3:FAIL too small 4:FAIL no sound"
- Detailed: "Items 1-2 work, item 3 puddle too small, item 4 missing sound"
-> Wait for user feedback
Output (if feedback provided):
PLAYTEST LOADED
Feature: {feature-name}
Items: {count}
Feedback: received
-> Parsing feedback...
Tag backlog + capture baseline:
.project/backlog.html (if it exists): parse the JSON (see shared/BACKLOG.md). Find the feature by name → set "stage": "testing", data.updated → now (Edit, keep <script> tags intact).
mkdir -p .project/session
git status --porcelain | sort > .project/session/pre-skill-status.txt
echo '{"feature":"{feature-name}","skill":"test","startedAt":"{ISO timestamp}"}' > .project/session/active-{feature-name}.json
Todo: mark FASE 0 → completed, FASE 1 → in_progress.
Goal: Extract PASS/FAIL status and notes from user feedback.
Steps:
Detect feedback format:
1:PASS 2:FAIL note
Parse numbered format:
import re
# Pattern: {number}:{PASS|FAIL} [optional notes]
results = []
for num, status, notes in re.findall(
        r"(\d+):(PASS|FAIL)\s*(.*?)(?=\s*\d+:(?:PASS|FAIL)|$)", feedback, re.S):
    results.append({"item": int(num), "status": status,   # PASS or FAIL
                    "notes": notes.strip() or None})      # optional description
Parse free text format:
Handle ambiguous input:
If cannot parse:
Use AskUserQuestion tool:
Build results array:
results = [
{"item": 1, "status": "PASS", "notes": None},
{"item": 2, "status": "PASS", "notes": None},
{"item": 3, "status": "FAIL", "notes": "puddle too small"},
{"item": 4, "status": "FAIL", "notes": "no sound"},
]
Output:
FEEDBACK PARSED
| # | Status | Notes |
|---|--------|-------|
| 1 | PASS | - |
| 2 | PASS | - |
| 3 | FAIL | puddle too small |
| 4 | FAIL | no sound |
Passed: 2 items
Failed: 2 items
-> Categorizing issues...
Todo: mark FASE 1 → completed, FASE 1b → in_progress.
Goal: Combine debug output with user feedback for accurate issue identification.
When to use: This phase runs when the user selected "There are problems" in step 7.
Steps:
Capture debug output:
debug_output = mcp__godot-mcp__get_debug_output()
Parse debug log:
[DEBUG] and [PLAYTEST] lines
DEBUG TIMELINE:
00:01.234 [DEBUG] WaterAbility.execute() called - target: TestTarget
00:01.235 [DEBUG] WaterAbility.execute() complete - damage: 20
00:01.456 [PLAYTEST] debug_ability_used: {damage: 20, target: "TestTarget"}
00:02.100 [DEBUG] Puddle.spawn() called - position: (100, 200)
00:02.789 [PLAYTEST] debug_puddle_spawned: {position: (100, 200)}
...
Show the checklist and ask which items did not work:
Checklist items ({N} total):
1. [Visuals] {item-1}
2. [Controls] {item-2}
3. [Audio] {item-3}
...
Ask: "Which items did not work? Give the numbers (e.g. 1, 3, 5) or none if everything worked."
Parse → failure set, continue to Step 4 for specifics per selected item.
For each selected problem, ask specifics:
You selected: "Item 3: Puddle spawns on impact"
What exactly was the problem?
- Puddle did not appear?
- Wrong position?
- Wrong size?
- Something else?
Wait for user description.
Correlate with debug output:
ISSUE ANALYSIS: Item 3
User feedback: "Puddle did not appear"
Debug log analysis:
✓ debug_ability_used signal: FOUND at 00:01.456
✗ debug_puddle_spawned signal: NOT FOUND
Expected sequence:
1. ability.execute() -> ✓
2. puddle.spawn() -> ✗ NOT CALLED
Conclusion: spawn_puddle() method was never invoked
Likely cause: Missing call in execute() after damage calculation
Generate enriched feedback for FASE 2:
Convert to structured feedback with debug context:
results = [
{
"item": 3,
"status": "FAIL",
"notes": "puddle not spawning",
"debug_context": {
"missing_signals": ["debug_puddle_spawned"],
"last_signal": "debug_ability_used",
"root_cause": "spawn_puddle() not called"
}
},
]
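The correlation step boils down to checking which expected debug_* signals never appeared in the captured log. A minimal sketch, assuming signals are emitted as [PLAYTEST] lines like the timeline above:

```python
def missing_signals(expected, debug_log):
    """Return expected signal names that have no [PLAYTEST] line in the log."""
    return [sig for sig in expected
            if not any("[PLAYTEST]" in line and sig in line
                       for line in debug_log)]
```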
Output:
DEBUG ANALYSIS COMPLETE
Issues identified: {count}
Debug correlation: {matched}/{total} items have debug evidence
| # | Issue | Debug Evidence |
|---|-------|----------------|
| 3 | No puddle | debug_puddle_spawned missing |
| 4 | No sound | debug_sound_played missing |
Root causes identified: {count}
-> Proceeding to categorize issues with debug context...
-> Continue to FASE 2 (Categorize Issues) with enriched feedback
Todo: mark FASE 1b → completed, FASE 2 → in_progress.
Goal: Determine fix approach for each failed item.
Steps:
For each FAIL item, analyze notes:
Check for TESTABLE indicators:
Examples:
Check for MEASURABLE indicators:
Examples:
Default to SUBJECTIVE:
Examples:
Handle SUBJECTIVE issues immediately:
For each SUBJECTIVE item, use AskUserQuestion tool:
After clarification:
Build categorized issues list:
issues = [
{"item": 3, "type": "TESTABLE", "notes": "puddle radius 50, should be 100", "action": "TDD fix"},
{"item": 4, "type": "MEASURABLE", "notes": "no sound on cast", "action": "Direct fix"},
]
Output:
FEEDBACK ANALYSIS
PASSED: 2 items (1, 2)
FAILED: 2 items (3, 4)
| # | Issue | Type | Action |
|---|-------|------|--------|
| 3 | puddle radius 50->100 | TESTABLE | TDD fix loop |
| 4 | no sound on cast | MEASURABLE | Direct fix |
-> Starting fix loop...
Todo: mark FASE 2 → completed, FASE 3 → in_progress.
Goal: Fix all issues using appropriate method for each type.
Process for each issue (in order):
Step 0: Assess Fix Complexity
Before fixing, determine if research is needed:
| Complexity | Example | Research? |
|---|---|---|
| Simple | Change a number value | No |
| Medium | Add new property/method | No |
| Complex | Refactor signal flow, add state | Yes |
Complexity indicators:
SIMPLE (no research):
- "puddle radius 50 -> 100" -> Just change the value
- "damage too low" -> Just increase the value
- "missing sound" -> Just add AudioStreamPlayer
- "animation speed wrong" -> Just adjust the speed
COMPLEX (offer research):
- "ability doesn't chain correctly" -> Signal flow issue
- "state machine not transitioning" -> State logic issue
- "cooldown resets unexpectedly" -> Timer/state interaction
- "collision not detecting properly" -> Physics layer issue
- "node references breaking" -> Scene tree / lifecycle issue
If complex fix detected:
Use AskUserQuestion tool:
If research requested:
Task(subagent_type="godot-code-researcher", prompt="
Feature: {feature-name}
Fix needed: {description of issue}
Current code:
{relevant code snippet}
Problem: {what's wrong}
Goal: {what should happen}
Research GDScript patterns for this fix.
")
Use research findings to inform the fix implementation below.
Step 1-5: TDD Fix (potentially informed by research)
Generate test based on feedback:
GENERATING TEST for item {N}
Issue: {description}
Expected: {concrete value from feedback}
Write test file:
# tests/test_{feature}_{item}.gd
extends GutTest
func test_{specific_behavior}() -> void:
# Arrange
var {object} = {setup}
# Act
var result = {action}
# Assert
assert_eq(result, {expected_value}, "{description}")
Run test (expect FAIL):
"{godot_executable}" --headless --path . -s addons/gut/gut_cmdln.gd -gexit -gtest=res://tests/test_{feature}_{item}.gd
If test PASSES (unexpected):
UNEXPECTED: Test already passes
The test passes with current code.
Possible causes:
- Issue was already fixed
- Test doesn't capture the real problem
- Feedback was based on old version
Use AskUserQuestion tool:
If test FAILS (expected):
TEST FAILS (expected)
Test: test_{specific_behavior}
Expected: {expected_value}
Actual: {actual_value}
-> Implementing fix...
Implement fix:
Run test again (expect PASS):
TEST PASSES
Item {N} fixed via TDD.
Change: {description of fix}
File: {file:line}
If test still fails after fix:
Identify code location:
DIRECT FIX for item {N}
Issue: {description}
Location: {file:line}
Current value: {current}
Apply fix directly:
Report fix:
FIXED (cannot auto-verify)
Change: {what was changed}
From: {old value}
To: {new value}
File: {file:line}
Needs manual re-test.
After all issues processed:
FIX LOOP COMPLETE
| # | Type | Status | Change |
|---|------|--------|--------|
| 3 | TESTABLE | FIXED (test passes) | Puddle radius 50->100 |
| 4 | MEASURABLE | FIXED (needs re-test) | Added cast sound |
New tests added: 1
Files modified: 2
-> Generating re-test checklist...
Todo: mark FASE 3 → completed, FASE 4 → in_progress.
Goal: Create minimal checklist for only the fixed items.
Steps:
Filter to fixed items only:
Generate re-test checklist:
RE-TEST REQUIRED
The following items were fixed and need verification:
| # | Test | Change Made |
|---|------|-------------|
| 3 | Puddle size | Now 100px (was 50px) |
| 4 | Sound on cast | Added AudioStreamPlayer |
Play the game and verify these specific items.
Feedback format:
3:PASS
4:PASS
Or if still failing:
3:FAIL still too small
4:PASS
Wait for re-test feedback
Output:
AWAITING RE-TEST
Fixed items: {count}
Provide feedback when ready.
Todo: mark FASE 4 → completed, FASE 5 → in_progress.
Goal: Process re-test feedback and loop until all pass.
Steps:
Parse re-test feedback:
Evaluate results:
If all re-tests PASS: -> Continue to FASE 6 (Completion)
If any re-tests FAIL:
RE-TEST RESULTS
| # | Status | Notes |
|---|--------|-------|
| 3 | FAIL | still too small |
| 4 | PASS | - |
1 item still failing.
Handle persistent failures:
Use AskUserQuestion tool:
Loop back to FASE 2:
Todo: mark FASE 5 → completed, FASE 5c → in_progress.
Skip when:
Goal: Verify that the fixes did not break previously passing tests.
Re-run gut_cmdln.gd for ALL existing test files (not only the fixed ones). Compare the output against the FASE 0.6b baseline.
REGRESSION CHECK: {feature-name}
GUT test suite re-run...
Baseline: {n} pass / {n} fail
Now: {n} pass / {n} fail
New failures:
- test_{x}.gd::test_{method}: {assertion error}
Regressions: {n} | Stable: {n}
No regressions: Continue to FASE 6.
Regressions: Show them and offer a choice via AskUserQuestion: Fix (Recommended) | Accept. On fix → back to FASE 3 for only the regression items. Do NOT repeat FASE 5c after a regression fix (max 1 pass).
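The baseline comparison is a set difference over failing test names; a minimal sketch:

```python
def regressions(baseline_failed, current_failed):
    """Tests failing now that were not failing at the FASE 0.6b baseline."""
    return sorted(set(current_failed) - set(baseline_failed))
```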
Todo: mark FASE 5c → completed, FASE 5d → in_progress.
Skip when: All tests FAIL (a coverage check is pointless under catastrophic failures).
Cross-check feature.json requirements against the test results:
Load the requirement → test mapping:
requirements[] entries (id, description, status); skip entries with deltaOp === "REMOVED"
tests.checklist[] entries via requirementId
Build the coverage matrix:
REQUIREMENT COVERAGE: {feature-name}
| REQ | Description (short) | Tests | Status |
|-----------|----------------------------|-------|---------------|
| REQ-001 | {first 40 chars} | 2 | ✓ COVERED |
| REQ-002 | {first 40 chars} | 0 | ✗ NO TEST |
| REQ-003 | {first 40 chars} | 1 | ⊘ BLOCKED |
| REQ-004 | {first 40 chars} | 0 | ? UNCLEAR |
Coverage: {covered}/{total} requirements ({percentage}%)
Non-testable: BLOCKED={n} UNCLEAR={n} (reopening needed)
Classify per requirement:
COVERED: a checklist item with matching requirementId AND status PASS. FAIL: a matching item with status FAIL. NO TEST: no checklist[] entry with matching requirementId (and no legitimate reason).
All requirements COVERED: show a compact summary, continue to FASE 6.
For NO TEST, FAIL, BLOCKED or UNCLEAR requirements:
Per uncovered requirement, AskUserQuestion:
header: "REQ not covered: {REQ-ID}"
question: "{requirement description}: no test found. What do you want to do?"
options:
- label: "Add test (Recommended)", description: "Write a test for this requirement"
- label: "Covered by another test", description: "Implicitly tested via another test"
- label: "Blocked by dependency", description: "Missing asset/addon/preset: not testable right now"
- label: "Criteria too vague", description: "Acceptance criteria lack concreteness: reopen /game-define"
multiSelect: false
Add test → append to tests.checklist[] with requirementId, status: "pending". Loop back to FASE 1 for a GUT test, or to FASE 2 (MANUAL) for only this item.
Covered by another test → record implicitCoverage: "{REQ-ID} test also validates this via {description}". Status → "PASS".
Blocked by dependency → status "BLOCKED", add requirements[].evidence = "blocked by: {reason}". Not merge-blocking; a signal to reopen after the dependency fix.
Criteria too vague → status "UNCLEAR", add requirements[].evidence = "needs clarification: {what's vague}". A signal to reopen /game-define to formulate concrete acceptance criteria.
Todo: mark FASE 5d → completed, FASE 6 → in_progress.
Goal: Sync user on fixes, capture observations, mark feature as verified and update documentation.
Skip this step if all items passed on first attempt (no fixes needed).
The Fix Sync ensures the user understands what changed in the codebase during the test-fix cycle.
0a) Claude summarizes — per fix, in plain language:
FIX SYNC: {feature-name}
=========================
{For each fix applied:}
Fix {N}: {item title}
- Problem: {what was wrong, in plain language}
- Change: {what was modified} ({file:line})
- Approach: {why this fix, not an alternative — only if non-obvious}
- Watch out: {anything the user should know — only if relevant}
{Example:}
Fix 1: Puddle too small
- Problem: Puddle radius was 50px, user expected 100px
- Change: Doubled PUDDLE_RADIUS constant (scripts/abilities/water_ability.gd:12)
Fix 2: No sound on cast
- Problem: AudioStreamPlayer was missing from the ability scene
- Change: Added AudioStreamPlayer2D with cast_sound.ogg (scenes/abilities/water_ability.tscn)
- Watch out: Sound uses AudioBus "SFX" — make sure this bus exists in project audio settings
0b) Comprehension check via AskUserQuestion:
If "Explain more" → explain each fix in more detail with before/after examples, then re-ask. If "I have a question" → answer the question, then re-ask. Loop until "Yes, clear".
0c) Save fix sync — store the summary for inclusion in feature.json tests.sessions[].
The user was actively playtesting and may have noticed issues outside the current feature scope. Capture these before closing out.
Use AskUserQuestion tool:
If "Yes" → ask the user to describe what they noticed (plain text, no modal). Record the observations for inclusion in feature.json observations[]. Do NOT attempt to fix these; they are out of scope.
After documenting, show confirmation:
OBSERVATION NOTED
Recorded in the test results.
Steps:
Confirm all items pass:
{FEATURE-NAME} COMPLETE!
All {N} playtest items passed.
| # | Test | Status |
|---|------|--------|
| 1 | {description} | PASS |
| 2 | {description} | PASS |
| 3 | {description} | PASS |
| 4 | {description} | PASS |
Feature ready for integration.
Parallel sync (feature.json + backlog + project.json + project-context.json):
Read in parallel (skip if missing):
.project/features/{feature-name}/feature.json
.project/backlog.html
.project/project.json
.project/project-context.json
Mutate in memory:
feature.json: status → "DONE", requirements[].status → "PASS" / "FAIL" / "BLOCKED" / "UNCLEAR" per item (BLOCKED/UNCLEAR include an evidence string), tests.checklist[].status → update per item with evidence. Add/update the tests section: finalStatus ("PASSED" all PASS / "FAILED" ≥1 FAIL / "PARTIAL" ≥1 BLOCKED or UNCLEAR, 0 FAIL), sessions[] (push { date, pass, fail, fixes }), fixSync, verificationCheckpoint (gaps, mismatches, adjustments). Add observations[] if the user reported out-of-scope issues. Do NOT overwrite other sections.
Backlog (see shared/BACKLOG.md): set .status = "DONE", data.updated → current date.
project.json: Feature status → "DONE". Merge new packages if relevant.
project-context.json: For fixes in FASE 3: update architecture.components[]; merge changed files into the component src/test, confirm status: "done".
Learning Extraction: extract project-wide learnings from the completed feature:
Read the just-written feature.json and evaluate (mandatory source tag per source):
build.decisions[] → type pattern, source extracted (architectural choices that affect other features)
tests.fixSync[] and tests.sessions[].fixes → type pitfall, source extracted (bugs with root causes)
observations[] → type observation, source inferred (cross-feature insights)
Filter: only items relevant beyond this single feature. Skip feature-specific implementation details.
Append to project-context.json → learnings[]:
{
"date": "YYYY-MM-DD",
"feature": "{feature-name}",
"type": "pattern|pitfall|observation",
"source": "extracted|inferred",
"summary": "Max 200 chars summary"
}
Dedup for each candidate learning:
Tokenize per the shared/LEARNING-EXTRACTION.md Dedup Tokenizer; compare against learnings[] entries with the same type:
Jaccard(candidate.tokens, existing.tokens) >= 0.55 → skip candidate
No learnings found → skip.
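The similarity check itself is plain Jaccard over the tokenizer's output; a sketch (tokenization is defined in shared/LEARNING-EXTRACTION.md, not here):

```python
def jaccard(a, b):
    """Jaccard similarity of two token collections; >= 0.55 counts as duplicate."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a or b) else 0.0
```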
Write back in parallel:
feature.json
backlog.html (keep <script> tags intact)
project.json
project-context.json (if context/architecture/learnings changed)
Scoped auto-commit (only this skill's changes):
Compare current git status with baseline from FASE 0:
git status --porcelain | sort > /tmp/current-status.txt
Categorize files by comparing with .project/session/pre-skill-status.txt:
git add automatically.
If the baseline file doesn't exist, fall back to git add -A.
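The snapshot comparison can be sketched as follows, assuming both files hold sorted `git status --porcelain` lines as produced in FASE 0:

```python
def files_to_stage(baseline_lines, current_lines):
    """Paths that changed since the pre-skill snapshot; pre-existing dirt
    stays unstaged. Porcelain lines look like ' M scripts/foo.gd'."""
    new_entries = set(current_lines) - set(baseline_lines)
    return sorted(line[3:] for line in new_entries)  # drop 2-char status + space
```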
git commit -m "$(cat <<'EOF'
test({feature}): verified - all {N} items pass
Playtest verification complete.
- Fixed: {list of fixes}
- Tests added: {count}
EOF
)"
Clean up: rm -f .project/session/pre-skill-status.txt .project/session/active-{feature-name}.json /tmp/current-status.txt
Output:
VERIFICATION COMPLETE
Feature: {feature-name}
Status: DONE
Items: {N}/{N} passing
Committed: test({feature}): verified
Next steps:
1. /game-refactor → code quality check + learnings extraction
2. /game-define {next-feature} → pick up the next feature
Worktree integration hint: add one extra line if both conditions are true:
current branch matches the worktree-* pattern (git branch --show-current)
status: "DONE" in backlog
Append:
💡 Feature done: run /core-merge {feature-name} to integrate into main/develop
.project/features/{feature-name}/
└── feature.json # Enriched with tests section (finalStatus, sessions, checklist updates, observations)
User: /game-verify water-ability
Claude: PLAYTEST CHECKLIST: water-ability
| # | Test | Expected |
|---|------|----------|
| 1 | Water damage visible | Enemy shows damage effect |
| 2 | Heal effect on player | Health increases, visual feedback |
| 3 | Puddle on ground | Circular puddle appears at impact |
| 4 | Sound on cast | Whoosh sound plays |
Play the game and report results.
Format: 1:PASS/FAIL [notes]
User: 1:PASS
2:PASS
3:FAIL too small
4:FAIL no sound
Claude: FEEDBACK ANALYSIS
PASSED: 2 items (1, 2)
FAILED: 2 items (3, 4)
| # | Issue | Type | Action |
|---|-------|------|--------|
| 3 | "too small" | SUBJECTIVE | Need details |
| 4 | "no sound" | MEASURABLE | Direct fix |
[AskUserQuestion: "Clarification for Item 3"
question: "'too small' is not specific enough. What do you mean?"
options: "2x larger", "3x larger", "Specific radius (px)", "Other"]
User: 2x larger
Claude: Now testable! Starting TDD fix loop...
GENERATING TEST for item 3
Issue: Puddle size should be 2x current
[Creates test_water_ability_puddle_size.gd]
[Runs test -> FAILS as expected]
[Implements fix: radius * 2]
[Runs test -> PASSES]
DIRECT FIX for item 4
Issue: No sound on cast
[Adds AudioStreamPlayer to scene]
FIX LOOP COMPLETE
| # | Type | Status |
|---|------|--------|
| 3 | TESTABLE | FIXED (test passes) |
| 4 | MEASURABLE | FIXED (needs re-test) |
RE-TEST REQUIRED
| # | Test | Change Made |
|---|------|-------------|
| 3 | Puddle size | Now 2x original |
| 4 | Sound on cast | Added AudioStreamPlayer |
User: 3:PASS
4:PASS
Claude: WATER-ABILITY COMPLETE!
All 4 playtest items passed.
Feature ready for integration.
Committed: test(water-ability): verified
Follow the Language Policy in CLAUDE.md. AskUserQuestion labels in user's preferred language.
This skill must NEVER:
This skill must ALWAYS:
{godot_executable} in commands is resolved via paths.yaml:
macOS: /Applications/Godot.app/Contents/MacOS/Godot
Windows: C:\Godot\Godot_v4.4.1-stable_win64.exe
Override: env var CLAUDE_GODOT_EXECUTABLE or .claude/paths.local.yaml. Canonical defaults live in skills/project-add/paths.yaml.
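The resolution order can be sketched as follows (the defaults are the documented ones above; how paths.local.yaml is parsed is out of scope here, so its value is passed in):

```python
import os
import sys

def godot_executable(local_yaml_value=None):
    """Resolve the Godot binary: env var override first, then the
    .claude/paths.local.yaml value, then the platform default."""
    env = os.environ.get("CLAUDE_GODOT_EXECUTABLE")
    if env:
        return env
    if local_yaml_value:
        return local_yaml_value
    if sys.platform == "darwin":
        return "/Applications/Godot.app/Contents/MacOS/Godot"
    return r"C:\Godot\Godot_v4.4.1-stable_win64.exe"
```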
Todo: mark FASE 6 → completed.