| name | verification-before-completion |
| description | Use when about to claim work is complete, fixed, or passing, before committing, or before creating a PR. Requires fresh verification evidence for every success claim, not assumptions. For browser-visible work such as pages, components, interactions, visual states, or UI bug fixes, require relevant fresh verification evidence; invoke `playwright-interactive` only when the user explicitly asks for browser-interactive verification or browser evidence is necessary to prove the claim. |
Verification Before Completion
Overview
Claiming work is complete without verification is dishonesty, not efficiency.
Core principle: Evidence before claims, always.
Violating the letter of this rule is violating the spirit of this rule.
The Iron Law
NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
If you haven't run the verification command in this message, you cannot claim it passes.
The Gate Function
BEFORE claiming any status or expressing satisfaction:
1. IDENTIFY: What evidence proves this claim?
2. CHECK SCOPE: Does the work affect anything a user sees or does in the browser?
3. RUN:
- If NO: execute the full verification commands
- If YES: execute the relevant verification commands; invoke `playwright-interactive` only when explicitly requested or necessary to prove the claim
4. READ:
- command output, exit codes, failures
- browser evidence, if collected: rendering, console, interactions, visible states
5. VERIFY: Does the evidence confirm the claim?
- If NO: State actual status with evidence
- If YES: State claim WITH evidence
6. ONLY THEN: Make the claim
Skip any step = lying, not verifying
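The gate above can be sketched as a small helper that can only phrase a claim from a command it has just run. This is a minimal illustration under assumptions, not part of the skill itself: the names `Evidence`, `run_verification`, and `state_claim` are hypothetical, and the sketch covers terminal commands only.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Evidence:
    """Fresh result of one verification command (hypothetical helper type)."""
    command: list[str]
    exit_code: int
    output: str


def run_verification(command: list[str]) -> Evidence:
    """RUN + READ: execute the command now and capture exit code and output."""
    result = subprocess.run(command, capture_output=True, text=True)
    return Evidence(command, result.returncode, result.stdout + result.stderr)


def state_claim(claim: str, evidence: Evidence) -> str:
    """VERIFY: phrase the status from the evidence, never from expectation."""
    shown = " ".join(evidence.command)
    if evidence.exit_code == 0:
        return f"{claim} (evidence: `{shown}` exited 0)"
    return f"NOT verified: {claim} (`{shown}` exited {evidence.exit_code})"


if __name__ == "__main__":
    # Stand-in for a real test command; replace with the project's own.
    evidence = run_verification([sys.executable, "-c", "print('34 passed')"])
    print(state_claim("Tests pass", evidence))
```

An exit code of 0 is only part of the READ step; the captured `output` still has to be read for failures or skipped checks before the claim is made.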
Browser-Visible Work Verification
If the work changes anything a user can see or do in the browser, terminal-only verification may be insufficient. Choose evidence that actually proves the claim without defaulting to expensive browser automation.
Treat all of the following as browser-visible work:
- Pages, routes, layouts, styles, responsiveness, or visual states
- Components that render in the browser
- Forms, dialogs, menus, tabs, navigation, or other interactions
- Client-side data fetching, loading states, validation, or error states
- Any bug fix where the symptom is observed in the page rather than only in logs or tests
For this class of work, run the most relevant fresh verification available: component tests, E2E tests, build/typecheck, lint, targeted package scripts, or browser checks. Invoke `playwright-interactive` only when the user explicitly requests it, asks for browser evidence, or the visible behavior cannot be proven by existing verification.
When Browser Evidence Is Used
When browser evidence is explicitly requested or necessary, collect fresh evidence that the relevant page or flow actually works:
- The page opens and renders the expected content
- There is no white screen, missing content, or obviously broken layout
- The browser console shows no relevant errors blocking the flow
- The targeted interaction or user path works end to end
- The visible result matches the requirement after the interaction completes
If the task is about a specific browser-only fix, verify the original symptom is gone in the browser, not just that tests or builds pass.
Failure Loop For UI Work
If verification finds a rendering issue, console error, interaction failure, or wrong visible result, the task is not complete.
Stay in the fix-and-reverify loop:
- Fix the specific issue the browser evidence exposed.
- Re-run the relevant verification commands.
- Re-run `playwright-interactive` on the affected page or flow only if it was explicitly requested or necessary for the evidence path.
- Continue until the browser evidence passes.
Do not treat build success, test success, or code inspection as a substitute for browser validation when browser validation is the only evidence that proves the visible requirement.
When Browser Verification Cannot Run
If browser-visible work explicitly requires `playwright-interactive` or browser evidence but you cannot run it yet, do not claim completion.
Examples of valid blockers:
- The local app will not start or stay up
- Required credentials, seed data, or environment variables are missing
- The target route or flow is inaccessible in the current environment
- Browser automation itself is blocked by the environment
In this case, report the real status plainly:
- What browser evidence is still missing
- Why it could not be collected
- What specific blocker must be resolved next
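The three items above can be sketched as a small report structure. This is a hypothetical illustration: `BlockedReport`, its field names, and the example values (including `DATABASE_URL`) are assumptions made for the sketch, not something the skill prescribes.

```python
from dataclasses import dataclass


@dataclass
class BlockedReport:
    """Status report for when required browser evidence cannot be collected."""
    missing_evidence: str  # what browser evidence is still missing
    reason: str            # why it could not be collected
    next_step: str         # what specific blocker must be resolved next

    def render(self) -> str:
        return (
            "Blocked pending browser verification\n"
            f"- Missing evidence: {self.missing_evidence}\n"
            f"- Why: {self.reason}\n"
            f"- Next: {self.next_step}"
        )


# Hypothetical example values, for illustration only.
report = BlockedReport(
    missing_evidence="checkout form submits and shows the confirmation state",
    reason="local app exits on startup because DATABASE_URL is unset",
    next_step="provide DATABASE_URL for the dev environment",
)
print(report.render())
```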
"Blocked pending browser verification" is honest. "Done except for Playwright" is not.
Common Failures
| Claim | Requires | Not Sufficient |
|---|---|---|
| Tests pass | Test command output: 0 failures | Previous run, "should pass" |
| Linter clean | Linter output: 0 errors | Partial check, extrapolation |
| Build succeeds | Build command: exit 0 | Linter passing, logs look good |
| Bug fixed | Test original symptom: passes | Code changed, assumed fixed |
| Regression test works | Red-green cycle verified | Test passes once |
| UI works / page fixed | Fresh relevant verification evidence; `playwright-interactive` evidence when explicitly requested or necessary | Code inspection, old screenshots, "looks correct" |
| Agent completed | VCS diff shows changes | Agent reports "success" |
| Requirements met | Line-by-line checklist | Tests passing |
Red Flags - STOP
- Using "should", "probably", "seems to"
- Expressing satisfaction before verification ("Great!", "Perfect!", "Done!", etc.)
- About to commit/push/PR without verification
- About to claim a UI change is complete without relevant fresh verification
- Trusting agent success reports
- Relying on partial verification
- Treating build or test success as proof that the page renders correctly when browser evidence was required
- Seeing browser errors or broken rendering and still trying to conclude the task
- Thinking "just this once"
- Tired and wanting work over
- ANY wording implying success without having run verification
Rationalization Prevention
| Excuse | Reality |
|---|---|
| "Should work now" | RUN the verification |
| "I'm confident" | Confidence ≠ evidence |
| "Just this once" | No exceptions |
| "Linter passed" | Linter ≠ compiler |
| "Agent said success" | Verify independently |
| "The page change was small" | Small UI changes still need relevant fresh verification |
| "The build passed so the UI is fine" | Passing build ≠ correct rendering or interaction |
| "I'm tired" | Exhaustion ≠ excuse |
| "Partial check is enough" | Partial proves nothing |
| "Different words so rule doesn't apply" | Spirit over letter |
Key Patterns
Tests:
✅ [Run test command] [See: 34/34 pass] "All tests pass"
❌ "Should pass now" / "Looks correct"
Regression tests (TDD Red-Green):
✅ Write → Run (pass) → Revert fix → Run (MUST FAIL) → Restore → Run (pass)
❌ "I've written a regression test" (without red-green verification)
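The red-green cycle above can be sketched as a single check. This is a minimal sketch under assumptions: `red_green_verified` and its three callables are hypothetical stand-ins for however the project runs the test and reverts or restores the fix (for example a VCS stash), and the usage example fakes them with a dictionary.

```python
from typing import Callable


def red_green_verified(
    run_test: Callable[[], bool],
    revert_fix: Callable[[], None],
    restore_fix: Callable[[], None],
) -> bool:
    """Return True only if the test passes with the fix and fails without it."""
    if not run_test():  # Run (pass): the test must pass with the fix in place
        return False
    revert_fix()
    try:
        failed_without_fix = not run_test()  # Run (MUST FAIL)
    finally:
        restore_fix()  # always restore the fix, even if the run raised
    # Run (pass) again after restoring, so the final state is verified too
    return failed_without_fix and run_test()


# Fake "codebase" for illustration: the test passes only while the fix is applied.
state = {"fixed": True}
result = red_green_verified(
    run_test=lambda: state["fixed"],
    revert_fix=lambda: state.update(fixed=False),
    restore_fix=lambda: state.update(fixed=True),
)
print(result)  # True
```

A test that still passes with the fix reverted makes the function return `False`, which is the case the "Test passes once" shortcut cannot detect.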
Build:
✅ [Run build] [See: exit 0] "Build passes"
❌ "Linter passed" (linter doesn't check compilation)
Browser-visible work:
✅ [Run tests/build as relevant; if using Playwright test runner, use `--reporter=line`; invoke `playwright-interactive` only when requested or necessary] + [Confirm evidence matches claim] "The page change passes verification"
❌ "Build passed so the page should be fine" / "Looks correct from the code"
Browser-visible work with blocker:
✅ [Try the required verification path] + [Identify the blocker] + [State what evidence is still missing] "Blocked pending verification because <specific reason>"
❌ "Everything else passed so this is basically done"
Requirements:
✅ Re-read plan → Create checklist → Verify each → Report gaps or completion
❌ "Tests pass, phase complete"
Agent delegation:
✅ Agent reports success → Check VCS diff → Verify changes → Report actual state
❌ Trust agent report
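The "Check VCS diff" step can be sketched as follows, assuming the work lives in a git repository. The function names are hypothetical; the only external call is `git status --porcelain`, whose lines consist of two status characters, a space, and the path.

```python
import subprocess


def parse_porcelain(output: str) -> list[str]:
    """Extract paths from `git status --porcelain` output."""
    # Each line: two status characters, one space, then the path.
    return [line[3:] for line in output.splitlines() if line.strip()]


def changed_files(repo_dir: str = ".") -> list[str]:
    """Ask git which files actually changed in the working tree."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_porcelain(result.stdout)


def actual_state(agent_reported_success: bool, files: list[str]) -> str:
    """Report what the working tree shows; the agent's report is only a hint."""
    if not files:
        prefix = "Agent reported success, but " if agent_reported_success else ""
        return f"{prefix}no changes found in the working tree"
    return f"{len(files)} changed file(s) still to verify: {', '.join(files)}"
```

In a repository this would be used as `actual_state(True, changed_files())`. The file list only shows where to look; each change still needs its own fresh verification before any completion claim.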
Why This Matters
From 24 failure memories:
- Your human partner said "I don't believe you" - trust broken
- Undefined functions shipped - would crash
- Missing requirements shipped - incomplete features
- Time wasted on false completion → redirect → rework
- Violates: "Honesty is a core value. If you lie, you'll be replaced."
When To Apply
ALWAYS before:
- ANY variation of success/completion claims
- ANY expression of satisfaction
- ANY positive statement about work state
- Committing, PR creation, task completion
- Moving to next task
- Delegating to agents
For browser-visible work:
- Invoke `playwright-interactive` only when explicitly requested, when the user asks for browser evidence, or when browser evidence is necessary to prove the claim
- If running `playwright test`, use `--reporter=line` unless the user explicitly requests another reporter
- Use the selected verification evidence to validate rendering, console health, and the key interaction path when those are part of the claim
- Stay in the fix-and-reverify loop until the selected evidence proves the visible behavior passes
Rule applies to:
- Exact phrases
- Paraphrases and synonyms
- Implications of success
- ANY communication suggesting completion/correctness
The Bottom Line
No shortcuts for verification.
Run the command. Read the output. THEN claim the result.
This is non-negotiable.