name	devflow-verification-gate
description	Use during the review state to prove every acceptance criterion with a captured evidence command. No "should work" claims — each AC gets a real shell/test/curl invocation whose stdout is recorded into the flow's testingInstructions. Emits the discipline-token that makes review→done possible under agent_with_discipline self-approval (DF-292).
flow_state	review
hooks	[]
discipline_token	devflow-verification-gate
ported_from	superpowers:verification-before-completion
iron_laws	["Each acceptance criterion gets a real evidence command — never claim \"should work\".","Capture stdout/stderr verbatim; truncation only at write-out, not during reasoning.","One command per criterion, deterministic, repeatable from main checkout."]

Skill: devflow-verification-gate

Iron-Law port of Superpowers verification-before-completion. The review-time guard that makes self-approval possible: prove the work, don't claim it.

When to use

In review state, in this exact order:

1. devflow-receiving-review: triage code-review findings (Critical / Important / Minor)
2. devflow-plan-reconciliation (DF-310): reconcile plan ↔ actual implementation (done/partial/missing/extra/moved)
3. devflow-verification-gate (this skill): prove every AC with a real evidence command
4. devflow-adr-compliance: ADR-glob path-check
5. devflow-knowledge-completer: close any wiki gaps
6. flow_update review → done with all 4 discipline-tokens

Under agent_with_discipline, skipping a step or running the steps out of order makes the gate block with discipline_incomplete.
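
The ordering rule can be sketched as a small pre-flight check. This is only an illustration: `REVIEW_ORDER` and `next_skill` are hypothetical names, and the authoritative check remains the backend gate, not local code.

```python
from typing import List, Optional

# Required review-state order, taken from the list above.
REVIEW_ORDER = [
    "devflow-receiving-review",
    "devflow-plan-reconciliation",
    "devflow-verification-gate",
    "devflow-adr-compliance",
    "devflow-knowledge-completer",
]


def next_skill(completed: List[str]) -> Optional[str]:
    """Return the next skill to run, or None when every skill is done.

    Raises ValueError when `completed` is not a prefix of REVIEW_ORDER,
    meaning a skill was skipped or run out of order.
    """
    if completed != REVIEW_ORDER[: len(completed)]:
        raise ValueError(f"skills ran out of order: {completed}")
    if len(completed) == len(REVIEW_ORDER):
        return None
    return REVIEW_ORDER[len(completed)]
```

When `next_skill` returns None, the remaining step is the final flow_update call.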

Process

1 · Gather acceptance criteria

Pull the flow's acceptance-criteria array. Each entry needs an evidence command — the kind of command a reviewer would run to confirm it themselves, not an internal assertion.

| AC kind | Typical evidence command |
| --- | --- |
| New API endpoint | `curl -X POST … \| jq` showing the 201 response |
| New backend behavior | `npm run test:api -- <suite>` |
| New frontend | `npx playwright test <spec>` or screenshot path |
| Migration | `sqlite3 data/devflow.db ".schema <table>"` |
| Doc-page | `curl /api/projects/X/docs/<id> \| jq .data.content` |
| Glob/regex helper | `vitest run <unit-test>` |

If a criterion does not lend itself to a single command, split it into smaller criteria first. Do not write a "summary" command that doesn't actually verify anything.
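
One way to keep criteria and their evidence commands paired is a small record per criterion. The sketch below assumes a hypothetical `Criterion` shape (the real acceptance-criteria array may look different) and flags entries that still lack a command.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Criterion:
    id: str            # e.g. "AC-1"
    text: str          # criterion wording from the flow
    command: str = ""  # evidence command a reviewer could rerun


def without_evidence(criteria: List[Criterion]) -> List[str]:
    """Return the ids of criteria that have no evidence command yet."""
    return [c.id for c in criteria if not c.command.strip()]
```

A non-empty result means a criterion still needs a command, or needs to be split, before moving on to step 2.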

2 · Run each command and capture output

Run each command and record its full stdout and stderr, exit code, and duration. Truncate to ~30 lines per command for the testing-instructions write-out; the full log goes into the discipline-token evidence.
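
A minimal capture helper might look like the following. It is a sketch under stated assumptions: `run_evidence`, `truncate`, and the `Evidence` record are hypothetical names, stderr is merged into stdout to preserve ordering, and the SHA-256 is assumed to be taken over the full captured output (the document does not say what `outputSha256` in step 4 hashes).

```python
import hashlib
import subprocess
import time
from dataclasses import dataclass


@dataclass
class Evidence:
    command: str
    exit_code: int
    output: str         # full stdout + stderr, verbatim
    duration_ms: int
    output_sha256: str  # assumption: hash of the full captured output


def run_evidence(command: str, timeout_s: float = 300.0) -> Evidence:
    """Run one evidence command through the shell and capture everything.

    Raises subprocess.TimeoutExpired if the command exceeds timeout_s.
    """
    start = time.monotonic()
    proc = subprocess.run(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        text=True,
        timeout=timeout_s,
    )
    duration_ms = int((time.monotonic() - start) * 1000)
    digest = hashlib.sha256(proc.stdout.encode("utf-8")).hexdigest()
    return Evidence(command, proc.returncode, proc.stdout, duration_ms, digest)


def truncate(output: str, max_lines: int = 30) -> str:
    """Shorten output for the write-out only; the full text stays in Evidence."""
    lines = output.splitlines()
    if len(lines) <= max_lines:
        return output
    omitted = len(lines) - max_lines
    return "\n".join(lines[:max_lines] + [f"... ({omitted} more lines truncated)"])
```

Keeping truncation in a separate function means it is applied only at write-out, so any reasoning about the result still sees the verbatim output.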

3 · Format testingInstructions

Write a markdown section per criterion:

### AC-N: <criterion text>

**Command:** `npm run test:api -- adr-compliance.test.ts`
**Exit code:** 0
**Output (truncated):**
\`\`\`
... 8 passed in 141ms ...
\`\`\`

Pass the full string as testingInstructions in the next flow_update call.
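
The template above can be rendered with a few lines of code. The function names and the tuple layout of `results` are assumptions made for this sketch, not part of DevFlow.

```python
from typing import List, Tuple

FENCE = "`" * 3  # built at runtime so this example nests cleanly in markdown


def format_section(index: int, criterion: str, command: str,
                   exit_code: int, output: str) -> str:
    """Render one criterion in the AC-N layout shown above."""
    return "\n".join([
        f"### AC-{index}: {criterion}",
        "",
        f"**Command:** `{command}`",
        f"**Exit code:** {exit_code}",
        "**Output (truncated):**",
        FENCE,
        output.rstrip("\n"),
        FENCE,
    ])


def format_testing_instructions(results: List[Tuple[str, str, int, str]]) -> str:
    """Join all sections; each result is (criterion, command, exit_code, output)."""
    return "\n\n".join(
        format_section(i, *result) for i, result in enumerate(results, start=1)
    )
```

The returned string is what would be passed as testingInstructions.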

4 · Emit the discipline-token

POST /api/flows/:id/discipline-tokens
{
  "skillName": "devflow-verification-gate",
  "evidence": {
    "criteria": [
      { "id": "AC-1", "command": "...", "exitCode": 0, "outputSha256": "...", "durationMs": 141 }
    ],
    "totalCommands": <n>,
    "allPassed": true,
    "completedAt": "<iso>"
  }
}
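
The request body can be assembled from the per-criterion records. This sketch assumes that `totalCommands` equals the number of criteria (one command per criterion), that `allPassed` means every exit code is 0, and that `<iso>` is an ISO 8601 UTC timestamp; `build_token_request` is a hypothetical helper name.

```python
import json
from datetime import datetime, timezone
from typing import Dict, List


def build_token_request(criteria: List[Dict]) -> Dict:
    """Build the POST body from records shaped like the criteria entries above."""
    return {
        "skillName": "devflow-verification-gate",
        "evidence": {
            "criteria": criteria,
            "totalCommands": len(criteria),
            # An empty list must not count as a pass.
            "allPassed": bool(criteria) and all(c["exitCode"] == 0 for c in criteria),
            "completedAt": datetime.now(timezone.utc).isoformat(),
        },
    }


body = build_token_request([
    {"id": "AC-1", "command": "...", "exitCode": 0,
     "outputSha256": "...", "durationMs": 141},
])
print(json.dumps(body, indent=2))
```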

The signed token is returned once. Pass it to flow_update along with any other required tokens:

PATCH /api/flows/:id
{
  "currentState": "done",
  "selfApproved": true,
  "disciplineTokens": ["<verification-gate-token>", "<adr-compliance-token>", "<test-suite-green-token>", "<knowledge-harvest-token>"]
}

The backend checks that project_configs.allow_agent_self_approval is on, fetches pipeline_steps.required_skills for the testing step, and validates that every required skill has a non-expired token in flow_discipline_tokens. If so, review→done succeeds. If not, it returns 403 with gate.reason='discipline_incomplete' and a list of the missing skills.
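
The token part of that check can be pictured roughly as below. This is not the backend's code: the function names, the `missingSkills` key, the 200 success status, and the response shapes are all assumptions, and the allow_agent_self_approval check is left out because the document does not describe how it fails.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Tuple


def missing_skills(required: List[str], expires_at: Dict[str, datetime],
                   now: datetime) -> List[str]:
    """Required skills that have no token, or only an expired one."""
    return [s for s in required if s not in expires_at or expires_at[s] <= now]


def gate(required: List[str], expires_at: Dict[str, datetime],
         now: datetime) -> Tuple[int, Dict]:
    """Return (status, body) for a review→done attempt (hypothetical shapes)."""
    missing = missing_skills(required, expires_at, now)
    if missing:
        return 403, {"gate": {"reason": "discipline_incomplete",
                              "missingSkills": missing}}
    return 200, {"currentState": "done"}


# Example: one token present and fresh, one skill with no token at all.
now = datetime.now(timezone.utc)
tokens = {"devflow-verification-gate": now + timedelta(hours=1)}
print(gate(["devflow-verification-gate", "devflow-adr-compliance"], tokens, now))
```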

Iron Laws

Each acceptance criterion gets a real evidence command — never claim "should work".

Capture stdout/stderr verbatim. Truncation only at write-out, not during reasoning.

One command per criterion, deterministic, repeatable from main checkout.

If an evidence command failed (exit code != 0), stop. Either fix the issue and re-verify, or — if the failure is known/intentional — escalate to the human reviewer with the failing output. Do not paper over a red command with optimistic prose.
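
The stop-on-red rule can be sketched as a loop that refuses to continue past a failing command. `verify_all`, `EvidenceFailed`, and the injected `run` callable are hypothetical names; the runner is passed in so the sketch does not assume how commands are executed.

```python
from typing import Callable, List, Tuple

# A runner takes a command string and returns (exit_code, output).
Runner = Callable[[str], Tuple[int, str]]


class EvidenceFailed(Exception):
    """Carries the failing command and its verbatim output for escalation."""

    def __init__(self, command: str, exit_code: int, output: str):
        super().__init__(f"evidence command failed ({exit_code}): {command}")
        self.command = command
        self.exit_code = exit_code
        self.output = output


def verify_all(commands: List[str], run: Runner) -> List[Tuple[str, int, str]]:
    """Run commands in order and stop at the first non-zero exit code."""
    results = []
    for command in commands:
        exit_code, output = run(command)
        if exit_code != 0:
            raise EvidenceFailed(command, exit_code, output)
        results.append((command, exit_code, output))
    return results
```

Raising instead of returning a partial result makes it hard to carry on to the token step with a red command; the exception holds the output to show the human reviewer.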

Output contract

flow.testing_instructions is the human-readable evidence digest.
flow_discipline_tokens row with skill_name='devflow-verification-gate' valid for 1 hour.
flows.code_approved_by is set to agent:devflow:v4.x (not the user-id) when the self-approval succeeds — audit trail clearly shows agent-driven approval.

Related

[[migration-test-strategy]]: Pillar A (skill triad) + Pillar B
[[knowledge-gated-workflow]]: Stage 12 in the visual diagram
[[superpowers-migration-status]]: live tracking
DF-292: port flow + agent_with_discipline activation (this implementation)
DF-289: discipline-tokens foundation
DF-290: devflow-adr-compliance (sibling skill, also gates review→done under agent_with_discipline)
Superpowers v5.0.7: verification-before-completion (original skill)
