| name | mdproof-cli-e2e-test |
| description | Write and run E2E test runbooks that exercise the mdproof CLI itself. Use this skill whenever you need to: verify a new feature works end-to-end, validate a bug fix via a reproducible markdown runbook, test CLI flags (--dry-run, --fail-fast, --steps, --report json, --output), regression-test assertion types (substring, regex, exit_code, jq, snapshot), or confirm that parser/executor changes didn't break real-world runbook files. This skill produces .md files in runbooks/ that mdproof can run against itself — the tool testing itself. If you're about to write or run an E2E test for mdproof, use this skill first. |
| argument-hint | [feature-to-test | bug-to-reproduce | 'all'] |
| targets | ["claude","codex"] |
Write and run E2E test runbooks that validate mdproof through its own CLI. mdproof is a Markdown runbook test runner — the most natural way to E2E test it is with Markdown runbooks that exercise its features.
Scope: This skill creates .md test runbooks in runbooks/ and runs them inside the devcontainer. It does NOT write Go code (use implement-feature for that).
When to Use This
- After implementing a new feature — verify it works end-to-end
- After fixing a bug — create a regression test runbook
- Before a release — run all E2E runbooks to validate
- When Go unit tests pass but you suspect a CLI-level integration issue
Architecture
runbooks/
├── 01-basics-proof.md # Core: version, help, dry-run, pass/fail
├── 02-assertions-proof.md # All 5 assertion types (substring, regex, exit_code, negation, snapshot)
├── 03-hooks-lifecycle-proof.md # build/setup/teardown + fail-fast
├── 04-advanced-proof.md # --steps, --from, --inline, --coverage
├── 05-config-strict-proof.md # mdproof.json config, strict mode
├── 06-output-formats-proof.md # JSON, file output, verbosity levels
├── 13-source-aware-proof.md # source-aware failures + removed flag regression
└── fixtures/ # Small runbooks used as test targets
├── hello-proof.md # Single step: echo hello → "hello"
├── failing-proof.md # Intentional assertion mismatch
├── multi-step-proof.md # 3 steps with shared env (export/use)
├── with-exit-code-proof.md # exit_code: 0 and exit_code: 1
├── with-regex-proof.md # regex: pattern matching
├── with-negated-proof.md # "Should NOT contain" assertion
├── with-snapshot-proof.md # snapshot: assertion (needs -u first run)
├── with-directives-proof.md # timeout/retry/depends HTML comments
├── with-inline.md # <!-- mdproof:start/end --> markers
├── fail-fast-proof.md # Multi-step where step 1 fails
├── source-aware-assert-proof.md # stable assertion-line fixture
├── source-aware-exit-proof.md # stable command-range fixture
├── source-aware-broken.md # parser error fixture
└── verbose-proof.md # Assertions visible with -v/-v -v
File Naming Convention
mdproof's ResolveFiles() auto-discovers files matching: *_runbook.md, *-runbook.md, *_proof.md, *-proof.md. Name all runbooks and fixtures with -proof.md suffix. Exception: with-inline.md uses --inline flag which has its own discovery.
ssenv — Isolated Test Environments
ssenv creates isolated HOME directories within the devcontainer for clean test execution. Scripts are in .devcontainer/bin/ (on PATH inside the container).
ssenv create <name>
ssenv enter <name> -- <cmd>
ssrm <name>
ssls
CRITICAL: ssenv changes CWD — ssenv enter does cd $HOME (the isolated HOME), so all mdproof calls with relative paths MUST be wrapped:
ssenv enter test-foo -- mdproof runbooks/fixtures/hello-proof.md
ssenv enter test-foo -- bash -c "cd /workspace && mdproof runbooks/fixtures/hello-proof.md"
The ONE exception: commands that don't use file paths (like mdproof --version) work without wrapping:
ssenv enter test-foo -- mdproof --version
Workflow
Phase 1: Environment Check
Verify devcontainer is running and binary is built:
DEVC="docker compose -f .devcontainer/docker-compose.yml exec -w /workspace mdproof-devcontainer"
$DEVC mdproof --version || {
echo "Devcontainer not running. Start with: make devc-up"
exit 1
}
$DEVC make build
Phase 2: Determine Scope
Based on $ARGUMENTS:
- Feature name (e.g., "fail-fast") → create/update specific runbook in
runbooks/
- Reporting change (e.g., source-aware failure output) → add a focused runbook with stable line-number fixtures
- Bug description → create regression runbook as a new fixture + test step
- "all" → run all existing runbooks:
mdproof runbooks/
Phase 3: Write Test Runbook
Create a .md file that follows mdproof's own runbook format. The key pattern is meta-testing: the runbook's bash steps invoke mdproof against fixture runbooks via ssenv for isolation.
Template: Standard E2E Runbook with ssenv
# <Feature Name>
<One-line description of what this validates.>
## Steps
### Step 1: Create isolated environment
` ``bash
ssenv create test-<feature>
echo "environment ready"
` ``
Expected:
- created: test-<feature>
### Step 2: Test the feature
` ``bash
ssenv enter test-<feature> -- bash -c "cd /workspace && mdproof runbooks/fixtures/hello-proof.md"
` ``
Expected:
- 1/1 passed
### Step N: Cleanup environment
` ``bash
ssrm test-<feature>
` ``
Expected:
- deleted: test-<feature>
Template: Testing Error Cases
### Step N: Verify failure is detected
` ``bash
ssenv enter test-<feature> -- bash -c "cd /workspace && mdproof runbooks/fixtures/failing-proof.md 2>&1; echo EXIT=\$?"
` ``
Expected:
- EXIT=1
- 0/1 passed
Phase 4: Execute
Run E2E runbooks inside the devcontainer. mdproof and ssenv scripts are on PATH via Dockerfile ENV:
DEVC="docker compose -f .devcontainer/docker-compose.yml exec -w /workspace mdproof-devcontainer"
$DEVC mdproof runbooks/01-basics-proof.md
$DEVC bash -c 'mdproof runbooks/'
$DEVC bash -c 'mdproof -v runbooks/'
Phase 5: Report Results
Present results as:
E2E Results:
✓ 01-basics-proof.md — 8/8 passed
✓ 02-assertions-proof.md — 8/8 passed
✗ 04-advanced-proof.md — 7/8 passed, 1 failed
└─ Step 5: regex assertion did not match (expected ...)
If any runbook fails:
- Show the failing step's stdout/stderr
- Identify whether it's a test bug (wrong assertion) or a code bug (mdproof regression)
- Fix and re-run
Known Gotchas
Read .mdproof/lessons-learned.md before writing runbooks. It accumulates patterns and gotchas discovered through actual execution. Key ones:
--steps/--from summary counts all steps, not just selected
--from N skips earlier exports (persistent session)
- macOS host binary won't work in Linux container
- Snapshot files persist across runs
- ssenv changes CWD to isolated HOME (wrap with
cd /workspace)
- Sandbox requires files inside CWD
After running runbooks, record new discoveries in .mdproof/lessons-learned.md.
Alternative: Use mdproof sandbox for auto-container provisioning instead of manual devcontainer setup. See skills/SKILL.md § Container Safety.
Assertion Patterns for Meta-Testing
When writing runbooks that test mdproof output, use these patterns:
Exit code checking
Expected:
- exit_code: 0
JSON output with jq
Expected:
- jq: .summary.total > 0
- jq: .summary.failed == 0
- jq: .steps[0].status == "passed"
- jq: .steps | length == 3
- jq: .steps[0].source.heading.start.line == 5
- jq: .steps[0].assertions[0].source.start.line == 13
Substring in output
Expected:
- passed
- Should NOT contain error
Regex patterns
Expected:
- regex: \d+ passed
- regex: duration.*\dms
Source-aware failure output
Expected:
- FAIL runbooks/fixtures/source-aware-assert-proof.md:13 Step 1: Assertion failure
- Assertion runbooks/fixtures/source-aware-assert-proof.md:13 expected output
- Command runbooks/fixtures/source-aware-exit-proof.md:7-10
Negation (output must NOT contain)
Expected:
- Should NOT contain ERROR
- No crash
- Must NOT contain panic
Runbook Quality Checklist
Before committing a test runbook, verify:
Rules
- All execution inside devcontainer — mdproof refuses to run outside containers
- Build inside container — never use a host-built binary;
make build inside the container
- ssenv for isolation — every runbook creates/destroys its own environment
- Self-contained — each runbook must work independently, no ordering dependency
- Stable assertions — no timestamps, PIDs, or other non-deterministic values
- Clean up — ssrm at end of each runbook; temp files in
/tmp/
- Meta-test pattern — runbooks test mdproof by invoking mdproof (the binary under test)
- Report failures clearly — include stdout/stderr when a step fails