check-reproducibility
Simulate a fresh-clone reproduction of the entire pipeline and diff the new outputs against the committed ones. Catches drift before paper submission or release.
| name | check-reproducibility |
| description | Simulate a fresh-clone reproduction of the entire pipeline and diff the new outputs against the committed ones. Catches drift before paper submission or release. |
| disable-model-invocation | true |
| argument-hint | |
| allowed-tools | ["Bash","Read","Grep","Glob"] |
Run the entire pipeline as if from a fresh clone, then diff the new output/ against the committed output/. Any drift is a reproducibility failure.
Pre-flight checks:
- The working tree is clean (git status shows no uncommitted changes); otherwise the diff is meaningless. If it is dirty, ask the user to commit or stash first.
- data/raw/ is non-empty, OR a RAW_DATA_RESTORE_CMD is configured (e.g., a make restore-raw target or a download URL documented in data/README.md).
Snapshot current outputs:
rm -rf /tmp/output_snapshot && cp -r output /tmp/output_snapshot
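The pre-flight checks and the snapshot step above could be scripted roughly as follows. This is a minimal sketch, not the project's actual script: `preflight` and `snapshot_outputs` are hypothetical helper names, and the sketch assumes RAW_DATA_RESTORE_CMD is exposed as an environment variable.

```shell
# Sketch only: preflight and snapshot_outputs are hypothetical helpers.

# Return 0 when the working tree is clean and raw data is available.
preflight() {
  # --porcelain prints one line per modified or untracked file; empty means clean.
  if [ -n "$(git status --porcelain)" ]; then
    echo "working tree is dirty: commit or stash first" >&2
    return 1
  fi
  # Raw data must exist locally, or a restore command must be configured
  # (assumed here to be an environment variable).
  if [ -z "$(ls -A data/raw 2>/dev/null)" ] && [ -z "${RAW_DATA_RESTORE_CMD:-}" ]; then
    echo "data/raw/ is empty and RAW_DATA_RESTORE_CMD is not set" >&2
    return 1
  fi
  return 0
}

# Copy output/ to a snapshot directory. Any stale snapshot is removed first,
# because cp -r would otherwise nest the copy inside the existing directory.
snapshot_outputs() {
  dest="${1:-/tmp/output_snapshot}"
  rm -rf "$dest"
  cp -r output "$dest"
}
```

A caller would typically run `preflight && snapshot_outputs` from the repository root and stop if either step fails.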
Clean the worktree (data/raw/ is gitignored, so it is explicitly excluded from the clean and preserved):
bash scripts/check_reproducibility.sh --clean-only
This wraps git clean -dfx -e data/raw -e .claude/state, which removes every other untracked or ignored file.
Re-run the pipeline:
bash scripts/run_pipeline.sh
Capture exit code; if non-zero, the pipeline itself failed → reproducibility cannot be assessed.
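One way to capture the exit code and stop early is sketched below. `run_and_report` is a hypothetical wrapper name; it simply runs whatever command it is given and passes the exit status through.

```shell
# Sketch only: run_and_report is a hypothetical helper.
# Runs the given command, reports a failure on stderr, and returns its exit code.
run_and_report() {
  "$@"
  status=$?
  if [ "$status" -ne 0 ]; then
    echo "pipeline failed (exit $status): reproducibility cannot be assessed" >&2
  fi
  return "$status"
}

# Usage, assuming the script from the step above exists:
#   run_and_report bash scripts/run_pipeline.sh || exit 1
```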
Diff:
diff -r /tmp/output_snapshot output | head -200
For binary files (PDF, PNG), diff will report differences but not show them. Compare the .csv companions of any flagged tables — those are text and can be diffed cell-by-cell.
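A cell-by-cell comparison of two .csv companions might look like the sketch below. `csv_cell_diff` is a hypothetical helper, and it assumes simple CSVs: no quoted commas, no embedded newlines, and a non-empty first file. Cells are compared as strings, so 1.0 and 1.00 count as different.

```shell
# Sketch only: csv_cell_diff is a hypothetical helper for simple CSV files.
# Usage: csv_cell_diff OLD.csv NEW.csv
csv_cell_diff() {
  awk -F',' '
    # First file: remember every row by its line number.
    NR == FNR { old[FNR] = $0; n_old = FNR; next }
    # Second file: compare each cell with the remembered row.
    {
      n_new = FNR
      m = split(old[FNR], a, ",")
      cols = (NF > m) ? NF : m
      for (i = 1; i <= cols; i++) {
        new_cell = (i <= NF) ? $i : ""
        # Appending "" forces a string comparison instead of a numeric one.
        if ((a[i] "") != (new_cell "")) {
          printf "row %d col %d: %s -> %s\n", FNR, i, a[i], new_cell
        }
      }
    }
    # Rows present only in the first file.
    END {
      for (r = n_new + 1; r <= n_old; r++) printf "row %d: missing in new file\n", r
    }
  ' "$1" "$2"
}
```

No output means the two tables match cell for cell; each printed line names the row and column that drifted.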
Categorize drift:
- .csv tables → FAIL (the analysis is non-reproducible; investigate seed, sample order, package versions)
- .pdf/.png figures → typically WARN (could be font rendering, scheme, or an actual difference; open both and compare)
Restore snapshot if drift acceptable (otherwise leave new outputs and investigate):
rm -rf output && mv /tmp/output_snapshot output
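The drift categories used before the restore step can be expressed as a small lookup on the file extension. This is a sketch under assumptions: `classify_drift` is a hypothetical helper, and the REVIEW fallback for other file types is not part of the rules above.

```shell
# Sketch only: classify_drift is a hypothetical helper.
# Maps a drifted file path to a severity based on its extension.
classify_drift() {
  case "$1" in
    *.csv)        echo "FAIL" ;;    # table values changed: non-reproducible
    *.pdf|*.png)  echo "WARN" ;;    # may be rendering only: compare visually
    *)            echo "REVIEW" ;;  # assumption: anything else needs a manual look
  esac
}

# Usage with hypothetical paths:
#   for f in output/tables/main.csv output/figures/fig1.pdf; do
#     printf '%s\t%s\n' "$(classify_drift "$f")" "$f"
#   done
```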
Report:
/check-reproducibility
→ Runs the full check on the current working tree.
/check-reproducibility after upgrading Stata to a new version
→ Reveals any version-sensitive results.
- [...] data/README.md.
- [...] set seed is at the top, ONCE, and bootstrap itself doesn't reseed internally. Check the do-file.
- reghdfe singleton drop: singleton drop depends on the order in which observations were merged; if merge order is not deterministic, this can drift. Add an explicit sort before estimation.
- [...] data/raw/. Triple-check that step 3 succeeded with data/raw/ intact.
- [...] output/tables/<stage> between runs).