check-reproducibility

Name: Check Reproducibility
Author: maxwell2732

Simulate a fresh-clone reproduction of the entire pipeline and diff the new outputs against the committed ones. Catches drift before paper submission or release.

stars: 79

forks: 125

updated: April 29, 2026, 23:56

SKILL.md

name	check-reproducibility
description	Simulate a fresh-clone reproduction of the entire pipeline and diff the new outputs against the committed ones. Catches drift before paper submission or release.
disable-model-invocation	true
argument-hint
allowed-tools	["Bash","Read","Grep","Glob"]

Check Reproducibility

Run the entire pipeline as if from a fresh clone, then diff the new output/ against the committed output/. Any drift is a reproducibility failure.

When to Use

Before submitting a paper
Before tagging a release
Before merging a major branch
When onboarding a new collaborator
After any non-trivial Stata version upgrade

Steps

Pre-flight checks:
- Working tree is clean (git status shows no uncommitted changes) — otherwise the diff is meaningless. If dirty, ask the user to commit/stash first.
- data/raw/ is non-empty, OR a RAW_DATA_RESTORE_CMD is configured (e.g., a make restore-raw target or a download URL documented in data/README.md).
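Both pre-flight conditions can be checked mechanically. The following is a minimal sketch, not the skill's own implementation: the function names `tree_is_clean` and `raw_data_available` are hypothetical, and it assumes `RAW_DATA_RESTORE_CMD` is exposed as an environment variable, which the skill does not specify.

```shell
# Hypothetical helpers; not part of the skill's scripts.

# Succeeds when `git status --porcelain` prints nothing for the repo at $1,
# i.e. there are no uncommitted or untracked (non-ignored) changes.
tree_is_clean() {
  local repo="${1:-.}" changes
  changes="$(git -C "$repo" status --porcelain)" || return 2  # not a repo, etc.
  [ -z "$changes" ]
}

# Succeeds when data/raw/ holds at least one entry, or when a restore
# command is configured (assumed here to be an environment variable).
raw_data_available() {
  local repo="${1:-.}"
  if [ -d "$repo/data/raw" ] && [ -n "$(ls -A "$repo/data/raw")" ]; then
    return 0
  fi
  [ -n "${RAW_DATA_RESTORE_CMD:-}" ]
}
```

A caller could stop early with `tree_is_clean . && raw_data_available . || echo "pre-flight failed"`, then ask the user to commit or stash before continuing.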
Snapshot current outputs:
```
rm -rf /tmp/output_snapshot   # a stale snapshot would make cp nest output/ inside it
cp -r output /tmp/output_snapshot
```
Clean the worktree (data/raw/ survives only because it is explicitly excluded; the -x flag also removes gitignored files):
```
bash scripts/check_reproducibility.sh --clean-only
```
This wraps git clean -dfx -e data/raw -e .claude/state, which deletes every untracked and ignored file except the two excluded paths. Tracked files are not touched.
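Because the clean cannot be undone, it can help to preview it first. `git clean` accepts `-n` (dry run), which lists what would be removed without deleting anything. The sketch below assumes the wrapper script uses exactly the flags quoted above; `preview_clean` is a hypothetical helper, and the source does not say whether `check_reproducibility.sh` has a dry-run mode of its own.

```shell
# Hypothetical helper: list what the clean step would delete, deleting nothing.
#   -d  include untracked directories
#   -x  include files matched by ignore rules
#   -n  dry run (print only)
#   -e  exclude a pattern from removal
preview_clean() {
  local repo="${1:-.}"
  git -C "$repo" clean -dfxn -e data/raw -e .claude/state
}
```

If any path under data/raw/ shows up in the listing from `preview_clean .`, stop before running the real clean.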
Re-run the pipeline:
```
bash scripts/run_pipeline.sh
```
Capture exit code; if non-zero, the pipeline itself failed → reproducibility cannot be assessed.
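One way to capture the exit code while keeping the pipeline's output for the report is sketched below. `run_logged` and the log path are hypothetical names chosen for illustration; only `scripts/run_pipeline.sh` comes from the skill itself.

```shell
# Hypothetical wrapper: run a command, keep its combined stdout/stderr in a
# log file, and return the command's own exit code.
run_logged() {
  local log="$1" rc=0
  shift
  "$@" >"$log" 2>&1 || rc=$?   # `|| rc=$?` keeps this safe under `set -e`
  if [ "$rc" -ne 0 ]; then
    echo "pipeline failed (exit $rc); reproducibility cannot be assessed" >&2
  fi
  return "$rc"
}
```

For example, `run_logged /tmp/pipeline.log bash scripts/run_pipeline.sh || exit 1` would skip the diff entirely when the re-run fails.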
Diff:
```
diff -r /tmp/output_snapshot output | head -200
```
For binary files (PDF, PNG), diff will report differences but not show them. Compare the .csv companions of any flagged tables — those are text and can be diffed cell-by-cell.
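A cell-by-cell comparison of two .csv companions can be sketched with awk. This is an illustration under a loud assumption: the files are plain comma-separated values with no quoted commas or embedded newlines, and the old file is not empty. `csv_cell_diff` is a hypothetical name, and cells are compared as strings so that `0.10` and `0.1` count as different.

```shell
# Hypothetical helper: report every cell that differs between two CSV files.
# Usage: csv_cell_diff OLD.csv NEW.csv
csv_cell_diff() {
  awk -F',' '
    NR == FNR { old[FNR] = $0; n_old = FNR; next }  # first file: remember rows
    {
      n_new = FNR
      n = split(old[FNR], a, ",")          # cells of the matching old row
      cols = (NF > n) ? NF : n
      for (c = 1; c <= cols; c++) {
        if ((a[c] "") != ($c "")) {        # appending "" forces string compare
          printf "row %d col %d: %s -> %s\n", FNR, c, a[c], $c
        }
      }
    }
    END {
      for (r = n_new + 1; r <= n_old; r++) printf "row %d: only in old file\n", r
    }
  ' "$1" "$2"
}
```

Empty output means the two tables match cell for cell. Tables with quoted fields would need a real CSV parser instead.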
Categorize drift:
- Numerical drift in .csv tables → FAIL (the analysis is non-reproducible; investigate seed, sample order, package versions)
- Visual drift in .pdf/.png figures → typically WARN (could be font rendering, scheme, or an actual difference — open both and compare)
- Timestamp metadata only → PASS (cosmetic; many tools embed timestamps)
- No drift → PASS
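The extension-based part of these rules can be sketched as two small functions. The names `categorize_drift` and `overall_verdict` are hypothetical, and so is the extra `REVIEW` label for file types the rules above do not mention; treating it like WARN in the verdict is an assumption. Extensions alone cannot detect timestamp-only drift, so a person still has to downgrade a WARN to PASS in that case.

```shell
# Hypothetical helper: map one differing file to a category by extension.
categorize_drift() {
  case "$1" in
    *.csv)         echo FAIL ;;    # numerical drift in a table
    *.pdf | *.png) echo WARN ;;    # visual drift: open both and compare
    *)             echo REVIEW ;;  # not covered by the rules above (assumption)
  esac
}

# Worst category wins: any FAIL gives FAIL, anything else that differs gives
# WARN, and an empty list of differing files gives PASS.
overall_verdict() {
  local verdict=PASS f
  for f in "$@"; do
    case "$(categorize_drift "$f")" in
      FAIL) echo FAIL; return 0 ;;
      *)    verdict=WARN ;;
    esac
  done
  echo "$verdict"
}
```

The arguments would be the paths that `diff -r` flagged, for example `overall_verdict output/tables/main.csv output/figures/fig1.pdf`.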
Restore snapshot if drift acceptable (otherwise leave new outputs and investigate):
```
rm -rf output && mv /tmp/output_snapshot output
```
Report:
- Stages that ran + timings
- Files that differ + diff category
- Verdict: PASS / WARN / FAIL
- If FAIL: top suspects (unseeded or re-seeded randomness, package version drift, undeclared input)

Examples

/check-reproducibility → Runs the full check on the current working tree.
/check-reproducibility after upgrading Stata to a new version → Reveals any version-sensitive results.

Troubleshooting

Pipeline fails on re-run — most common cause: a do-file references a file that exists locally but was never committed. Add it to git or document in data/README.md.
Numerical drift on bootstrap-based SE — bootstrap is reproducible only if set seed is at the top, ONCE, and bootstrap itself doesn't reseed internally. Check the do-file.
Different cluster count from reghdfe singleton drop — singleton drop depends on the order observations were merged in; if merge order is not deterministic, this can drift. Add an explicit sort before estimation.
Working tree dirty — commit or stash before running this skill.
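For the bootstrap item, a quick audit of how often each do-file sets the seed can narrow the search. The sketch below is a rough heuristic, not a Stata parser: `seed_audit` is a hypothetical name, it only counts lines that begin with `set seed`, and it does not see seeds passed through options or set in a calling do-file.

```shell
# Hypothetical helper: print "<count> <file>" for every do-file under $1,
# where <count> is the number of lines starting with `set seed`.
seed_audit() {
  local dir="${1:-.}" f n
  find "$dir" -type f -name '*.do' | sort | while IFS= read -r f; do
    # grep -c exits non-zero on a zero count; `|| true` keeps that harmless.
    n="$(grep -c -E '^[[:space:]]*set[[:space:]]+seed[[:space:]]' "$f" || true)"
    echo "$n $f"
  done
}
```

Counts above one, or a count of zero in a do-file that calls bootstrap, are the places to look first.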

Notes

This skill is destructive: the clean step (step 3 under Steps) wipes every untracked and ignored file except data/raw/ and .claude/state. Triple-check that the clean succeeded with data/raw/ intact.
Long-running: full pipeline + diff. Run when you have time, not as a quick check.
If the pipeline takes hours, consider running stage-by-stage diffs instead (compare output/tables/<stage> between runs).
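A stage-by-stage diff could look like the sketch below. It assumes the `output/tables/<stage>` layout mentioned above and a snapshot taken to `/tmp/output_snapshot`; `diff_stage` is a hypothetical helper, and the stage name in the usage note is a placeholder.

```shell
# Hypothetical helper: diff one stage's tables between a snapshot and the
# current outputs. Returns diff's exit status: 0 = identical, 1 = drift.
# Usage: diff_stage STAGE [SNAPSHOT_DIR] [CURRENT_DIR]
diff_stage() {
  local stage="$1"
  local snapshot="${2:-/tmp/output_snapshot}"
  local current="${3:-output}"
  diff -r "$snapshot/tables/$stage" "$current/tables/$stage"
}
```

After re-running a single stage, `diff_stage main && echo "main: no drift"` checks only that stage's tables instead of waiting for the full pipeline.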

related-skills.json

same repository

build-tables.md

from "maxwell2732/codex-stata-for-economists"

Combine saved Stata estimates into publication-ready tables via esttab. Produces both .tex (for paper) and .csv (for audit) with consistent formatting.

2026-04-29 · 79 stars

data-analysis.md

from "maxwell2732/codex-stata-for-economists"

End-to-end Stata analysis workflow — load, explore, clean, estimate, and produce publication-ready tables and figures with full logging.

2026-04-29 · 79 stars

pedagogy-review.md

from "maxwell2732/codex-stata-for-economists"

Run a holistic narrative review on a Quarto report or Markdown document. Checks reader prerequisites, worked examples, notation clarity, structural arc, and pacing.

2026-04-29 · 79 stars

render-report.md

from "maxwell2732/codex-stata-for-economists"

Render a Quarto report (Stata engine) to HTML / PDF / DOCX. Performs freshness check on included tables/figures, verifies the Stata Quarto engine, and validates numerical claims before rendering.

2026-04-29 · 79 stars

replicate.md

from "maxwell2732/codex-stata-for-economists"

Apply the replication protocol to a paper. Inventory the replication package, record gold-standard targets with tolerances, translate the analysis to this project's Stata pipeline, and report a tolerance-by-tolerance comparison.

2026-04-29 · 79 stars

review-stata.md

from "maxwell2732/codex-stata-for-economists"

Run the stata-reviewer agent on a do-file. Produces a structured code-review report covering reproducibility, logging, naming, magic numbers, table/figure quality, and conformance to project conventions.

2026-04-29 · 79 stars

package.json

"author": "maxwell2732"

"repository": "maxwell2732/codex-stata-for-economists"
