data-analysis
// End-to-end Stata analysis workflow — load, explore, clean, estimate, and produce publication-ready tables and figures with full logging.
| name | data-analysis |
| description | End-to-end Stata analysis workflow — load, explore, clean, estimate, and produce publication-ready tables and figures with full logging. |
| disable-model-invocation | true |
| argument-hint | [dataset path or analysis goal] |
| allowed-tools | ["Read","Grep","Glob","Write","Edit","Bash","Task"] |
Run a complete Stata analysis: load → EDA → clean → estimate → publication output. Produces a self-contained do-file (or a small set of stage do-files) with full logs and all artifacts in output/.
Input: $ARGUMENTS — a dataset path (e.g., data/raw/cps_2010_2020.dta) or an analysis goal (e.g., "regress wages on education with state and year FE using CPS data").
- .claude/rules/stata-coding-conventions.md: version pin, log, set seed, naming, magic numbers, paths
- .claude/rules/econometric-best-practices.md: clustering, FE, weights, IV diagnostics
- dofiles/<stage>/ with <stagenum>_<verb>_<noun>.do naming
- output/tables/ (.tex + .csv) and output/figures/ (.pdf + .png)
- bash scripts/run_stata.sh: never call Stata directly without the wrapper
- stata-reviewer agent on each new do-file before presenting

Stage (dofiles/01_clean/):
- See .claude/rules/stata-coding-conventions.md for current standards
- Do-file preamble: version 17, clear all, set more off, set varabbrev off, capture log close + log using, set seed YYYYMMDD if needed
- use "data/raw/<file>.dta", clear, then confirm the load succeeded with describe and count
- Save to data/derived/clean_<name>.dta

In a separate exploration do-file (under explorations/<name>/dofiles/):
- summarize: distributions, missingness
- tabulate: categorical breakdowns
- corr: correlation matrix for key continuous vars
- xtdescribe if panel

EDA logs go to explorations/<name>/logs/. EDA artifacts are NOT committed to output/.
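The exploration steps above can be sketched as a short do-file. This is a minimal illustration, assuming Stata's bundled auto dataset as a stand-in for the project data; the variable names come from auto.dta, `misstable summarize` is an addition not named in the list, and `panel_id` is a hypothetical identifier.

```stata
* Minimal EDA sketch on the bundled auto dataset (stand-in for project data)
version 17
clear all
sysuse auto, clear

* Confirm the load: structure and row count
describe, short
count

* Distributions and missingness
summarize price mpg weight, detail
misstable summarize

* Categorical breakdowns, keeping missing values visible
tabulate foreign rep78, missing

* Correlation matrix for key continuous variables
correlate price mpg weight

* Panel data only: xtdescribe needs xtset first (panel_id is hypothetical)
* xtset panel_id year
* xtdescribe
```

In a real exploration the same commands would run inside a log opened under explorations/<name>/logs/.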
Stage (dofiles/02_construct/):
- See econometric-best-practices
- Save to data/derived/sample_<name>.dta

Stage (dofiles/03_analysis/):
- For fixed effects: reghdfe; for IV: ivreg2 + ranktest; for DiD with timing variation: prefer heterogeneity-robust estimators
- est store m_<name> after every estimation

Stage (dofiles/04_output/):

Tables via esttab to BOTH .tex and .csv:
esttab m_main m_alt_cluster using "output/tables/<name>.tex", replace ///
    se star(* 0.10 ** 0.05 *** 0.01) booktabs label ///
    stats(N r2_within mean_dep) addnotes("Cluster: <level>")
esttab m_main m_alt_cluster using "output/tables/<name>.csv", replace ///
    se star(* 0.10 ** 0.05 *** 0.01) plain stats(N r2_within mean_dep)
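The stats() option can only print scalars that are present in e(), and estimation commands do not store a mean_dep scalar by default. The sketch below shows one way to add it before storing the model. It assumes the estout package is installed from SSC, uses the bundled auto dataset with regress in place of the project's reghdfe models, and writes to a temporary path; the model name m_sketch is hypothetical.

```stata
* Sketch: define mean_dep so that stats() has something to print
sysuse auto, clear
regress price weight mpg, vce(robust)

* Mean of the dependent variable over the estimation sample
summarize price if e(sample), meanonly
estadd scalar mean_dep = r(mean)

* Store only after estadd so the stored copy carries the new scalar
estimates store m_sketch

* Plain CSV export to a temporary path (stand-in for output/tables/)
tempfile csv_out
esttab m_sketch using "`csv_out'.csv", replace ///
    se plain stats(N r2 mean_dep)
```

If a statistic named in stats() is missing from e(), esttab leaves the cell empty rather than raising an error, so it is worth checking the exported table once by eye.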
Figures via Stata graph + graph export:
set scheme s2color
twoway ...
graph export "output/figures/<name>.pdf", replace
graph export "output/figures/<name>.png", replace width(1600)
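As an illustration of the pattern above, here is a minimal sketch with one concrete twoway specification. It assumes the bundled auto dataset and a temporary output path instead of output/figures/, and it assumes a Stata build whose graph export supports both PDF and PNG; the titles and legend labels are illustrative.

```stata
* Sketch: one concrete twoway graph exported in both formats
sysuse auto, clear
set scheme s2color

* Temporary base path (stand-in for output/figures/<name>)
tempfile fig_base

twoway (scatter price weight) (lfit price weight), ///
    ytitle("Price (USD)") xtitle("Weight (lbs)") ///
    legend(order(1 "Observed" 2 "Linear fit"))

* Vector copy for the paper, raster copy for quick viewing
graph export "`fig_base'.pdf", replace
graph export "`fig_base'.png", replace width(1600)
```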
- /run-stata
- /validate-log
- stata-reviewer agent (/review-stata)

*------------------------------------------------------------------------------
* File:     dofiles/03_analysis/main_regression.do
* Project:  [Project name]
* Author:   [Name]
* Purpose:  Estimate the main DiD specification on the analysis sample
* Inputs:   data/derived/sample_main.dta
* Outputs:  output/tables/main_regression.tex
*           output/tables/main_regression.csv
*           output/figures/event_study.pdf
* Log:      logs/03_analysis_main_regression.log
*------------------------------------------------------------------------------
version 17
clear all
set more off
set varabbrev off
capture log close
log using "logs/03_analysis_main_regression.log", replace text
set seed 20260428
*--- 1. Load sample -----------------------------------------------------------
use "data/derived/sample_main.dta", clear
display "Sample N: " _N
*--- 2. Main spec -------------------------------------------------------------
local controls "age educ exper"
reghdfe log_wage treated##post `controls', absorb(state_id year) cluster(state_id)
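* stats(mean_dep) in the export needs e(mean_dep); estadd ysumm would store e(ymean) instead
summarize log_wage if e(sample), meanonly
estadd scalar mean_dep = r(mean)
estimates store m_main
*--- 3. Robustness ------------------------------------------------------------
reghdfe log_wage treated##post `controls', absorb(state_id year) cluster(state_id year)
summarize log_wage if e(sample), meanonly
estadd scalar mean_dep = r(mean)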
estimates store m_twoway
*--- 4. Export ----------------------------------------------------------------
esttab m_main m_twoway using "output/tables/main_regression.tex", replace ///
    se star(* 0.10 ** 0.05 *** 0.01) booktabs label ///
    stats(N r2_within mean_dep) ///
    addnotes("Standard errors clustered by state (col 1) and two-way by state and year (col 2).")
esttab m_main m_twoway using "output/tables/main_regression.csv", replace ///
    se star(* 0.10 ** 0.05 *** 0.01) plain ///
    stats(N r2_within mean_dep)
log close
- Local macros with comments.
- /validate-log to catch silent failures.
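One reading of the "local macros with comments" note, in line with the magic-numbers convention mentioned earlier, is to name each threshold once and document it where it is defined. The sketch below assumes the bundled auto dataset; both cutoffs and the macro names min_mpg and max_price are hypothetical.

```stata
* Sketch: name each threshold once, with a comment, instead of hard-coding it
sysuse auto, clear

* Minimum fuel economy kept in the sample (hypothetical cutoff)
local min_mpg 15
* Price cap in USD (hypothetical cutoff)
local max_price 10000

keep if mpg >= `min_mpg' & price <= `max_price'
display "Analysis sample N: " _N

* Stop with an error if a restriction did not apply, rather than continuing
assert mpg >= `min_mpg' & price <= `max_price'
```

The closing assert is a cheap in-file guard; it stops the do-file with a nonzero return code, which leaves a visible error in the log.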