| name | Full-empirical-analysis-skill-R |
| description | Classical end-to-end empirical analysis workflow in the modern tidyverse + econometrics R ecosystem — dplyr + tidyr + haven + fixest + sandwich + lmtest + clubSandwich + AER + ivreg + did + bacondecomp + HonestDiD + eventstudyr + rdrobust + rddensity + Synth + gsynth + synthdid + MatchIt + WeightIt + cobalt + ebal + grf + DoubleML + mediation + marginaleffects + modelsummary + kableExtra + gt + ggplot2 + ggpubr + cowplot + binsreg. **Defaults to economics empirical-paper style** (AER / QJE / AEJ) — every run produces a publication-ready output set with a multi-column regression table (M1→M6 progressive controls/FE) as the centerpiece, plus Table 1 (descriptives), mechanism / heterogeneity / robustness tables, and event-study + coefficient + trend figures. Covers the full 8-step R pipeline an applied economist runs on every paper — (1) data import & cleaning (read_dta/read_csv, naniar, janitor, validate-merges), (2) variable construction (mutate/across/winsorize/group_by + lag/lead with dplyr), (3) descriptive statistics & Table 1 (gtsummary, modelsummary::datasummary, tableone), (4) classical diagnostic tests (shapiro/jarque.bera.test/bptest/dwtest/bgtest/vif/adf.test/kpss.test/Hausman), (5) baseline modeling (fixest::feols, ivreg, did::att_gt, eventstudyr, sun_ab, did_imputation, synthdid, rdrobust, MatchIt, WeightIt, grf::causal_forest, DoubleML, mediation), (6) robustness battery (modelsummary stack, clubSandwich CRSE, fwildclusterboot, ri2, robomit Oster, bacondecomp, HonestDiD), (7) further analysis (interactions + marginaleffects, mediation::mediate, gsem via lavaan, dose-response splines, grf CATE), (8) publication-ready tables & figures (modelsummary, kableExtra, gt, stargazer, texreg, flextable to LaTeX/Word/HTML; ggplot2 + ggpubr + cowplot + binsreg + iplot for figures). **Also covers two parallel domain modes that share the same 8-step scaffolding** — **Mode A — Epidemiology / public health** (target-trial emulation, IPTW + g-formula + TMLE doubly-robust triplet via `WeightIt` / `gfoRmula` / `tmle` / `ltmle`, Mendelian randomization via `MendelianRandomization` / `TwoSampleMR` / `MRPRESSO`, KM / Cox / AFT / RMST survival via `survival` / `survminer` / `flexsurv`, E-value sensitivity via `EValue`, principal stratification — STROBE / TRIPOD reporting), and **Mode B — ML causal inference** (DML via `DoubleML`, S/T/X/R/DR meta-learners via `causalweight` / `grf`, causal forest via `grf::causal_forest`, BART/BCF via `bartCause` / `bcf`, matrix completion via `MCPanel`, CATE distribution + policy tree via `policytree`, off-policy evaluation, conformal causal via `conformalInference` / `cfcausal`, fairness audit via `fairmodels`, DAG learning via `pcalg` / `bnlearn` / LLM-assisted). Use when the user asks for a complete R empirical analysis, wants a tidyverse-style reproducible R script / Quarto workflow, prefers fixest over reghdfe, needs the R counterpart to StatsPAI / 00.1 / 00.2, or names a specific R step in isolation ("feols with cluster", "MatchIt nearest neighbor", "bacondecomp in R", "gtsummary table 1", "modelsummary to Word"). Mode A triggers on "target trial emulation R", "tmle ltmle", "MendelianRandomization", "TwoSampleMR", "MRPRESSO", "survival cox AFT", "STROBE R", "EValue R", "公共健康 R", "流行病学 R". Mode B triggers on "DoubleML R", "grf causal forest", "policytree", "bartCause bcf", "conformal causal R", "fairmodels", "pcalg NOTEARS", "因果机器学习 R". |
| triggers | ["R empirical analysis","tidyverse econometrics workflow","reproducible R script","Quarto empirical pipeline","fixest feols feglm fepois","high-dimensional fixed effects R","clubSandwich cluster-robust","fwildclusterboot wild cluster bootstrap","ivreg AER 2SLS R","did att_gt Callaway SantAnna R","eventstudyr event study R","did_imputation Borusyak R","synthdid R package","bacondecomp R Goodman Bacon","HonestDiD R Rambachan Roth","rdrobust R","rddensity R","Synth gsynth R","MatchIt nearest neighbor R","WeightIt IPW propensity R","cobalt balance check R","ebal entropy balancing R","grf causal forest R","DoubleML R","mediation R Imai","marginaleffects R","gtsummary table 1","modelsummary publication table","kableExtra LaTeX","texreg stargazer","flextable Word","ggplot2 coefplot","iplot fixest","binsreg R","haven read_dta sav","janitor clean_names","naniar missing","epidemiology pipeline R","public health causal inference R","target trial emulation R","g-formula R gfoRmula","IPTW marginal structural model R","WeightIt PSweight","tmle ltmle doubly robust","HAL-TMLE R","Mendelian randomization R","MendelianRandomization package","TwoSampleMR","MRPRESSO","MR-Egger weighted median R","STROBE TRIPOD reporting R","EValue sensitivity R","Kaplan-Meier AFT survival R","survival survminer flexsurv","流行病学 R","公共健康 R","ML causal inference R","DoubleML R","grf causal forest R","meta-learner S T X R DR R","causalweight R","bartCause bcf","Bayesian causal forest BCF R","CATE distribution R","policytree R","off-policy evaluation R","conformalInference cfcausal","conformal causal prediction R","fairmodels fairness audit","causal discovery PC NOTEARS R","pcalg bnlearn","因果机器学习 R"] |
Full Empirical Analysis — Classical R Workflow
This skill is the canonical 8-step pipeline an applied economist runs on every empirical paper, written in the modern tidyverse + econometrics R ecosystem — dplyr/tidyr/haven for data, fixest as the panel/IV/DID workhorse, did/bacondecomp/HonestDiD for modern DID, rdrobust/rddensity for RD, Synth/gsynth/synthdid for synthetic control, MatchIt/WeightIt/cobalt/ebal for matching, grf/DoubleML for ML causal, mediation for causal mediation, marginaleffects for post-estimation, modelsummary/kableExtra/gt for publication tables, ggplot2/iplot/binsreg for figures.
Companion skills: this is the R sibling of 00-StatsPAI_skill (Python DSL), 00.1-Full-empirical-analysis-skill (explicit Python), and 00.2-Full-empirical-analysis-skill_Stata (Stata .do). All four implement the same 8 steps, in their respective ecosystems.
Philosophy
- Tidyverse + fixest, the modern R idioms.
feols(... | unit + year, cluster = ~unit), not Frankenstein-y lm(y ~ x + factor(unit) + factor(year)).
- Reproducible scripts / Quarto. Every example below is paste-runnable.
renv for package locking; Quarto (.qmd) for combined narrative + code + tables/figures.
- 8 steps, first-class. R users historically over-invest in Step 5; this skill treats Steps 1–4 and 6–8 as core.
- Rich outputs. Every step yields at least one table or figure — tex/docx/png/pdf.
- Progressive disclosure.
SKILL.md gives the canonical call per step; references/ holds variant-specific depth.
Three domain modes (default = AER econ; alternates = epi & ML-causal)
The default playbook above is AER-style applied econometrics — the AEA convention: written-out estimating equation, identifying assumption, design horse-race, full robustness gauntlet. The skill also ships two parallel sub-pipelines for the other two big causal-inference traditions, each reusing the same Steps 1–4 (cleaning / construction / Table 1 / diagnostics) and Step 8 (tables/figures) — only Step 5 (estimator) and Step 6/7 swap packages:
| Mode | Reader convention | Step-5 estimator stack | Reporting stack | Jump to |
|---|
| Default — Applied Econ (AER / QJE / AEJ) | "Show the equation + identifying assumption + design horse-race; controls visible; clustered SE" | DID / IV / RD / SCM / matching / fixest::feols HDFE | AER house-style multi-column modelsummary + kableExtra / gt / flextable + 8-section paper layout | Steps 1 → 8 (entire playbook below) |
| Mode A — Epidemiology / Public Health | "STROBE / TRIPOD-AI; target trial protocol; doubly-robust estimand; absolute & relative risk; KM survival" | Target-trial emulation · IPTW (WeightIt / PSweight) · g-formula (gfoRmula) · TMLE (tmle / ltmle) · Mendelian randomization (MendelianRandomization / TwoSampleMR / MRPRESSO) · KM / Cox / AFT (survival / survminer / flexsurv) | Same modelsummary + risk-difference / hazard-ratio / E-value rows | §A. Epidemiology pipeline |
| Mode B — ML Causal Inference | "DML / meta-learners / causal forest / DR-learner; CATE distribution; policy value" | DML (DoubleML) · S/T/X/R/DR-Learner (causalweight / grf) · GRF causal forest (grf::causal_forest) · BART/BCF (bartCause / bcf) · matrix completion (MCPanel) | modelsummary ML horse-race + grf CATE plot + policy-value table + conformalInference PI | §B. ML causal pipeline |
How to invoke a non-default mode (Claude / agent picks this up from the user's wording):
| User says... | Mode the skill switches to |
|---|
| "Run a DID / IV / RD / event study", "AER table", "applied micro" | Default (AER econ) — Steps 1 → 8 |
| "Target trial emulation", "g-formula", "IPTW", "TMLE", "Mendelian randomization", "STROBE / TRIPOD", "公共健康 / 流行病学", "epi pipeline", "RWE study", "cohort study", "case-control" | Mode A (Epi) — §A |
| "DML", "double machine learning", "causal forest", "meta-learner", "CATE", "BCF", "policytree", "policy learning", "conformal causal", "fairness audit", "ML causal", "uplift modeling", "因果机器学习" | Mode B (ML causal) — §B |
| "Mix" (e.g. "estimate DID + then ML CATE on the heterogeneity") | Default + Mode B in sequence — every estimator yields a coefficient + SE pair, drop them all into one modelsummary(...) for the horse-race column |
The three modes share the same Step 1–4 cleaning / Table 1 / diagnostics scaffolding, the same Step 8 export stack, and the same DAG-first identification logic — switching modes only changes which Step-5 estimator family you reach for, not the surrounding paper structure. If you only want descriptive stats / Table 1 / a balance check, the AER gtsummary::tbl_summary / modelsummary::datasummary_balance calls in Step 3 work identically across all three modes.
Default Output Spec — Economics Empirical Paper
This skill defaults to the applied-economics paper convention. Unless the user explicitly asks for a single point estimate, every run produces the full publication-ready output set below. Treat it as the contract of Step 8 — mandatory, not opt-in.
Required tables (always produced)
| # | Table | R source | Saves to |
|---|
| T1 | Summary statistics & balance (treated vs control, with SMD / p-values) | gtsummary::tbl_summary + add_p + add_difference (Step 3) | tables/table1_balance.xlsx + .docx + .tex |
| T2 ★ | Main results — multi-column regression M1→M6 (progressive controls + FE) | fixest::feols × 6 specs → modelsummary (Step 5–6) | tables/table2_main.xlsx + .docx + .tex |
| T3 | Mechanism / outcome ladder — same treatment, 3+ outcomes side-by-side | loop feols over y ∈ {Y1, Y2, Y3, Y_main} → modelsummary (Step 7) | tables/table3_mechanism.xlsx + .docx + .tex |
| T4 | Heterogeneity — subgroup × main coef (gender, age, region, …) | subgroup feols × linearHypothesis → modelsummary (Step 7) | tables/table4_heterogeneity.xlsx + .docx + .tex |
| T5 | Robustness battery — alt SE / cluster / sample / placebo, in one table | feols × variants → modelsummary (Step 6) | tables/table5_robustness.xlsx + .docx + .tex |
★ Table 2 is the centerpiece of every economics paper. It is the multi-column regression table that walks the reader from raw correlation (M1) to the fully-specified design (M6: 2-way FE + interacted FE + cluster-robust SE). Do not collapse it into a single column. Do not report only the headline coefficient. The progression is the credibility argument: if M1→M6 is monotone and stable, the design is plausibly identifying; if it collapses on adding FE, that is the result.
Canonical 6 columns, in order:
- M1 raw bivariate (
feols(y ~ treat, data))
- M2 + demographics (
+ age + edu)
- M3 + sector controls (
+ tenure / firm_size)
- M4 + unit FE (
| worker_id)
- M5 + 2-way FE (
| worker_id + year)
- M6 + interacted FE (
| worker_id + year + industry^year) with cluster = ~ worker_id
Required figures (always produced)
| # | Figure | R source | Saves to |
|---|
| F1 | Trend / motivation — treated vs control over time, with policy line | dplyr group means → ggplot + geom_line (Step 3) | figures/fig1_trend.png (300 dpi, 必须导出 PNG) + .pdf |
| F2 | Event-study coefficients with 95% CI, base period at –1 | fixest::sunab() / did::ggdid / iplot (Step 5) | figures/fig2_event_study.png (300 dpi, 必须导出 PNG) + .pdf |
| F3 | Coefficient plot across specs M1→M6 | modelsummary::modelplot() (Step 8) | figures/fig3_coefplot.png (300 dpi, 必须导出 PNG) + .pdf |
| F4 | Robustness / sensitivity — bacondecomp::bacon plot, HonestDiD::createSensitivityPlot, or spec curve | scenario-specific (Step 6) | figures/fig4_sensitivity.png (300 dpi, 必须导出 PNG) + .pdf |
Output file layout (default)
project/
├── tables/ table1_balance.xlsx/.docx/.tex table2_main.xlsx/.docx/.tex
│ table3_mechanism.xlsx/.docx/.tex table4_heterogeneity.xlsx/.docx/.tex
│ table5_robustness.xlsx/.docx/.tex
└── figures/ fig1_trend.png(300dpi)+.pdf fig2_event_study.png(300dpi)+.pdf
fig3_coefplot.png(300dpi)+.pdf fig4_sensitivity.png(300dpi)+.pdf
关键输出规则(必须遵守):
- 图片格式:所有图片必须同时导出 PNG 格式(≥300 dpi) 和 PDF 格式(用于 LaTeX 排版)
- 表格格式:所有回归表格必须同时导出 Excel(.xlsx)、Word(.docx) 和 LaTeX(.tex) 三种格式
- PNG 用于幻灯片、Markdown 文档、邮件等场景;PDF 用于学术论文排版
When to deviate
- Single quick estimate — produce only the relevant cell, but warn that the standard deliverable is the full set above and offer to run it.
- Design does not support a figure (cross-section → no event study) — skip with a printed
message() explaining why; do not silently drop.
- N=1 treated unit (
Synth / synthdid) — replace F1/F2 with the SCM trajectory + placebo distribution; T1–T5 still apply.
Required packages
install.packages(c(
"tidyverse", "haven", "readxl", "data.table", "janitor",
"naniar", "VIM", "mice", "validate",
"gtsummary", "tableone", "modelsummary", "kableExtra", "gt",
"stargazer", "texreg", "flextable", "psych", "summarytools",
"lmtest", "sandwich", "car", "tseries", "urca", "plm",
"clubSandwich", "fwildclusterboot",
"fixest",
"AER",
"ivreg",
"did",
"didimputation",
"fixest",
"synthdid",
"bacondecomp", "HonestDiD",
"DIDmultiplegtDYN",
"rdrobust", "rddensity", "rdmulti",
"Synth", "gsynth", "tidysynth",
"MatchIt", "WeightIt", "cobalt", "ebal",
"grf", "DoubleML",
"mediation", "lavaan",
"robomit",
"ri2", "ritools",
"multcomp",
"marginaleffects",
"ggplot2", "ggpubr", "cowplot", "patchwork",
"binsreg",
"ggdist", "ggrepel"
))
The 8 Steps — Canonical Pipeline (mapped to AER paper sections)
┌──────────────────────────────────────────────────────────────────────┐
│ Step −1 Pre-Analysis Plan (PAP) pwr / WebPower / DeclareDesign │
│ Step 0 Sample log + data contract sample_log/stopifnot/jsonlite │
│ Step 1 Data import & cleaning read_csv/read_dta/janitor/naniar/mice│
│ Step 2 Variable construction mutate/across/winsorize/lag/group_by │
│ Step 2.5 Empirical strategy equation × ID assumption + pre-reg │
│ Step 3 Descriptive statistics gtsummary/datasummary_balance/cor_pmat│
│ Step 3.5 Identification graphics iplot/binsreg/rdplot/cobalt/Synth │
│ Step 4 Diagnostic tests shapiro/bptest/dwtest/vif/adf/kpss │
│ Step 5 Baseline modeling feols/ivreg/att_gt/synthdid/MatchIt │
│ Step 6 Robustness battery bacondecomp/HonestDiD/fwildclusterboot│
│ Step 7 Further analysis marginaleffects/mediation/grf │
│ Step 8 Tables & figures modelsummary/iplot/ggplot2/cowplot │
└──────────────────────────────────────────────────────────────────────┘
The 8 steps mirror the canonical sections of an applied AER / QJE / AEJ paper. Each step is one paper section and emits a paper-ready artifact on disk:
Paper section Step R moves
─────────────────────────── ───── ────────────────────────────────────────────────
Pre-Analysis Plan −1 pwr / WebPower / DeclareDesign + freeze pap.json
§1. Data 0 sample_log + 5-check stopifnot → JSON via jsonlite
§1. Data 1 haven::read_dta · janitor::clean_names · naniar/mice
§1. Data 2 mutate/across/Winsorize/lag/lead/diff · CPI deflate
§1.1 Descriptives (Table 1) 3 gtsummary::tbl_summary · datasummary_balance
§2. Empirical Strategy 2.5 write equation + ID assumption → strategy.md
§3. Identification graphics 3.5 fixest::iplot · binsreg · rdplot · cobalt::love.plot · Synth
§3.5 Diagnostics 4 bptest · dwtest · car::vif · urca::ur.df · phtest
§4. Main Results (Table 2) 5 fixest::feols progressive (m1...m6) · modelsummary
§5. Heterogeneity (Table 3) 7 feols(... + i(.):X) · marginaleffects::avg_slopes
§6. Mechanisms / Channels 7 mediation::mediate · lavaan · outcome ladder
§7. Robustness gauntlet 6 bacondecomp · HonestDiD · robomit · fwildclusterboot · ri2
§8. Replication package 8 modelsummary("...tex") · gt → docx · result.json
Below is the canonical call at each step. All examples share one running narrative — labor-econ panel where training (treatment) affects log_wage (outcome), with covariates age, edu, tenure, panel keys worker_id/firm_id/year. Variable names and parameter values are illustrative.
When a step has many variants (5 staggered-DID estimators; 4 hetero tests), SKILL.md shows the one you reach for first; deeper variants live in references/NN-<topic>.md.
Paper-ready figure & table inventory (what to produce by section)
A modern AER paper has 5–7 figures and 3–5 main tables + an appendix robustness table. Every step below leaves at least one numbered artifact on disk. Default file names assume parallel .tex / .docx / .xlsx exports (the agent should produce all three so co-authors can edit in Word, the build system can use LaTeX, and editors can edit raw numbers in Excel). 所有图片必须同时保存 PNG(≥300 dpi)和 PDF 两种格式。
| § | Artifact | R primitive | Filenames |
|---|
| §1 | Figure 1: raw trends / treatment rollout | df %>% group_by(year, treat) %>% summarise(mean(y)) %>% ggplot() | figures/fig1_trend.png(300dpi)+.pdf |
| §1 | Table 1: summary stats (full / treated / control + Δ + SMD) | gtsummary::tbl_summary · modelsummary::datasummary_balance | tables/table1_balance.xlsx/.docx/.tex |
| §3 | Figure 2: identification graphic (event-study / first-stage / McCrary / RD scatter / SCM trajectory) | fixest::iplot(es) · binsreg · rdrobust::rdplot · rddensity · Synth::path.plot | figures/fig2_event_study.png(300dpi)+.pdf |
| §4 | Table 2: main results — progressive controls M1→M6 | modelsummary(list("(1)"=m1,...,"(6)"=m6)) · fixest::etable | tables/table2_main.xlsx/.docx/.tex |
| §4 | Table 2-bis: design horse-race (OLS / IV / DID / DML) | modelsummary(list("OLS"=ols, "2SLS"=iv, "CS-DID"=cs, "DML"=dml)) | tables/table2b_designs.xlsx/.docx/.tex |
| §4 | Figure 3: coefficient plot across specs | modelplot(list(m1,...,m6), coef_map="training") | figures/fig3_coefplot.png(300dpi)+.pdf |
| §5 | Table 3: heterogeneity by subgroup | modelsummary(g_full, g_male, g_fem, g_q1, ..., g_q4) | tables/table3_heterogeneity.xlsx/.docx/.tex |
| §5 | Figure 4: dose-response / CATE | marginaleffects::plot_predictions · grf::plot.causal_forest | figures/fig4_cate.png(300dpi)+.pdf |
| §6 | Table 4: mechanism / outcome ladder | loop feols over outcomes → modelsummary | tables/table4_mechanism.xlsx/.docx/.tex |
| §7 | Table A1: robustness master (one column per check) | modelsummary(list(base, no99, balpan, dropearly, wfe, cl2way, logy, ihsy, psm, ebal)) | tables/tableA1_robustness.xlsx/.docx/.tex |
| §7 | Figure 5: spec curve | specr::specr() + plot_specs (or hand-rolled purrr::pmap) | figures/fig5_spec_curve.png(300dpi)+.pdf |
| §7 | Figure 6: sensitivity (HonestDiD / Oster / E-value) | HonestDiD::createSensitivityPlot · robomit::o_test · EValue | figures/fig6_sensitivity.png(300dpi)+.pdf |
| §8 | Replication bundle: all tables in one document | modelsummary(..., output="docx") · gt::gtsave() · Quarto / Rmd | replication/paper_tables.xlsx/.docx/.tex |
Every R estimator above (fixest::feols / AER::ivreg / did::att_gt / grf::causal_forest / synthdid_estimate) returns a result object that can be passed straight into modelsummary(...) / modelplot(...) / etable(...). Don't hand-roll LaTeX from kable(), and don't render Word via flextable directly — modelsummary, etable, and gtsummary apply book-tab borders, AER stars, and the right SE label automatically. For deeper export recipes, see references/08-tables-plots.md.
Export cookbook — LaTeX / Word / Excel in one block
关键规则(必须遵守):每个表格必须同时导出三种格式——Excel(.xlsx)、Word(.docx)、LaTeX(.tex)。每个图片必须同时保存PNG(≥300dpi)和PDF两种格式。
R has the best publication-table ecosystem of the three languages. Three tiers, picked by scope:
| Tier | Use when | API | Hot args |
|---|
| 1. Single multi-column table | Exporting one Table 2 / Table 3 / Table A1 with progressive columns | `modelsummary(list("(1)"=m1,...,"(N)"=mN), output="tables/tab.tex", stars=c(""=.1,""=.05,""=.01), gof_omit="BIC | AIC |
| 2. Multi-panel paper format (Tables 2 + 3 + A1 + A2 in one file) | Producing the paper-tables block — main + heterogeneity + robustness + placebo as a single document | modelsummary chained with gt::gt_group() for one document with section headers, OR Quarto .qmd rendering multiple modelsummary calls between prose | gt_group(modelsummary(...), modelsummary(...)) · quarto render paper.qmd |
3. Full session bundle (the Stata collect / Python Stargazer + pylatex equivalent) | Replication appendix that mixes summary stats + balance + multiple regression tables + headings + prose in one file | Quarto is the modern R-native answer. master.qmd interleaves prose + chunks that emit modelsummary / gtsummary / ggplot2 outputs; one quarto render produces .pdf / .docx / .html | YAML front matter sets format: [pdf, docx, html] for triple-target output |
Journal styling — pick the right stars and SE label. The AEA convention is c("*"=.1, "**"=.05, "***"=.01) and notes = "Cluster-robust standard errors in parentheses...". Define a wrapper once at the top of master.R:
aer_table <- function(models, output, headers = NULL, coef_map = NULL) {
base <- tools::file_path_sans_ext(output)
for (ext in c(".xlsx", ".docx", ".tex")) {
output_file <- paste0(base, ext)
fmt <- if (ext == ".xlsx") "html" else if (ext == ".docx") "docx" else "latex"
modelsummary(
models,
output = output_file,
stars = c("*" = 0.1, "**" = 0.05, "***" = 0.01),
gof_omit = "BIC|AIC|F|Log|Adj",
coef_map = coef_map,
notes = paste("Cluster-robust standard errors in parentheses.",
"* p<0.10, ** p<0.05, *** p<0.01."),
output_format = fmt
)
}
}
For the multi-panel .docx / .xlsx and Quarto cookbook (single-file paper-tables bundle), see references/08-tables-plots.md.
Step −1 — Pre-Analysis Plan (pre-data; AEA RCT Registry style)
Before touching the data, write down (a) the population, (b) the design, (c) the minimum detectable effect (MDE) under the planned sample size and α=0.05, β=0.20. Persist the result as pap.json so a referee can verify the design was powered before, not after, the data were seen.
library(pwr)
library(WebPower)
library(jsonlite)
pwr.t.test(d = 0.20, power = 0.80, sig.level = 0.05,
type = "two.sample", alternative = "two.sided")
pwr.t.test(n = 2000, power = 0.80, sig.level = 0.05,
type = "two.sample")$d
WebPower::wp.crt2arm(f = 0.20, J = NULL, n = 50, icc = 0.05, power = 0.80,
alpha = 0.05, alternative = "two.sided")
pap <- list(
population = "manufacturing workers, 2010–2020",
treatment = "training (binary, staggered adoption)",
outcome = "log_wage",
estimand = "ATT",
design = "staggered DID, Callaway-Sant'Anna",
alpha = 0.05,
power_target = 0.80,
mde_d = 0.20,
n_planned = 12000,
frozen_at = "2026-01-15T09:00:00Z",
git_sha = "<paste>"
)
write_json(pap, "artifacts/pap.json", pretty = TRUE, auto_unbox = TRUE)
For richer DAG-aware power analysis (write down the DAG, declare estimands, simulate the design), use DeclareDesign — it is the R-native equivalent of EGAP's pre-analysis flow.
Commit artifacts/pap.json in the repo before Step 1. AEA RCT Registry / OSF preregistration tools accept it as the analysis-plan exhibit.
Step 0 — Sample-construction log & 5-check data contract
An AER §1 Data section has three jobs: (a) describe sources, (b) document every sample restriction (the "footnote 4" sample log), (c) lock the panel structure.
0.1 Sample-construction log (footnote 4)
library(tidyverse); library(jsonlite)
sample_log <- tibble::tibble(step = character(), n = integer())
df_raw <- read_dta("raw/panel.dta") %>% janitor::clean_names()
sample_log <- sample_log %>% add_row(step = "0. raw", n = nrow(df_raw))
df1 <- df_raw %>% drop_na(wage)
sample_log <- sample_log %>% add_row(step = "1. drop missing wage", n = nrow(df1))
df2 <- df1 %>% filter(between(age, 18, 65))
sample_log <- sample_log %>% add_row(step = "2. drop age outside 18-65", n = nrow(df2))
df3 <- df2 %>% filter(industry %in% c("manuf", "construction", "transport"))
sample_log <- sample_log %>% add_row(step = "3. keep target industries", n = nrow(df3))
df <- df3
print(sample_log)
write_json(sample_log, "artifacts/sample_construction.json", pretty = TRUE)
Paste the printed tibble verbatim as footnote 4 of the paper.
0.2 Five-check data contract (go / no-go gate)
library(validate); library(assertr)
data_contract <- function(df, y, treatment, id = NULL, time = NULL, covariates = c()) {
keys <- c(y, treatment, id, time, covariates)
contract <- list(
n_obs = nrow(df),
dtypes = sapply(df[keys], function(x) class(x)[1]),
n_missing = sapply(df[keys], function(x) sum(is.na(x))),
n_dupes_on_keys = if (!is.null(id) && !is.null(time))
sum(duplicated(df[, c(id, time)])) else 0,
panel_balanced = NULL,
cohort_sizes = NULL
)
if (!is.null(id) && !is.null(time)) {
bal <- df %>% count(.data[[id]])
contract$panel_balanced <- all(bal$n == max(bal$n))
contract$n_dropped_by_balance <- sum(bal$n != max(bal$n))
if ("first_treat" %in% names(df)) {
contract$cohort_sizes <- df %>% distinct(.data[[id]], .keep_all = TRUE) %>%
count(first_treat) %>% deframe()
}
}
contract$y_range <- range(df[[y]], na.rm = TRUE)
contract$treatment_share <- mean(df[[treatment]], na.rm = TRUE)
miss_y <- is.na(df[[y]])
contract$mcar_hint <- "likely MCAR (listwise OK)"
if (any(miss_y) && any(!miss_y)) {
for (cov in covariates) {
if (is.numeric(df[[cov]])) {
p <- t.test(df[[cov]][miss_y], df[[cov]][!miss_y])$p.value
if (p < 0.05) {
contract$mcar_hint <- sprintf("NOT MCAR (y-miss differs on %s, p=%.3f) → use mice / IPW",
cov, p)
break
}
}
}
}
contract
}
contract <- data_contract(df, y = "wage", treatment = "training",
id = "worker_id", time = "year",
covariates = c("age", "edu", "tenure"))
stopifnot(contract$n_dupes_on_keys == 0)
stopifnot(all(contract$n_missing == 0))
write_json(contract, "artifacts/data_contract.json",
pretty = TRUE, auto_unbox = TRUE)
If any stopifnot fires, stop and fix it in dplyr first. R estimators silently drop NA rows downstream — this contract is the cheapest insurance against "why did N drop from 12,000 to 9,800 between Table 1 and Table 2?" referee questions.
Step 1 — Data import & cleaning
Deeper patterns: references/01-data-cleaning.md — every format (haven/readxl/data.table::fread/arrow::read_parquet/DBI), janitor::clean_names, naniar missingness viz, MCAR/MAR/MNAR triage with mice, validation with validate/assertr, panel structure checks.
library(tidyverse)
library(haven)
library(janitor)
library(naniar)
library(skimr)
df <- read_dta("raw/panel.dta") %>%
clean_names()
skim(df)
naniar::miss_var_summary(df)
naniar::vis_miss(df)
df <- df %>%
mutate(
year = as.integer(year),
wage = as.numeric(wage),
gender = as.factor(gender),
date = as.Date(date)
)
key_vars <- c("wage", "training", "worker_id", "year")
df <- df %>%
drop_na(all_of(key_vars))
cat("After dropping NA on keys:", nrow(df), "rows\n")
df <- df %>%
mutate(
tenure_missing = is.na(tenure),
tenure = if_else(is.na(tenure), median(tenure, na.rm = TRUE), tenure),
union = fct_explicit_na(as.factor(union), na_level = "unknown")
)
df <- df %>%
mutate(wage_z = scale(wage)[,1],
outlier_z4 = abs(wage_z) > 4)
cat("|z|>4 on wage:", sum(df$outlier_z4, na.rm = TRUE), "\n")
stopifnot(nrow(df %>% distinct(worker_id, year)) == nrow(df))
firm_chars <- read_dta("raw/firm_chars.dta")
n_before <- nrow(df)
df <- df %>%
left_join(firm_chars, by = "firm_id", relationship = "many-to-one")
stopifnot(nrow(df) == n_before)
df %>% count(year)
df %>% count(worker_id) %>% summary()
Key principle: dplyr + explicit stopifnot() assertions. No silent row drops downstream.
Step 2 — Variable construction & transformation
Deeper patterns: references/02-data-transformation.md — log/IHS/Box–Cox via MASS::boxcox, group winsorization with dplyr, scale() and bestNormalize, factor handling, lag/lead with dplyr::lag, panel timing.
library(DescTools)
df <- df %>%
mutate(
log_wage = log(pmax(wage, 1)),
ihs_assets = asinh(assets),
wage_w1 = DescTools::Winsorize(wage, probs = c(0.01, 0.99), na.rm = TRUE),
age_std = as.numeric(scale(age)),
age_sq = age^2,
trt_x_edu = training * edu
) %>%
group_by(industry, year) %>%
mutate(wage_w1_iy = DescTools::Winsorize(wage, probs = c(0.01, 0.99),
na.rm = TRUE)) %>%
ungroup() %>%
arrange(worker_id, year) %>%
group_by(worker_id) %>%
mutate(
log_wage_l1 = lag(log_wage, 1),
log_wage_f1 = lead(log_wage, 1),
d_log_wage = log_wage - lag(log_wage, 1),
wage_mean_i = mean(log_wage, na.rm = TRUE),
log_wage_dm = log_wage - wage_mean_i
) %>%
ungroup() %>%
group_by(worker_id) %>%
mutate(first_treat = ifelse(any(training == 1),
min(year[training == 1]), NA_real_)) %>%
ungroup() %>%
mutate(rel_time = year - first_treat,
never_treated = is.na(first_treat))
cpi <- read_csv("raw/cpi.csv")
df <- df %>%
left_join(cpi, by = "year") %>%
mutate(cpi_base = cpi[year == 2010][1],
wage_real = wage * cpi_base / cpi,
log_wage_real = log(pmax(wage_real, 1)))
Step 2.5 — Empirical strategy (write the equation + identifying assumption)
This is the heart of an AER paper. Before any code, write down the equation explicitly and state the identifying assumption. Vague identification language is the single most common reason a referee rejects an applied paper. Persist the strategy as strategy.md so it is a dated, version-controlled artifact — not a post-hoc rationalization written after seeing the coefficient.
Equation × identifying assumption × R estimator (decision table)
| Design | Estimating equation | Identifying assumption | R estimator |
|---|
| 2×2 DID | Y_it = α_i + λ_t + β·D_it + X'γ + ε_it | parallel trends conditional on X | `feols(y ~ i(treated, post, ref=0) |
| Event-study (CS / SA) | Y_it = α_i + λ_t + Σ_{e≠-1} β_e · 1{t-G_i = e} + ε_it | no anticipation + group-time PT | `feols(y ~ sunab(G, t) |
| 2SLS | Y_i = α + β·D_i + X'γ + ε_i; D_i = π·Z_i + X'δ + u_i | exclusion + relevance + monotonicity | `feols(y ~ X |
| Sharp RD | Y_i = α + β·1{X_i ≥ c} + f(X_i) + ε_i (local poly) | continuity of E[Y(0)|X] at c, no manipulation | rdrobust::rdrobust(y, x, c=0) (+ rddensity) |
| SCM | Ŷ_1t(0) = Σ_j ŵ_j Y_jt, τ_t = Y_1t − Ŷ_1t(0) for t≥T_0 | pre-period fit + interpolation validity | Synth::synth · gsynth::gsynth · synthdid::synthdid_estimate · tidysynth |
| Selection-on-observables (matching/IPW/DML) | Y_i = m(X_i) + β·D_i + ε_i (Robinson partialling-out) | unconfoundedness + overlap | MatchIt::matchit + lm · WeightIt · DoubleML::DoubleMLPLR · grf::causal_forest |
Design picker (when the user is unsure)
┌─ running var + cutoff ───────────────── RDD (rdrobust)
│
├─ exogenous instrument Z ─────────────── IV/2SLS (feols / AER::ivreg)
data + question ─┤
├─ pre/post × treat/control ─┬ 2 periods ── 2×2 DID (feols + i())
│ └ staggered ── CS / SA / BJS (att_gt / sunab / did_imputation)
│
├─ 1 treated unit + donor pool + long pre ── SCM (Synth / gsynth / synthdid)
│
├─ high-dim X, selection-on-observables ── ML causal (DoubleML / grf — see §B)
│
└─ none of the above ──────────────────── matching + sensitivity (MatchIt + EValue)
Pre-registration strategy.md template
strategy <- "\\
# Empirical Strategy (pre-registration)
**Frozen**: 2026-01-15 (Git SHA: <paste>)
**Population**: manufacturing workers, 2010–2020, balanced panel
**Treatment**: training (binary, staggered adoption)
**Outcome**: log_wage (CPI-deflated 2010 USD)
**Estimand**: ATT on the treated, dynamic horizon -4..+4
## Estimating equation (paste from §2.5 row that matches the design)
log_wage_it = α_i + λ_t + Σ_{e≠-1} β_e · 1{t - G_i = e} + ε_it
## Identifying assumption
1. No anticipation: E[Y_it(0) | t < G_i] = E[Y_it(0) | never-treated]
2. Group-time PT: Δ E[Y_it(0)] is the same across treatment cohorts
## Auto-flagged threats (must defend in §2)
- Selection of G_i on Y_i(0) → bacondecomp + HonestDiD sensitivity
- Spillover within firm → cluster at firm_id, also try firm_id × year
- Anticipation in pre-period → include lead in event study
## Fallback estimators (Step 6 robustness)
- Sun–Abraham via `feols(y ~ sunab(G, t) | i + t, data)`
- Borusyak-Jaravel-Spiess via `didimputation::did_imputation`
- Synthetic DID via `synthdid::synthdid_estimate`
"
writeLines(strategy, "artifacts/strategy.md")
Commit artifacts/strategy.md in the repo before running Step 5 / Step 6. The git log of this file is the analysis plan.
Step 3 — Descriptive statistics & Table 1
Deeper patterns: references/03-descriptive-stats.md — gtsummary::tbl_summary (the modern Table 1 standard), modelsummary::datasummary_balance with SMDs, tableone::CreateTableOne, correlation matrices with significance via corrplot / psych::corr.test, distribution plots via ggplot2.
library(gtsummary)
library(modelsummary)
df %>%
select(log_wage, age, edu, tenure, training) %>%
datasummary_skim()
df %>%
select(log_wage, age, edu, tenure, training) %>%
tbl_summary(
type = list(all_continuous() ~ "continuous2"),
statistic = all_continuous() ~ c("{N_nonmiss}", "{mean} ({sd})",
"{min} – {median} – {max}")
) %>%
bold_labels() %>%
as_kable_extra() %>%
kableExtra::save_kable("tables/table1_full.tex")
df %>%
select(log_wage, age, edu, tenure, female, training) %>%
tbl_summary(by = training, missing = "ifany") %>%
add_p() %>%
add_difference() %>%
add_n() %>%
modify_header(label = "**Variable**") %>%
bold_labels() %>%
as_gt() %>%
gt::gtsave("tables/table1_balance.html")
datasummary_balance(~ training,
data = df %>% select(training, age, edu, tenure, female),
output = "tables/table1_balance.tex")
library(corrplot); library(psych)
corr_obj <- corr.test(df %>% select(log_wage, age, edu, tenure, training),
method = "pearson")
corrplot(corr_obj$r, method = "color", type = "upper",
p.mat = corr_obj$p, sig.level = 0.05, insig = "blank",
addCoef.col = "black", number.cex = 0.7,
tl.col = "black", tl.srt = 45,
col = colorRampPalette(c("#B2182B","white","#2166AC"))(200))
library(ggplot2)
p1 <- ggplot(df, aes(log_wage, fill = factor(training))) +
geom_density(alpha = 0.5) +
scale_fill_manual(values = c("0" = "darkred", "1" = "navy"),
labels = c("Control", "Treated"), name = "") +
labs(x = "Log wage", y = "Density",
title = "Log-wage density by treatment") +
theme_classic()
p2 <- ggplot(df, aes(sample = log_wage)) +
stat_qq() + stat_qq_line() +
labs(title = "Normal Q-Q") + theme_classic()
cowplot::plot_grid(p1, p2, labels = "auto") %>%
ggsave("figures/distributions.pdf", plot = ., width = 10, height = 4)
df %>%
group_by(year, training) %>%
summarise(mean_log_wage = mean(log_wage, na.rm = TRUE), .groups = "drop") %>%
ggplot(aes(year, mean_log_wage, color = factor(training))) +
geom_line(linewidth = 1) + geom_point(size = 2) +
geom_vline(xintercept = policy_year, linetype = "dashed") +
scale_color_manual(values = c("0" = "darkred", "1" = "navy"),
labels = c("Control","Treated"), name = "") +
labs(x = "Year", y = "Mean log wage") + theme_classic()
ggsave("figures/trend_did.pdf", width = 7, height = 4)
Step 3.5 — Identification graphics (Section "Identification, graphical evidence")
AER convention: the identification figure precedes the regression table. The reader should see graphical evidence that PT holds / first stage is strong / RD jumps cleanly before you ask them to trust your point estimate.
3.5.1 Event-study figure + numerical pre-trends test (DID identification)
Pre-period coefficients ≈ 0 (with the −1 reference period normalized to zero) is the visual evidence for parallel trends. Pair the figure with a numerical pre-trends test so reviewers don't have to eyeball it.
library(fixest); library(ggplot2)
es <- feols(log_wage ~ sunab(first_treat, year) | worker_id + year,
data = df, cluster = ~ worker_id)
iplot(es,
xlab = "Years relative to treatment",
ylab = "Coefficient (ATT, 95% CI)",
main = "Figure 2a. Event-study coefficients (95% CI; ref. e = -1)")
ggsave("figures/fig2a_event_study.pdf", width = 7, height = 4)
ggsave("figures/fig2a_event_study.png", width = 7, height = 4, dpi = 300)
pre_idx <- grep("year::-", names(coef(es)))[!grepl("ref", names(coef(es)))]
W <- wald(es, names(coef(es))[pre_idx])
cat(sprintf("Pre-trends Wald χ² = %.2f, p = %.3f\n", W$stat, W$p))
library(bacondecomp)
bd <- bacon(log_wage ~ training, data = df,
id_var = "worker_id", time_var = "year")
ggplot(bd, aes(weight, estimate, color = type, shape = type)) +
geom_point(size = 2) +
labs(title = "Figure 2a-bis. Goodman-Bacon decomposition",
x = "Weight", y = "Estimate")
ggsave("figures/fig2a_bacon.pdf", width = 7, height = 4)
library(did)
cs <- att_gt(yname = "log_wage", tname = "year", idname = "worker_id",
gname = "first_treat", data = df,
control_group = "nevertreated", est_method = "dr",
clustervars = "firm_id")
ggdid(aggte(cs, type = "dynamic")) +
labs(title = "Figure 2a-ter. Dynamic ATT (Callaway-Sant'Anna)")
ggsave("figures/fig2a_csdid.pdf", width = 7, height = 4)
3.5.2 First-stage F-statistic + scatter (IV identification)
Rule of thumb: first-stage F ≥ 10 for OLS-style inference; F ≥ 23 for AR-equivalent inference (Stock–Yogo / Lee 2022). fixest::feols reports F automatically; AER::ivreg requires summary(..., diagnostics = TRUE).
iv <- feols(log_wage ~ age + edu | training ~ Z1 + Z2,
data = df, cluster = ~ firm_id)
summary(iv, stage = 1)
fitstat(iv, ~ ivf + ivwald + sargan + cd)
library(binsreg)
binsreg(y = df$training, x = df$Z1, w = df[, c("age","edu")],
nbins = 20, polyreg = 2, ci = c(3, 3))
ggsave("figures/fig2b_first_stage.pdf", width = 7, height = 4)
3.5.3 RD: McCrary density + canonical RD plot
The signature RD figure is rdplot (CCT-style binned scatter with local-polynomial fit on each side), paired with the McCrary manipulation test.
library(rdrobust); library(rddensity)
rdplot(y = df$outcome, x = df$running_var, c = 0,
p = 4, kernel = "triangular", binselect = "esmv",
title = "Figure 2c. RD plot")
ggsave("figures/fig2c_rdplot.pdf", width = 7, height = 4)
rdd <- rddensity(X = df$running_var, c = 0)
print(summary(rdd))
rdplotdensity(rdd, X = df$running_var,
title = "Figure 2c-bis. McCrary density (manipulation test)")
ggsave("figures/fig2c_mccrary.pdf", width = 7, height = 4)
3.5.4 Matching: love plot (standardized differences pre vs post)
library(MatchIt); library(cobalt)
m.out <- matchit(training ~ age + edu + tenure + firm_size,
data = df, method = "nearest", ratio = 1)
love.plot(m.out, threshold = 0.10,
var.order = "unadjusted", abs = TRUE,
title = "Figure 2d. Love plot — |SMD| pre vs post matching")
ggsave("figures/fig2d_loveplot.pdf", width = 7, height = 4)
3.5.5 SCM: synthetic-control trajectory + gap plot
For synthetic-control designs the canonical Figure 2 is the treated-vs-synthetic time series with treatment time annotated.
library(tidysynth)
sc <- df %>%
synthetic_control(outcome = log_wage, unit = unit_id, time = year,
i_unit = "treated_unit_name", i_time = 2015) %>%
generate_predictor(time_window = 2010:2014,
mean_age = mean(age, na.rm = TRUE),
mean_edu = mean(edu, na.rm = TRUE)) %>%
generate_weights() %>% generate_control()
plot_trends(sc); ggsave("figures/fig2e_synth_trajectory.pdf", width = 7, height = 4)
plot_differences(sc); ggsave("figures/fig2e_synth_gap.pdf", width = 7, height = 4)
library(synthdid)
sdid_setup <- panel.matrices(df, unit = "worker_id", time = "year",
outcome = "log_wage", treatment = "training")
sdid_fit <- synthdid_estimate(sdid_setup$Y, sdid_setup$N0, sdid_setup$T0)
plot(sdid_fit, control.name = "Synthetic DiD")
ggsave("figures/fig2e_sdid.pdf", width = 7, height = 4)
Identification-specific checks (PT for DID, weak-IV F, density for RD, common support for matching) are also auto-run inside the Step-5 estimators — don't duplicate the numerics here, but DO produce the figures: a referee scans the figures first.
Step 4 — Diagnostic statistical tests
Deeper patterns: references/04-statistical-tests.md — every classical test. lmtest/sandwich/car/tseries/urca/plm.
library(lmtest)
library(sandwich)
library(car)
library(tseries)
library(urca)
ols <- lm(log_wage ~ training + age + edu + tenure, data = df)
shapiro.test(sample(residuals(ols), min(5000, length(residuals(ols)))))
tseries::jarque.bera.test(residuals(ols))
bptest(ols)
bptest(ols, ~ I(fitted(ols)^2) + ., data = df)
dwtest(ols)
bgtest(ols, order = 4)
Box.test(residuals(ols), lag = 8, type = "Ljung-Box")
library(plm)
pdata <- pdata.frame(df, index = c("worker_id", "year"))
plm_fe <- plm(log_wage ~ training + age + edu, data = pdata, model = "within")
pbgtest(plm_fe)
pcdtest(plm_fe, test = "cd")
vif(ols)
kappa(model.matrix(ols), exact = TRUE)
adf.test(df$log_wage, k = 4)
kpss.test(df$log_wage, null = "Level")
plm_re <- plm(log_wage ~ training + age + edu, data = pdata, model = "random")
phtest(plm_fe, plm_re)
resettest(ols, power = 2:3, type = "fitted")
Decision table:
| Test | Null | Action if rejected |
|---|
shapiro.test / jarque.bera.test | residuals Normal | bootstrap CIs if N small |
bptest | homoskedastic | use HC3 via coeftest(ols, vcov = vcovHC(ols, "HC3")) or cluster |
dwtest / bgtest | no autocorr | HAC SEs (vcovHAC) or cluster by unit |
pbgtest (panel) | no panel autocorr | cluster by entity |
pcdtest | no CSD | Driscoll–Kraay (vcovDC) |
vif > 10 | — | drop / combine |
| ADF rejects + KPSS doesn't | stationary | levels |
| ADF doesn't reject | unit root | first-difference |
phtest | RE consistent | use FE |
Step 5 — Baseline empirical modeling (Section 4: Main Results)
Deeper patterns: references/05-modeling.md — every estimator. fixest is the workhorse.
This is the densest section of an applied paper. A modern AER §4 typically contains 2–3 multi-regression tables and one coefficient plot:
- Table 2 (main): progressive controls, 4–6 columns — Pattern A below
- Table 2-bis (design horse race): same coefficient under OLS / IV / DID / DML — Pattern B
- Table 2-ter (multi-outcome): same treatment, several outcomes side-by-side — Pattern C
- Figure 3 (coefplot): visual summary of β̂ and 95% CI across specs
Estimator routing (memorize this — getting it wrong silently produces nonsense):
- No FE / single low-card FE →
feols(y ~ X, data, cluster = ~i)
- High-dim FE →
feols(y ~ X | fe1 + fe2, data, cluster = ~i)
- Two-way cluster →
feols(..., cluster = ~ firm_id + year)
- 2SLS / IV →
feols(y ~ X | D ~ Z, data, cluster = ~ firm_id) (or AER::ivreg for diagnostics)
- DID / event-study →
feols(y ~ sunab(G, t) | i + t, data) (SA) · did::att_gt (CS) · didimputation::did_imputation (BJS)
Pick by identification strategy:
Cross-section, selection on observables → feols | MatchIt + lm | WeightIt
Panel + policy shock + parallel trends → feols / did::att_gt / sunab / didimputation / synthdid
Exogenous instrument → feols(... | endog ~ z) | AER::ivreg
Discontinuity → rdrobust + rddensity + rdmc
N=1 treated, long panel → Synth / gsynth / synthdid
Selection on observables + heterogeneity → WeightIt + cobalt; grf::causal_forest
Binary outcome → feglm or glm(family=binomial)
Count outcome → fepois
Canonical calls (the eight patterns A–H below are the AER table cookbook — modelsummary(...) and fixest::etable(...) are the two workhorses, equivalent to Stata outreg2/esttab and Python pf.etable/Stargazer).
5.A Pattern A — Progressive controls (the canonical Table 2)
Stable β̂ across columns ⇒ less concern that selection on observables is driving the estimate (Oster 2019 selection-stability logic; quantified in Step 6).
library(fixest); library(modelsummary)
m1 <- feols(log_wage ~ training, data = df, cluster = ~ firm_id)
m2 <- feols(log_wage ~ training + age + edu, data = df, cluster = ~ firm_id)
m3 <- feols(log_wage ~ training + age + edu + tenure + firm_size, data = df, cluster = ~ firm_id)
m4 <- feols(log_wage ~ training + age + edu + tenure + firm_size | industry + year,
data = df, cluster = ~ firm_id)
m5 <- feols(log_wage ~ training + age + edu + tenure + firm_size | worker_id + year,
data = df, cluster = ~ firm_id)
m6 <- feols(log_wage ~ training + age + edu + tenure + firm_size | worker_id + year + industry^year,
data = df, cluster = ~ firm_id)
modelsummary(
list("(1) Baseline" = m1,
"(2) +Demog" = m2,
"(3) +Labor-mkt" = m3,
"(4) Ind×Yr FE" = m4,
"(5) Worker FE" = m5,
"(6) Worker FE+Ind×Yr" = m6),
output = "tables/table2_main.tex",
stars = c("*" = 0.1, "**" = 0.05, "***" = 0.01),
gof_omit = "BIC|AIC|F|Log|Adj",
coef_map = c("training" = "Job training",
"age" = "Age", "edu" = "Education",
"tenure" = "Tenure", "firm_size" = "Firm size"),
notes = c("Cluster-robust SE in parentheses, clustered at firm_id.",
"* p<0.10, ** p<0.05, *** p<0.01.")
)
modelsummary(list("(1)"=m1,"(2)"=m2,"(3)"=m3,"(4)"=m4,"(5)"=m5,"(6)"=m6),
output = "tables/table2_main.docx")
AER convention: show ALL controls (and the intercept). Pass NEITHER keep = NOR coef_omit = so every parameter is visible. Use coef_map = c("training" = "Training") (single mapping) only when a focal-coefficient-only table is intentional (interaction-form heterogeneity, IV first-stage triplet); use coef_omit = "Intercept" only when you want to suppress the constant for paper aesthetics.
5.B Pattern B — Design horse race (Table 2-bis)
Show the same coefficient of interest under multiple identification strategies. This is the AER credibility move: convergent evidence across designs each making different identifying assumptions.
library(fixest); library(AER); library(did); library(MatchIt); library(WeightIt)
ols <- feols(log_wage ~ training + age + edu + tenure | industry + year,
data = df, cluster = ~ firm_id)
iv <- feols(log_wage ~ age + edu + tenure | training ~ Z1 + Z2,
data = df, cluster = ~ firm_id)
cs <- att_gt(yname = "log_wage", tname = "year", idname = "worker_id",
gname = "first_treat", data = df,
control_group = "nevertreated", est_method = "dr",
clustervars = "firm_id")
psm <- matchit(training ~ age + edu + tenure, data = df,
method = "nearest", ratio = 1)
psm_lm <- lm(log_wage ~ training + age + edu + tenure,
data = match.data(psm), weights = weights)
ebal <- weightit(training ~ age + edu + tenure, data = df, method = "ebal")
ebal_lm <- lm(log_wage ~ training + age + edu + tenure,
data = df, weights = ebal$weights)
modelsummary(
list("(1) OLS+FE" = ols,
"(2) 2SLS" = iv,
"(3) CS-DID" = aggte(cs, type = "simple"),
"(4) PSM" = psm_lm,
"(5) Entropy bal." = ebal_lm),
output = "tables/table2b_designs.tex",
stars = c("*" = 0.1, "**" = 0.05, "***" = 0.01),
coef_map = c("training" = "Job training (β̂)"),
gof_omit = "BIC|AIC|F|Log|Adj",
notes = "Convergent evidence: same β̂ under five identification strategies."
)
5.C Pattern C — Multi-outcome table (same X, several Y's)
ys <- c("log_wage", "weeks_employed", "left_firm", "promoted")
multi_y <- lapply(ys, function(y)
feols(as.formula(paste(y, "~ training + age + edu + tenure | industry + year")),
data = df, cluster = ~ firm_id))
names(multi_y) <- ys
modelsummary(multi_y,
output = "tables/table2c_multi_outcome.tex",
stars = c("*" = 0.1, "**" = 0.05, "***" = 0.01),
coef_map = c("training" = "Training"),
notes = "Each column is a separate regression on the labelled outcome.")
5.D Pattern D — Stacked Panel A / Panel B table
Same model family, two horizons (short-run / long-run) or two samples. Use gt::gt_group() to stack two modelsummary blocks with panel headers.
library(gt)
panelA <- list(
"(1) Industry FE" = feols(wage_t1 ~ training + X | industry + year, data = df, cluster = ~ firm_id),
"(2) Worker FE" = feols(wage_t1 ~ training + X | worker_id + year, data = df, cluster = ~ firm_id))
panelB <- list(
"(1) Industry FE" = feols(wage_t5 ~ training + X | industry + year, data = df, cluster = ~ firm_id),
"(2) Worker FE" = feols(wage_t5 ~ training + X | worker_id + year, data = df, cluster = ~ firm_id))
ms_A <- modelsummary(panelA, output = "gt") %>%
tab_header(title = "Panel A. Short-run (1 year)")
ms_B <- modelsummary(panelB, output = "gt") %>%
tab_header(title = "Panel B. Long-run (5 years)")
gt_group(ms_A, ms_B) %>%
gtsave("tables/table2d_horizons.tex")
gt_group(ms_A, ms_B) %>%
gtsave("tables/table2d_horizons.docx")
5.E Pattern E — IV reporting triplet (first-stage / reduced-form / 2SLS)
The textbook AER IV table presents the first stage, the reduced form, and the 2SLS in three columns so the reader can verify Wald-ratio = RF / FS.
fs <- feols(training ~ Z + age + edu | industry + year, data = df, cluster = ~ firm_id)
rf <- feols(log_wage ~ Z + age + edu | industry + year, data = df, cluster = ~ firm_id)
iv2 <- feols(log_wage ~ age + edu | training ~ Z, data = df, cluster = ~ firm_id)
modelsummary(
list("(1) First stage" = fs,
"(2) Reduced form" = rf,
"(3) 2SLS" = iv2),
output = "tables/table2e_iv_triplet.tex",
stars = c("*" = 0.1, "**" = 0.05, "***" = 0.01),
coef_map = c("Z" = "Instrument Z", "training" = "Training (endog.)"),
gof_map = list(list(raw = "ivf", clean = "First-stage F", fmt = 2)),
notes = "Wald ratio: $\\hat\\beta_{2SLS} = \\hat\\beta_{RF} / \\hat\\pi_{FS}$."
)
IV triplet is intentionally focal: show only Z + endogenous regressor so the reader can eyeball the Wald ratio. Drop coef_map= only if a referee asks for the full coefficient list.
5.F Pattern F — Causal-orchestrator main via did::att_gt / synthdid / grf::causal_forest
For DID / SCM / matching / forest mains, the modern R estimator returns a self-contained estimate + automatic placebos / pre-trends / overlap diagnostics. Pipe into modelsummary via the auto-tidiers.
cs <- att_gt(yname = "log_wage", tname = "year", idname = "worker_id",
gname = "first_treat", data = df,
control_group = "nevertreated", est_method = "dr",
clustervars = "firm_id")
print(aggte(cs, type = "group"))
print(aggte(cs, type = "dynamic", min_e = -4, max_e = 4))
library(synthdid)
sdid_setup <- panel.matrices(df, unit="worker_id", time="year",
outcome="log_wage", treatment="training")
sdid_fit <- synthdid_estimate(sdid_setup$Y, sdid_setup$N0, sdid_setup$T0)
print(summary(sdid_fit))
library(grf)
cf <- causal_forest(X = as.matrix(df[, c("age","edu","tenure","firm_size")]),
Y = df$log_wage, W = df$training, num.trees = 4000)
average_treatment_effect(cf, target.sample = "treated")
test_calibration(cf)
variable_importance(cf)
5.G Pattern G — Subgroup modelsummary (Table 3, see Step 7)
One column per subgroup. Detailed code in §Step 7 — Heterogeneity.
5.H Pattern H — Robustness master (Table A1, see Step 6)
Stack every robustness specification next to the baseline. Detailed code in §Step 6.
Canonical estimator commands (the underlying primitives)
library(fixest)
ols <- feols(log_wage ~ training + age + edu + tenure,
data = df, cluster = ~ firm_id)
summary(ols)
fe <- feols(log_wage ~ training + age + edu + tenure | worker_id + year,
data = df, cluster = ~ worker_id)
fe_mw <- feols(log_wage ~ training | worker_id + year,
data = df, cluster = ~ worker_id + firm_id)
fe_hd <- feols(log_wage ~ training | worker_id + industry^year,
data = df, cluster = ~ firm_id)
did22 <- feols(log_wage ~ i(treated, post, ref = 0) + age + edu,
data = df, cluster = ~ worker_id)
did22 <- feols(log_wage ~ i(treated, post, ref = 0) | worker_id + year,
data = df, cluster = ~ worker_id)
es <- feols(log_wage ~ i(rel_time, ref = -1) | worker_id + year,
data = df %>% filter(!is.na(first_treat)),
cluster = ~ worker_id)
iplot(es,
xlab = "Years relative to treatment",
main = "Event study")
library(did)
cs <- att_gt(yname = "log_wage", tname = "year", idname = "worker_id",
gname = "first_treat", data = df,
control_group = "nevertreated",
est_method = "dr",
clustervars = "firm_id")
ggdid(cs)
sa <- feols(log_wage ~ sunab(first_treat, year) | worker_id + year,
data = df, cluster = ~ worker_id)
iplot(sa, sub.title = "Sun-Abraham (2021)")
library(didimputation)
bjs <- did_imputation(data = df, yname = "log_wage", gname = "first_treat",
tname = "year", idname = "worker_id",
horizon = 0:5, pretrends = -5:-1,
cluster_var = "worker_id")
library(synthdid)
sdid_setup <- synthdid::panel.matrices(df, unit = "worker_id", time = "year",
outcome = "log_wage", treatment = "training")
sdid_fit <- synthdid_estimate(sdid_setup$Y, sdid_setup$N0, sdid_setup$T0)
iv <- feols(log_wage ~ age + edu | training ~ draft_lottery + z2,
data = df, cluster = ~ firm_id)
summary(iv, stage = 1)
fitstat(iv, ~ ivf + ivwald + sargan)
library(AER)
iv_aer <- ivreg(log_wage ~ training + age + edu |
draft_lottery + z2 + age + edu, data = df)
summary(iv_aer, vcov. = sandwich, diagnostics = TRUE)
library(rdrobust); library(rddensity)
rd <- rdrobust(y = df$outcome, x = df$running_var, c = 0,
kernel = "triangular", bwselect = "mserd")
summary(rd)
rdplot(y = df$outcome, x = df$running_var, c = 0)
rddensity(X = df$running_var, c = 0)
logit <- feglm(employed ~ training + age + edu | firm_id + year,
data = df, family = binomial(link = "logit"),
cluster = ~ firm_id)
library(marginaleffects)
avg_slopes(logit, variables = "training")
pois <- fepois(citations ~ training + age | firm_id + year,
data = df, cluster = ~ firm_id)
Step 6 — Robustness battery
Deeper patterns: references/06-robustness.md — modelsummary for M1–M6; clubSandwich/fwildclusterboot; bacondecomp/HonestDiD/robomit; ri2 randomization inference.
library(modelsummary)
library(fixest)
m1 <- feols(log_wage ~ training, data = df, cluster = ~ firm_id)
m2 <- feols(log_wage ~ training + age + edu, data = df, cluster = ~ firm_id)
m3 <- feols(log_wage ~ training + age + edu + tenure | worker_id,
data = df, cluster = ~ worker_id)
m4 <- feols(log_wage ~ training + age + edu + tenure | worker_id + year,
data = df, cluster = ~ worker_id)
m5 <- feols(log_wage ~ training + age + edu + tenure | worker_id + year + region,
data = df, cluster = ~ worker_id)
m6 <- feols(log_wage ~ training + age + edu + tenure | worker_id + year + industry^year,
data = df, cluster = ~ worker_id)
modelsummary(list("(1)" = m1, "(2)" = m2, "(3)" = m3,
"(4)" = m4, "(5)" = m5, "(6)" = m6),
stars = c('*' = .1, '**' = .05, '***' = .01),
gof_omit = "BIC|AIC|F|Log",
coef_map = c("training" = "Training",
"age" = "Age", "edu" = "Education", "tenure" = "Tenure"),
output = "tables/table_main.tex")
for (cl in c("worker_id", "firm_id", "industry", "state")) {
fit <- feols(log_wage ~ training | worker_id + year, data = df,
cluster = as.formula(paste0("~", cl)))
cat(cl, ": b=", coef(fit)["training"], " se=", se(fit)["training"], "\n")
}
library(fwildclusterboot)
boot <- boottest(m4, param = "training", clustid = "state",
B = 9999, seed = 42)
summary(boot)
splits <- list(
"Female=0" = df %>% filter(female == 0),
"Female=1" = df %>% filter(female == 1),
"Young (<40)" = df %>% filter(age < 40),
"Old (>=40)" = df %>% filter(age >= 40)
)
sub_fits <- imap(splits, ~ feols(log_wage ~ training | worker_id + year,
data = .x, cluster = ~ worker_id))
modelsummary(sub_fits, stars = TRUE)
df_placebo <- df %>%
mutate(fake_first = first_treat - 3,
fake_post = year >= fake_first) %>%
filter(year < first_treat)
feols(log_wage ~ fake_post | worker_id + year,
data = df_placebo, cluster = ~ worker_id)
library(ri2)
ri_out <- conduct_ri(formula = log_wage ~ training + age + edu,
declaration = randomizr::declare_ra(N = nrow(df),
prob = mean(df$training)),
assignment = "training",
sharp_hypothesis = 0,
data = df,
sims = 1000)
summary(ri_out); plot(ri_out)
library(bacondecomp)
bacon_out <- bacon(log_wage ~ training,
data = df, id_var = "worker_id", time_var = "year")
ggplot(bacon_out, aes(weight, estimate, color = type)) + geom_point()
ggsave("figures/bacon.pdf")
library(HonestDiD)
honest_out <- createSensitivityResults(betahat = es$coefficients,
sigma = vcov(es),
numPrePeriods = 5, numPostPeriods = 5,
Mbarvec = seq(0, 0.5, by = 0.05))
createSensitivityPlot(honest_out, originalResults = honest_out$mainResult)
ggsave("figures/honestdid.pdf")
library(robomit)
o_test(y = "log_wage", x = "training",
con = "age + edu + tenure | worker_id + year",
id = "worker_id", time = "year",
data = df, R2max = 1.3 * fitstat(m6, "r2"), beta = 0)
library(modelsummary); library(MatchIt); library(WeightIt)
base <- feols(log_wage ~ training + age + edu + tenure | industry + year,
data = df, cluster = ~ firm_id)
no99 <- feols(log_wage ~ training + age + edu + tenure | industry + year,
data = df %>% filter(wage < quantile(wage, 0.99, na.rm = TRUE)),
cluster = ~ firm_id)
balpan <- feols(log_wage ~ training + age + edu + tenure | industry + year,
data = df %>% group_by(worker_id) %>%
filter(n_distinct(year) == max(n_distinct(year))) %>% ungroup(),
cluster = ~ firm_id)
dropearly <- feols(log_wage ~ training + age + edu + tenure | industry + year,
data = df %>% filter(first_treat > 2008), cluster = ~ firm_id)
wfe <- feols(log_wage ~ training + age + edu + tenure | worker_id + year,
data = df, cluster = ~ firm_id)
cl2way <- feols(log_wage ~ training + age + edu + tenure | industry + year,
data = df, cluster = ~ firm_id + year)
logy <- feols(log(wage + 1) ~ training + age + edu + tenure | industry + year,
data = df, cluster = ~ firm_id)
ihsy <- feols(asinh(wage) ~ training + age + edu + tenure | industry + year,
data = df, cluster = ~ firm_id)
m_psm <- matchit(training ~ age + edu + tenure + firm_size, data = df, method = "nearest")
psm_lm <- lm(log_wage ~ training + age + edu + tenure, data = match.data(m_psm), weights = weights)
ebal_w <- weightit(training ~ age + edu + tenure + firm_size, data = df, method = "ebal")
ebal_lm <- lm(log_wage ~ training + age + edu + tenure, data = df, weights = ebal_w$weights)
modelsummary(
list("(1) Baseline" = base,
"(2) Drop top 1%" = no99,
"(3) Balanced" = balpan,
"(4) Drop early" = dropearly,
"(5) Worker FE" = wfe,
"(6) 2-way cluster" = cl2way,
"(7) log Y" = logy,
"(8) IHS Y" = ihsy,
"(9) PSM" = psm_lm,
"(10) Entropy bal." = ebal_lm),
output = "tables/tableA1_robustness.tex",
stars = c("*" = 0.1, "**" = 0.05, "***" = 0.01),
coef_map = c("training" = "Training (β̂)"),
gof_omit = "BIC|AIC|F|Log|Adj",
notes = "Each column is one robustness check. β̂ on training is the focal coefficient."
)
library(specr); library(ggplot2)
specs <- setup(data = df,
y = c("log_wage", "ihs_wage"),
x = "training",
model = c("feols"),
controls = c("age", "edu", "tenure", "firm_size"),
subsets = list(industry = c("manuf", "construction", "transport")))
results <- specr(specs)
plot(results, choices = c("x", "y", "controls", "subsets"))
ggsave("figures/fig5_spec_curve.pdf", width = 10, height = 6)
ggsave("figures/fig5_spec_curve.png", width = 10, height = 6, dpi = 300)
library(HonestDiD)
es_pre <- coef(es)[grep("year::-", names(coef(es)))]
es_post <- coef(es)[grep("year::[0-9]", names(coef(es)))]
honest_out <- createSensitivityResults(betahat = c(es_pre, es_post),
sigma = vcov(es)[c(names(es_pre), names(es_post)),
c(names(es_pre), names(es_post))],
numPrePeriods = length(es_pre),
numPostPeriods = length(es_post),
Mbarvec = seq(0, 0.5, by = 0.05))
createSensitivityPlot(honest_out, originalResults = honest_out$mainResult)
ggsave("figures/fig6_honestdid.pdf", width = 7, height = 4)
library(EValue)
evalue(RR(1.45), lo = 1.10, hi = 1.91)
Step 7 — Further analysis
Deeper patterns: references/07-further-analysis.md — marginaleffects is the post-estimation workhorse; mediation::mediate for Imai mediation; lavaan for SEM; grf::causal_forest for CATE.
library(marginaleffects)
library(fixest)
het <- feols(log_wage ~ i(female, training, ref = 0) + age + edu | worker_id + year,
data = df, cluster = ~ worker_id)
summary(het)
iplot(het)
het_c <- feols(log_wage ~ training * tenure + age + edu | worker_id + year,
data = df, cluster = ~ worker_id)
plot_slopes(het_c, variables = "training",
condition = list(tenure = seq(0, 20, by = 2))) +
geom_hline(yintercept = 0, linetype = "dashed") +
labs(x = "Tenure", y = "Marginal effect of training")
ggsave("figures/het_tenure.pdf", width = 6, height = 4)
ddd <- feols(log_wage ~ treated * post * high_exposure | worker_id + year,
data = df, cluster = ~ firm_id)
out_ladder <- list()
for (y in c("hours_worked", "productivity", "log_wage")) {
out_ladder[[y]] <- feols(as.formula(paste(y, "~ training | worker_id + year")),
data = df, cluster = ~ worker_id)
}
modelsummary(out_ladder, stars = TRUE,
coef_map = c("training" = "Training"),
output = "tables/outcome_ladder.tex")
library(mediation)
med_M <- lm(hours_worked ~ training + age + edu, data = df)
med_Y <- lm(log_wage ~ training + hours_worked + age + edu, data = df)
med <- mediate(med_M, med_Y, treat = "training", mediator = "hours_worked",
boot = TRUE, sims = 1000)
summary(med); plot(med)
medsens <- medsens(med, rho.by = 0.05, effect.type = "indirect")
plot(medsens)
library(grf)
cf <- causal_forest(X = as.matrix(df %>% select(age, edu, tenure, firm_size)),
Y = df$log_wage, W = df$training,
num.trees = 2000, min.node.size = 5)
df$tau_hat <- predict(cf)$predictions
variable_importance(cf)
average_treatment_effect(cf, target.sample = "all")
ggplot(df, aes(tenure, tau_hat)) +
geom_smooth(method = "loess", se = TRUE) +
labs(x = "Tenure", y = "Estimated CATE")
ggsave("figures/cate_tenure.pdf")
library(splines)
dr <- feols(log_wage ~ ns(training_hours, df = 4) + age + edu | worker_id + year,
data = df, cluster = ~ worker_id)
plot_predictions(dr, condition = "training_hours")
Step 8 — Publication tables & figures
This step is mandatory — every analysis run produces all 5 required tables (T1–T5) and all 4 required figures (F1–F4) defined in the Default Output Spec at the top of this skill. Do not skip Step 8 because "the regression already ran". A coefficient without a table and a figure is not how applied economics communicates a result.
Deeper patterns: references/08-tables-plots.md — modelsummary is the modern default (LaTeX/Word/HTML/Excel from one call); kableExtra for further LaTeX styling; gt for HTML/Word; ggplot2 + iplot + ggpubr + cowplot + binsreg for figures.
library(modelsummary)
library(kableExtra)
library(gt)
library(fixest)
library(ggplot2)
modelsummary(
list("(1) Raw" = m1,
"(2) +Demog" = m2,
"(3) +Tenure" = m3,
"(4) +Unit FE" = m4,
"(5) +2-way FE" = m5,
"(6) +Ind×Yr FE" = m6),
stars = c('*' = .1, '**' = .05, '***' = .01),
coef_map = c("training" = "Training",
"age" = "Age", "edu" = "Education", "tenure" = "Tenure"),
gof_map = list(
list("raw" = "nobs", "clean" = "N", "fmt" = 0),
list("raw" = "r.squared", "clean" = "R²", "fmt" = 3),
list("raw" = "adj.r.squared","clean" = "Adj. R²", "fmt" = 3)
),
notes = "Cluster-robust SE at worker_id in parentheses. * p<0.10, ** p<0.05, *** p<0.01.",
output = "tables/table2_main.tex"
)
modelsummary(list("(1)"=m1, "(2)"=m2, "(3)"=m3, "(4)"=m4, "(5)"=m5, "(6)"=m6),
stars = TRUE, output = "tables/table2_main.docx")
library(gtsummary)
tbl1 <- df %>%
select(log_wage, age, edu, tenure, female, training) %>%
tbl_summary(by = training, missing = "ifany",
statistic = all_continuous() ~ "{mean} ({sd})") %>%
add_p() %>% add_difference() %>% add_n() %>% bold_labels()
tbl1 %>% as_kable_extra(format = "latex", booktabs = TRUE) %>%
kableExtra::save_kable("tables/table1_balance.tex")
tbl1 %>% as_flex_table() %>%
flextable::save_as_docx(path = "tables/table1_balance.docx")
ladder <- list()
for (y in c("hours_worked", "productivity", "log_wage")) {
ladder[[y]] <- feols(as.formula(paste(y, "~ training + age + edu + tenure | worker_id + year")),
data = df, cluster = ~ worker_id)
}
modelsummary(ladder,
stars = c('*' = .1, '**' = .05, '***' = .01),
coef_map = c("training" = "Training"),
notes = "Each column is a separate regression on the labelled outcome. Cluster-robust SE at worker_id.",
output = "tables/table3_mechanism.tex")
het_specs <- list(
"All" = df,
"Female=0" = df %>% filter(female == 0),
"Female=1" = df %>% filter(female == 1),
"Age<40" = df %>% filter(age < 40),
"Age≥40" = df %>% filter(age >= 40),
"Manuf." = df %>% filter(industry == "manufacturing")
)
het_models <- imap(het_specs,
~ feols(log_wage ~ training + age + edu + tenure | worker_id + year,
data = .x, cluster = ~ worker_id))
modelsummary(het_models,
stars = c('*' = .1, '**' = .05, '***' = .01),
coef_map = c("training" = "Training"),
notes = "Cluster-robust SE at worker_id. Wald p-values for cross-subgroup equality should accompany this table — see references/07.",
output = "tables/table4_heterogeneity.tex")
rob <- list(
"Baseline" = feols(log_wage ~ training | worker_id + year, data = df,
cluster = ~ worker_id),
"Cluster=Firm" = feols(log_wage ~ training | worker_id + year, data = df,
cluster = ~ firm_id),
"2-way Cluster" = feols(log_wage ~ training | worker_id + year, data = df,
cluster = ~ worker_id + firm_id),
"Winsor 1/99" = feols(log_wage ~ training | worker_id + year,
data = df %>% mutate(log_wage = DescTools::Winsorize(log_wage,
probs = c(.01,.99),
na.rm = TRUE)),
cluster = ~ worker_id),
"Drop Manuf." = feols(log_wage ~ training | worker_id + year,
data = df %>% filter(industry != "manufacturing"),
cluster = ~ worker_id),
"Placebo (-3)" = feols(log_wage ~ fake_post | worker_id + year,
data = df %>% filter(year < first_treat),
cluster = ~ worker_id)
)
modelsummary(rob,
stars = c('*' = .1, '**' = .05, '***' = .01),
output = "tables/table5_robustness.tex")
modelplot(list("(1)"=m1, "(2)"=m2, "(3)"=m3, "(4)"=m4, "(5)"=m5, "(6)"=m6),
coef_map = c("training" = "Training"),
conf_level = 0.95) +
geom_vline(xintercept = 0, linetype = "dashed", alpha = 0.5) +
labs(x = "Coefficient on training (95% CI)", y = "Specification",
title = "Effect of training across specifications") +
theme_classic(base_size = 11)
ggsave("figures/fig3_coefplot.pdf", width = 6, height = 4)
ggsave("figures/fig3_coefplot.png", width = 6, height = 4, dpi = 300)
pdf("figures/fig2_event_study.pdf", width = 7, height = 4)
iplot(es,
xlab = "Years relative to treatment",
ylab = "Coefficient (ATT, 95% CI)",
main = "Event study: dynamic effect of training",
ref.line = -0.5)
dev.off()
png("figures/fig2_event_study.png", width = 2100, height = 1200, res = 300)
iplot(es,
xlab = "Years relative to treatment",
ylab = "Coefficient (ATT, 95% CI)",
main = "Event study: dynamic effect of training",
ref.line = -0.5)
dev.off()
library(HonestDiD)
honest_out <- createSensitivityResults(betahat = es$coefficients,
sigma = vcov(es),
numPrePeriods = 5, numPostPeriods = 5,
Mbarvec = seq(0, 0.5, by = 0.05))
sens_plot <- createSensitivityPlot(honest_out, originalResults = honest_out$mainResult)
ggsave("figures/fig4_sensitivity.pdf", plot = sens_plot, width = 7, height = 4)
ggsave("figures/fig4_sensitivity.png", plot = sens_plot, width = 7, height = 4, dpi = 300)
df %>%
group_by(year, training) %>%
summarise(mean_log_wage = mean(log_wage, na.rm = TRUE), .groups = "drop") %>%
ggplot(aes(year, mean_log_wage, color = factor(training))) +
geom_line(linewidth = 1) + geom_point(size = 2) +
geom_vline(xintercept = policy_year, linetype = "dashed", color = "gray40") +
scale_color_manual(values = c("0" = "darkred", "1" = "navy"),
labels = c("Control", "Treated"), name = "") +
labs(x = "Year", y = "Mean log wage",
title = "Treated vs control trend") +
theme_classic(base_size = 11) +
theme(legend.position = "bottom")
ggsave("figures/fig1_trend.pdf", width = 7, height = 4)
ggsave("figures/fig1_trend.png", width = 7, height = 4, dpi = 300)
library(binsreg)
binsreg(y = df$log_wage, x = df$tenure, w = df %>% select(age, edu, female))
ggsave("figures/figA_binscatter.pdf", width = 6, height = 4)
library(cowplot)
theme_set(theme_classic(base_size = 11) +
theme(legend.position = "bottom",
plot.title = element_text(face = "bold")))
Deliverables checklist (verify before declaring the run complete):
[ ] tables/table1_balance.tex [ ] figures/fig1_trend.pdf
[ ] tables/table2_main.tex ★ [ ] figures/fig2_event_study.pdf
[ ] tables/table3_mechanism.tex [ ] figures/fig3_coefplot.pdf
[ ] tables/table4_heterogeneity.tex
[ ] tables/table5_robustness.tex [ ] figures/fig4_sensitivity.pdf
[ ] tables/tableA1_robustness.tex [ ] figures/fig5_spec_curve.pdf
[ ] artifacts/sample_construction.json (footnote 4)
[ ] artifacts/data_contract.json
[ ] artifacts/result.json (reproducibility stamp — see 8m)
8m. Reproducibility stamp
The single artifact a journal's replication office (or a future co-author) needs to reproduce the headline number. Persist R version, seed, dataset hash, baseline coefficient + CI, and pointers to the protocol/contract:
library(jsonlite); library(digest)
b_hat <- coef(base)["training"]
se_b <- se(base)["training"]
ci <- c(b_hat - 1.96 * se_b, b_hat + 1.96 * se_b)
stamp <- list(
R_version = R.version.string,
fixest_version = as.character(packageVersion("fixest")),
modelsummary_version = as.character(packageVersion("modelsummary")),
seed = 42,
dataset_sha256 = substr(digest::digest(df, algo = "sha256"), 1, 16),
n_obs = base$nobs,
estimand = "ATT",
estimator = "fixest::feols",
estimate = unname(b_hat),
se_cluster = unname(se_b),
ci95 = unname(ci),
pre_registration = "artifacts/strategy.md",
data_contract = "artifacts/data_contract.json",
sample_log = "artifacts/sample_construction.json",
paper_bundle = "tables/table2_main.tex"
)
write_json(stamp, "artifacts/result.json", pretty = TRUE, auto_unbox = TRUE)
Commit artifacts/result.json alongside the paper PDF. A referee should be able to run Rscript master.R and bit-identically reproduce this JSON.
§A — Epidemiology / Public Health Mode
When the user's wording flags Mode A (target-trial emulation / IPTW / TMLE / MR / STROBE / 流行病学 / 公共健康 / RWE / cohort), the 8 steps still apply — but Step 5 swaps the OLS-and-FE stack for the doubly-robust + survival + MR triplet, and the deliverables follow STROBE / TRIPOD-AI conventions. Steps 1–4 (cleaning, construction, Table 1, diagnostics) and Step 8 (tables/figures export) are identical to the Default mode.
Package footprint (install on top of the Default stack):
install.packages(c(
"WeightIt", "PSweight", "cobalt",
"gfoRmula",
"tmle", "ltmle",
"survival", "survminer", "flexsurv",
"MendelianRandomization", "TwoSampleMR",
"MRPRESSO",
"EValue"
))
A.0 Cohort construction + target-trial protocol
Write the protocol before touching the data. Save it as protocol.yml and quote it in the paper.
library(dplyr)
cohort <- df |>
filter(age >= 40, age <= 75, prior_MI == 0) |>
mutate(
t0 = coalesce(statin_initiation_date, enrollment_date),
event_5y = as.integer((MI_date - t0) <= 365 * 5 & !is.na(MI_date)),
time_at_risk = pmin(as.numeric(censor_date - t0), 365 * 5)
)
A.1 Table 1 by exposure (identical to Default Step 3)
Use the same gtsummary::tbl_summary from Step 3, just by = A. E-values for unmeasured confounding go in the footer.
library(gtsummary)
cohort |>
select(A, age, edu, smoke, bmi, ldl, sbp) |>
tbl_summary(by = A, missing = "ifany") |>
add_difference() |>
add_p() |>
bold_labels()
A.2 DAG + propensity-score overlap (positivity check)
library(WeightIt); library(cobalt)
w_out <- weightit(A ~ age + edu + smoke + bmi + ldl + sbp,
data = cohort, method = "glm", estimand = "ATE")
bal.plot(w_out, var.name = "prop.score", which = "both")
ggsave("figures/figA2_ps_overlap.pdf")
love.plot(w_out, threshold = 0.1, abs = TRUE)
ggsave("figures/figA2_love.pdf")
A.3 IPTW + g-formula + TMLE doubly-robust triplet (Step 5 swap)
The "AER Table 2" of epi: a 3-column table where each column is one of {IPTW-MSM, g-formula, TMLE}, so the reader can confirm doubly-robust agreement.
library(survey)
des <- svydesign(ids = ~1, data = cohort, weights = w_out$weights)
msm <- svyglm(event_5y ~ A, design = des, family = quasibinomial())
RD_iptw <- coef(msm)["A"]; CI_iptw <- confint(msm)["A", ]
library(gfoRmula)
gf <- gformula_binary_eof(
obs_data = cohort,
id = "subject_id", time_name = "t", outcome_name = "event_5y",
covnames = c("age","edu","smoke","bmi","ldl","sbp"),
intvars = list("A"), interventions = list(list(c(static, 1)), list(c(static, 0))),
ref_int = 1, time_points = 1, basecovs = c("age","edu","smoke","bmi","ldl","sbp")
)
library(tmle)
fit_tmle <- tmle(
Y = cohort$event_5y, A = cohort$A,
W = cohort[, c("age","edu","smoke","bmi","ldl","sbp")],
family = "binomial",
Q.SL.library = c("SL.glm","SL.glmnet","SL.ranger"),
g.SL.library = c("SL.glm","SL.glmnet","SL.ranger")
)
RD_tmle <- fit_tmle$estimates$ATE$psi
CI_tmle <- fit_tmle$estimates$ATE$CI
library(modelsummary)
tableA3 <- tibble::tribble(
~Estimator, ~RD, ~`95% CI`,
"IPTW-MSM", RD_iptw, sprintf("[%.3f, %.3f]", CI_iptw[1], CI_iptw[2]),
"g-formula", gf$result[2,"mean"] - gf$result[1,"mean"], "—",
"TMLE", RD_tmle, sprintf("[%.3f, %.3f]", CI_tmle[1], CI_tmle[2])
)
modelsummary::datasummary_df(tableA3, output = "tables/tableA3_dr_triplet.tex")
A.4 Survival outcomes — KM / Cox / AFT / RMST
library(survival); library(survminer); library(flexsurv)
fit_km <- survfit(Surv(time_at_risk, event_5y) ~ A, data = cohort)
ggsurvplot(fit_km, conf.int = TRUE, pval = TRUE, risk.table = TRUE)
ggsave("figures/figA4_km.pdf")
fit_cox <- coxph(Surv(time_at_risk, event_5y) ~ A + age + edu + smoke + bmi + ldl + sbp,
data = cohort, weights = w_out$weights)
HR <- exp(coef(fit_cox)["A"]); HR_CI <- exp(confint(fit_cox)["A", ])
fit_aft <- flexsurvreg(Surv(time_at_risk, event_5y) ~ A + age + edu + smoke + bmi + ldl + sbp,
data = cohort, dist = "weibull")
library(survRM2)
rmst <- rmst2(cohort$time_at_risk, cohort$event_5y, cohort$A, tau = 365 * 5)
A.5 Mendelian randomization (IVW / Egger / weighted-median triplet)
library(MendelianRandomization)
mri <- mr_input(bx = BX, bxse = BXSE, by = BY, byse = BYSE,
exposure = "Statin use", outcome = "MI")
ivw <- mr_ivw(mri)
egger <- mr_egger(mri)
wmedian<- mr_median(mri, weighting = "weighted")
library(MRPRESSO)
mr_presso(BetaOutcome = "by", BetaExposure = "bx", SdOutcome = "byse", SdExposure = "bxse",
OUTLIERtest = TRUE, DISTORTIONtest = TRUE, data = data.frame(bx, by, bxse, byse), NbDistribution = 1000)
A.6 Robustness — E-value / bounds / principal stratification
library(EValue)
ev <- evalue(RR(1.45), lo = 1.10, hi = 1.91)
print(ev)
A.7 STROBE / TRIPOD-AI reporting checklist
Save as replication/strobe_checklist.md and tick before submission:
[ ] Eligibility criteria + dates (target-trial protocol)
[ ] Adjustment set with DAG justification (A.2)
[ ] Positivity / overlap diagnostic (A.2)
[ ] Doubly-robust triplet (IPTW + g-formula + TMLE) (A.3)
[ ] Risk difference + hazard ratio + RMST (A.3, A.4)
[ ] E-value for unmeasured confounding (A.6)
[ ] Loss-to-follow-up rate + censoring assumption (A.0)
[ ] Pre-registered protocol or analysis plan (A.0)
§B — ML Causal Inference Mode
When the user's wording flags Mode B (DML / meta-learner / causal forest / BCF / CATE / policy learning / conformal causal / fairness / 因果机器学习), the pipeline keeps Steps 1–4 and Step 8 from the Default mode, swaps Step 5 for the ML estimator stack, and adds a CATE-distribution + policy-value layer between Step 7 and Step 8.
Package footprint (install on top of the Default stack):
install.packages(c(
"DoubleML", "mlr3", "mlr3learners",
"grf",
"causalweight",
"bartCause", "bcf",
"policytree",
"conformalInference",
"fairmodels",
"pcalg", "bnlearn"
))
B.0 Train/holdout split + nuisance learner stack
library(mlr3); library(mlr3learners); library(DoubleML)
set.seed(42)
idx <- sample(seq_len(nrow(df)), size = 0.7 * nrow(df))
train <- df[idx, ]; holdout <- df[-idx, ]
ml_g <- lrn("regr.ranger", num.trees = 500, mtry = 5)
ml_m <- lrn("classif.ranger", num.trees = 500, mtry = 5)
B.1 DAG / estimand declaration (optionally LLM-assisted)
library(pcalg)
suffStat <- list(C = cor(df[, c("A","Y","X1","X2","X3","X4")]), n = nrow(df))
pc.fit <- pc(suffStat, indepTest = gaussCItest,
alpha = 0.01, labels = c("A","Y","X1","X2","X3","X4"))
plot(pc.fit, main = "PC-recovered DAG")
B.2 Estimator stack — DML · meta-learners · causal forest · BCF (Step 5 swap)
The "AER Table 2" of ML causal: a horse-race table where each column is one estimator family on the same (Y, A, X) data — readers want to see DML, T-learner, causal forest, and BCF all agree (or disagree) on the ATE.
dml_data <- DoubleMLData$new(train, y_col = "Y", d_cols = "A",
x_cols = c("X1","X2","X3","X4"))
dml_plr <- DoubleMLPLR$new(dml_data, ml_g = ml_g, ml_m = ml_m, n_folds = 5)
dml_plr$fit()
ate_dml <- dml_plr$coef; ci_dml <- dml_plr$confint()
library(grf)
cf <- causal_forest(X = as.matrix(train[, c("X1","X2","X3","X4")]),
Y = train$Y, W = train$A, num.trees = 2000)
ate_cf <- average_treatment_effect(cf, target.sample = "all")
cate_cf <- predict(cf, newdata = as.matrix(holdout[, c("X1","X2","X3","X4")]))$predictions
library(causalweight)
dr <- treatDML(y = train$Y, d = train$A, x = as.matrix(train[, c("X1","X2","X3","X4")]),
MLmethod = "lasso")$effect
ate_DR <- mean(dr)
library(bcf)
bcf_fit <- bcf(y = train$Y, z = train$A,
x_control = as.matrix(train[, c("X1","X2","X3","X4")]),
x_moderate = as.matrix(train[, c("X1","X2","X3","X4")]),
pihat = predict(glm(A ~ ., data = train[, c("A","X1","X2","X3","X4")], family = binomial), type = "response"),
nburn = 1000, nsim = 1000)
ate_bcf <- mean(bcf_fit$tau)
library(modelsummary)
tableB2 <- tibble::tribble(
~Estimator, ~ATE,
"DML (PLR)", ate_dml[1],
"Causal Forest", ate_cf[1],
"DR-learner", ate_DR,
"Bayesian Causal Forest", ate_bcf
)
modelsummary::datasummary_df(tableB2, fmt = 4, output = "tables/tableB2_ml_horserace.tex")
B.3 CATE distribution + subgroup CATE plot (Step 7 extension)
library(ggplot2)
data.frame(cate = cate_cf) |>
ggplot(aes(x = cate)) +
geom_histogram(bins = 30, fill = "grey70", colour = "black") +
geom_vline(xintercept = 0, lty = 2) +
labs(x = "CATE", y = "Count")
ggsave("figures/figB3_cate_hist.pdf")
holdout |>
mutate(cate = cate_cf, age_q = ntile(X1, 4)) |>
group_by(age_q) |>
summarise(mean_cate = mean(cate)) |>
ggplot(aes(age_q, mean_cate)) + geom_col() + labs(y = "Mean CATE")
ggsave("figures/figB3_cate_by_age_q.pdf")
B.4 Policy learning + off-policy evaluation
library(policytree)
dr_scores <- double_robust_scores(cf)
ptree <- policy_tree(X = as.matrix(train[, c("X1","X2","X3","X4")]),
Gamma = dr_scores, depth = 3)
print(ptree)
plot(ptree)
ggsave("figures/figB4_policy_tree.pdf")
holdout_X <- as.matrix(holdout[, c("X1","X2","X3","X4")])
pred_pol <- predict(ptree, holdout_X)
DR_holdout <- double_robust_scores(cf, newdata = holdout_X)
policy_value_DR <- mean(DR_holdout[cbind(seq_len(nrow(DR_holdout)), pred_pol)])
cat(sprintf("DR policy value (holdout): %.3f\n", policy_value_DR))
B.5 Uncertainty (conformal causal) + fairness + sensitivity
library(cfcausal)
ci90 <- conformalIte(X = as.matrix(train[, c("X1","X2","X3","X4")]),
Y = train$Y, T = train$A,
alpha = 0.1,
algo = "nest",
type = "CQR",
X.test = as.matrix(holdout[, c("X1","X2","X3","X4")]))
library(fairmodels)
fobject <- fairness_check(model_treated = predict(ptree, holdout_X),
data = holdout, protected = holdout$sensitive_attr,
privileged = "majority")
plot(fobject)
ggsave("figures/figB5_fairness.pdf")
B.6 ML-causal-specific reporting checklist
Save as replication/ml_causal_checklist.md:
[ ] Nuisance learners listed (Q model, g model, hyperparameters, CV folds)
[ ] Cross-fitting / sample-splitting documented (DML K-fold)
[ ] Overlap / propensity diagnostics (B.0 + A.2-style overlap plot)
[ ] CATE summary (mean, SD, quartiles) + heterogeneity p-value (grf::test_calibration)
[ ] Policy value with confidence interval (B.4)
[ ] Conformal coverage rate on holdout (B.5)
[ ] Fairness gaps across sensitive attributes (B.5)
[ ] DAG / adjustment set + sensitivity to unmeasured confounding (E-value or Manski bounds)
Library cheat-sheet
| Step | Task | Go-to package | Fallback |
|---|
| 1 | Read data | haven / readr / readxl / data.table::fread | arrow for Parquet |
| 1 | Clean names | janitor::clean_names | manual |
| 1 | Missing | naniar / mice | Hmisc |
| 2 | Winsorize | DescTools::Winsorize | manual pmin/pmax |
| 2 | Lag in panel | dplyr::lag (with arrange+group_by) | data.table::shift |
| 3 | Table 1 | gtsummary / modelsummary::datasummary_balance | tableone |
| 3 | Correlation | psych::corr.test + corrplot | Hmisc::rcorr |
| 4 | Hetero / autocorr | lmtest::bptest / dwtest / bgtest | car |
| 4 | Panel tests | plm::pbgtest / pcdtest / phtest | — |
| 4 | Stationarity | tseries::adf.test / tseries::kpss.test | urca |
| 5 | OLS / panel FE | fixest::feols | lfe::felm (older) |
| 5 | IV | fixest::feols(| ~ ) | AER::ivreg / ivreg::ivreg |
| 5 | DID — 2×2 | feols with i(treated, post) | — |
| 5 | DID — CS | did::att_gt | — |
| 5 | DID — SA | fixest::sunab | — |
| 5 | DID — BJS | didimputation::did_imputation | — |
| 5 | DID — SDID | synthdid | — |
| 5 | RD | rdrobust / rddensity / rdmulti | — |
| 5 | SC | Synth / gsynth / tidysynth | — |
| 5 | PSM | MatchIt::matchit | — |
| 5 | IPW | WeightIt::weightit | — |
| 5 | Entropy balance | ebal | — |
| 5 | DML | DoubleML | — |
| 5 | CATE (causal forest) | grf::causal_forest | — |
| 5 | Mediation | mediation::mediate | lavaan |
| 6 | Wild cluster boot | fwildclusterboot::boottest | clubSandwich |
| 6 | Random. inference | ri2::conduct_ri | manual boot |
| 6 | Multiple testing | multcomp / hand-roll Romano-Wolf | — |
| 6 | TWFE diagnosis | bacondecomp::bacon | — |
| 6 | PT sensitivity | HonestDiD | — |
| 6 | Oster δ* | robomit::o_test / o_beta | — |
| 7 | Margins / slopes | marginaleffects::avg_slopes / plot_slopes | — |
| 7 | Mediation w/ sensitivity | mediation::mediate + medsens | — |
| 7 | SEM | lavaan::sem | — |
| 8 | Reg table (any format) | modelsummary | texreg / stargazer |
| 8 | Word table | flextable / gt::gtsave | officer |
| 8 | LaTeX table styling | kableExtra | — |
| 8 | Coefplot / event study | modelplot / fixest::iplot | ggplot2 manual |
| 8 | Binscatter | binsreg | — |
| 8 | Multi-panel | cowplot::plot_grid / patchwork | gridExtra |
Common mistakes (and what to do instead)
| Mistake | Correct approach |
|---|
lm(y ~ x + factor(unit) + factor(year)) on big panels | `feols(y ~ x |
| Default iid SEs on clustered data | feols(..., cluster = ~ id); boottest if clusters < 50 |
| TWFE on staggered adoption | did::att_gt / fixest::sunab / didimputation::did_imputation |
Using lag(x) without arrange() + group_by() | always arrange(id, time) %>% group_by(id) %>% mutate(x_l1 = lag(x)) |
| Joining without checking row count | use relationship arg in dplyr::*_join, then stopifnot(nrow(df) == n_before) |
| Interpreting logit coefficients directly | marginaleffects::avg_slopes(model) for AME |
| Reporting only point estimates | always plot — modelplot, iplot, plot_slopes |
| Manually formatting reg tables | modelsummary writes LaTeX/Word/HTML in one call |
| Reporting only the headline coefficient (no Table 2) | Always ship the multi-column M1→M6 main table — that is the centerpiece of an economics paper, not the abstract sentence |
| Coefficient table without any figures | An economics result needs at least F1 trend + F2 event study + F3 coefplot + F4 sensitivity — see the Default Output Spec |
Saving plots as .png only | also .pdf for LaTeX submissions |
| Hard-coding dataset paths in scripts | use here::here() and renv::init() |
| Running tests manually each time | wrap into targets::tar_make() or Quarto |
Typical project skeleton
project/
├── R/
│ ├── 01_clean.R # produces data/analysis.rds
│ ├── 02_transform.R
│ ├── 03_describe.R
│ ├── 04_diagnose.R
│ ├── 05_model.R # saves models to estimates/
│ ├── 06_robust.R
│ ├── 07_further.R
│ └── 08_tables_figures.R
├── data/
│ ├── raw/
│ └── analysis.rds
├── tables/
├── figures/
├── estimates/ # saved fixest objects via saveRDS
├── logs/
├── renv.lock # package versions locked
├── _targets.R # or main.qmd / main.R
└── README.md
_targets.R (using targets package) or main.qmd (Quarto) at the top makes the whole pipeline reproducible:
source("R/01_clean.R")
source("R/02_transform.R")
source("R/03_describe.R")
source("R/04_diagnose.R")
source("R/05_model.R")
source("R/06_robust.R")
source("R/07_further.R")
source("R/08_tables_figures.R")
For Quarto authoring (combined narrative + code + tables/figures, render to PDF/HTML/Word), see references/08-tables-plots.md §12.
Regtable (modelsummary / etable) cookbook (one-page recipe index)
modelsummary(...) and fixest::etable(...) are the two primitives behind every multi-regression table. The eight patterns above map to:
| Pattern | What varies across columns | Step |
|---|
| A. Progressive controls | covariate set / FE depth | 5.A — Table 2 |
| B. Design horse race | identification strategy (OLS / IV / DID / DML / PSM) | 5.B — Table 2-bis |
| C. Multi-outcome | dependent variable Y | 5.C — Table 2-ter |
| D. Stacked Panel A / B | horizon / sample (panel rows × spec columns) | 5.D — Table 2-quater |
| E. IV reporting triplet | first stage / reduced form / 2SLS | 5.E — Table 2-quinto |
| F. Causal-orchestrator | 1 column, full diagnostics (att_gt / synthdid / causal_forest) | 5.F |
| G. Subgroup table | subsample (full / female / male / Q1…Q4) | 7 — Table 3 |
| H. Robustness master | every robustness check stacked | 6.j — Table A1 |
Default modelsummary settings for AER house style:
modelsummary(
list("(1)" = m1, ..., "(N)" = mN),
output = "tables/tableN.tex",
stars = c("*" = 0.1, "**" = 0.05, "***" = 0.01),
gof_omit = "BIC|AIC|F|Log|Adj",
coef_map = c("training" = "Training"),
notes = c("Cluster-robust SE in parentheses.",
"* p<0.10, ** p<0.05, *** p<0.01.")
)
Figure factory (the 12 standard AER figures in R)
| # | Figure | R commands | Section |
|---|
| 1a | Raw trends (DID Figure 1) | df %>% group_by(year, treat) %>% summarise(mean(y)) %>% ggplot() | §1 |
| 1b | Treatment rollout heatmap | panelView::panelview(...) · ggplot + geom_tile | §1 |
| 2a | Event-study coefficients | `fixest::iplot(feols(y ~ sunab(G, t) | i + t))` |
| 2a' | Bacon weights | bacondecomp::bacon + ggplot | §3 |
| 2a'' | CS-DID dynamic effects | did::ggdid(aggte(cs, type="dynamic")) | §3 |
| 2b | First-stage scatter | binsreg::binsreg(y=D, x=Z, w=X) | §3 (Step 3.5.2) |
| 2c | RD canonical plot | rdrobust::rdplot(y, x, c=0) | §3 (Step 3.5.3) |
| 2c' | McCrary density | rddensity::rdplotdensity(rdd, X) | §3 |
| 2d | Matching love plot | cobalt::love.plot(MatchIt::matchit(...)) | §3 (Step 3.5.4) |
| 2e | SCM trajectory | tidysynth::plot_trends · synthdid::plot · Synth::path.plot | §3 (Step 3.5.5) |
| 3 | Coefficient plot of main specs | modelsummary::modelplot(list(m1,...,m6), coefs="training") | §4 |
| 4a | Dose-response | marginaleffects::plot_predictions(model, condition="dose") | §5 |
| 4b | CATE distribution | grf::causal_forest(...) + ggplot::geom_histogram(predict(cf)$predictions) | §5 |
| 5 | Specification curve | specr::plot(specr(...)) (see 6.k) | §7 |
| 6 | Sensitivity dashboard | HonestDiD::createSensitivityPlot · EValue::evalue | §7 (Step 6.l) |
| 7 | Final main figure | estimator-specific (rdplot, iplot, Synth::path.plot) | §8 |
Every figure is exported via ggsave() as both .pdf (for LaTeX) and .png ≥ 300 dpi (for slides / web). Set theme_set(theme_classic(base_size = 11)) once at the top of master.R for consistent styling.
Method Catalog
Classical OLS / Panel
library(fixest); library(plm); library(sandwich); library(lmtest)
feols(y ~ X, data = df, cluster = ~ i)
feols(y ~ X | fe1, data = df, cluster = ~ i)
feols(y ~ X | fe1 + fe2, data = df, cluster = ~ i)
feols(y ~ X | fe1 + fe2, data = df, cluster = ~ fe1 + fe2)
fepois(count ~ X | fe1 + fe2, data = df, cluster = ~ i)
feglm (y ~ X | fe1, data = df, family = binomial(link = "logit"),
cluster = ~ i)
plm (y ~ X, data = df, model = "within", index = c("i","t"))
plm (y ~ X, data = df, model = "random", index = c("i","t"))
Difference-in-Differences
library(fixest); library(did); library(didimputation); library(synthdid); library(bacondecomp); library(HonestDiD); library(DIDmultiplegtDYN)
feols(y ~ i(treated, post, ref = 0) | i + t, df, cluster = ~ i)
feols(y ~ sunab(first_treat, year) | i + year, df, cluster = ~ i)
att_gt(yname="y", tname="t", idname="i", gname="G", data=df,
control_group="nevertreated", est_method="dr", clustervars="i")
did_imputation(data=df, yname="y", gname="G", tname="t", idname="i",
horizon=0:5, pretrends=-5:-1, cluster_var="i")
DIDmultiplegtDYN(df, "y", "i", "t", "training", effects=5, placebo=3)
synthdid_estimate(panel.matrices(df,"i","t","y","training"), ...)
bacon(y ~ training, data=df, id_var="i", time_var="t")
HonestDiD::createSensitivityResults(...)
Instrumental Variables / 2SLS
library(fixest); library(AER); library(ivreg)
feols(y ~ X | D ~ Z, df, cluster = ~ firm_id)
fitstat(iv, ~ ivf + ivwald + sargan + cd)
AER::ivreg(y ~ D + X | Z + X, data = df)
summary(iv, vcov. = sandwich, diagnostics = TRUE)
Regression Discontinuity
library(rdrobust); library(rddensity); library(rdmulti)
rdrobust(y, x, c = 0, kernel = "triangular", bwselect = "mserd")
rdrobust(y, x, c = 0, fuzzy = D)
rddensity(X = x, c = 0)
rdplot(y, x, c = 0)
rdmc(y, x, cutoffs = c(0, 5, 10))
Matching / Reweighting
library(MatchIt); library(WeightIt); library(cobalt)
matchit (D ~ X1 + X2, data = df, method = "nearest", ratio = 1)
matchit (D ~ X1 + X2, data = df, method = "cem")
weightit(D ~ X1 + X2, data = df, method = "ebal")
weightit(D ~ X1 + X2, data = df, method = "ps", estimand = "ATE")
love.plot(matchit_obj, threshold = 0.10)
Synthetic Control
library(Synth); library(gsynth); library(tidysynth); library(synthdid)
Synth::synth(...)
gsynth(y ~ training, data = df, index = c("i","t"), force = "two-way")
synthdid_estimate(panel.matrices(...))
tidysynth::synthetic_control(df, ...) %>% generate_predictor(...) %>%
generate_weights() %>% generate_control()
ML Causal (Mode B — see §B)
library(grf); library(DoubleML); library(mlr3); library(causalDML)
causal_forest(X, Y, W, num.trees = 4000, honesty = TRUE)
DoubleML::DoubleMLPLR$new(data, ml_l = lrn("regr.ranger"),
ml_m = lrn("regr.ranger"))
DoubleML::DoubleMLIRM$new(data, ...)
predict(cf)$predictions
average_treatment_effect(cf, target.sample = "treated")
test_calibration(cf); variable_importance(cf)
policytree::policy_tree(X, gamma, depth = 3)
Robustness, Sensitivity & Inference
library(fwildclusterboot); library(ri2); library(multcomp); library(robomit); library(EValue)
boottest(model, param = "training", clustid = "state", B = 9999)
ri2::conduct_ri(...)
robomit::o_test(...)
EValue::evalue(RR(1.45), lo = 1.10, hi = 1.91)
fwildclusterboot::boottest(..., type = "rademacher")
Survival / Epi (Mode A — see §A)
library(survival); library(survminer); library(survRM2); library(ipw); library(tmle); library(zelig)
survfit(Surv(time, event) ~ A, data = df)
coxph (Surv(time, event) ~ A + X, data = df)
survreg(Surv(time, event) ~ A + X, data = df, dist = "weibull")
rmst2 (time, status, arm, tau = 1825)
ipw::ipwpoint(...)
tmle (Y, A, W = X, ...)
gfoRmula::gformula_survival(...)
TwoSampleMR::mr(...)
When to hand off to other skills
- Agent-native single-import Python workflow (
import statspai as sp) → 00-StatsPAI_skill.
- Explicit Python traditional stack →
00.1-Full-empirical-analysis-skill.
- Stata
.do pipeline → 00.2-Full-empirical-analysis-skill_Stata.
- Cross-language Mixtape templates (Python/R/Stata side-by-side) →
10-Jill0099-causal-inference-mixtape.
- Bayesian R workflow (
brms/rstan/cmdstanr) → 23-Learning-Bayesian-Statistics-baygent-skills.
- Paper drafting after analysis → the writing skills in this repo.
This skill ends at Step 8 — .tex / .docx tables and .pdf figures. Paper drafting is out of scope.