iv-estimation

Econometrics skill for instrumental variables and treatment effect estimation. Activates when the user asks about: "instrumental variables", "IV estimation", "2SLS", "two-stage least squares", "endogeneity", "weak instruments", "first stage", "Sargan test", "overidentification", "propensity score matching", "PSM", "average treatment effect", "ATT", "LATE", "local average treatment effect", "endogenous regressor", "instrument validity", "工具变量", "两阶段最小二乘", "内生性", "弱工具变量", "倾向得分匹配", "平均处理效应", "处理效应", "局部平均处理效应"

Overview

Install command

npx skills add https://github.com/zhouziyue233/great-econometrics --skill iv-estimation

Copy and paste this command into Claude Code to install the skill

Source

zhouziyue233/great-econometrics

Stars: 4

Forks: 0

Updated: April 3, 2026 at 04:39

Instrumental Variables & Treatment Effects Skill

This skill covers two approaches to causal inference: IV/2SLS estimation for settings where treatment is endogenous, and propensity score matching (PSM) for settings where selection on observables is credible. It helps identify valid instruments, run 2SLS, test instrument validity, and implement PSM.

When to Use IV vs PSM

Method	Use When
IV / 2SLS	Treatment is endogenous; a valid instrument exists
PSM	Selection on observables assumption is credible; rich covariate data
OLS + controls	Selection on observables is credible; no valid instrument is available

IV / 2SLS Framework

Conditions for a Valid Instrument Z for endogenous X

Relevance: Cov(Z, X) ≠ 0 — Z must be correlated with the endogenous regressor
Exclusion restriction: Cov(Z, ε) = 0 — Z affects Y only through X (cannot be tested directly)
Independence: Z is as-good-as-randomly assigned (exogenous)
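
The relevance and exclusion conditions above are what pin down the coefficient on X. A short derivation, assuming the simplest case of a linear model Y = β₀ + β₁X + ε with a single instrument and no controls:

```latex
\begin{aligned}
\operatorname{Cov}(Z, Y)
  &= \operatorname{Cov}(Z,\ \beta_0 + \beta_1 X + \varepsilon) \\
  &= \beta_1 \operatorname{Cov}(Z, X) + \operatorname{Cov}(Z, \varepsilon) \\
  &= \beta_1 \operatorname{Cov}(Z, X)
     && \text{(exclusion: } \operatorname{Cov}(Z, \varepsilon) = 0\text{)} \\
\Rightarrow\quad
\beta_1 &= \frac{\operatorname{Cov}(Z, Y)}{\operatorname{Cov}(Z, X)}
     && \text{(relevance: } \operatorname{Cov}(Z, X) \neq 0\text{)}
\end{aligned}
```

The ratio is undefined when Cov(Z, X) = 0 and unstable when it is close to zero, which is why weak instruments are a problem.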

Two-Stage Least Squares Procedure

Stage 1: Regress endogenous X on instruments Z and exogenous controls W

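X = γ₀ + γ₁Z + γ₂W + v, with fitted values X̂ = γ̂₀ + γ̂₁Z + γ̂₂W
Check that the first-stage F-statistic on the excluded instruments exceeds 10 (Staiger-Stock rule of thumb); with one endogenous regressor and one instrument, the Stock-Yogo critical value for 10% maximal size distortion is 16.38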

Stage 2: Regress Y on predicted X̂ and controls W

Y = β₀ + β₁X̂ + β₂W + ε
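Standard errors must be corrected for the two-stage estimation: running the second stage manually by OLS gives wrong SEs, whereas dedicated IV routines apply the correction automatically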
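
The two stages and the standard-error correction can be sketched by hand. This is a minimal illustration, not a replacement for a dedicated IV routine: it assumes a single instrument, no controls W, homoskedastic errors, and a simulated data-generating process (all coefficients and names below are made up for the example).

```python
import math
import random


def ols(x, y):
    """OLS of y on a constant and one regressor; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def slope_se(x, resid):
    """Homoskedastic standard error of the slope, given a residual vector."""
    n = len(x)
    mx = sum(x) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sigma2 = sum(r ** 2 for r in resid) / (n - 2)
    return math.sqrt(sigma2 / sxx)


# Simulated data (hypothetical): u is an unobserved confounder, true beta_1 = 2
rng = random.Random(0)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x = [0.8 * zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [1.0 + 2.0 * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

# Stage 1: regress X on Z and keep the fitted values.
# With one instrument, the first-stage F equals the squared t-statistic.
g0, g1 = ols(z, x)
x_hat = [g0 + g1 * zi for zi in z]
v_hat = [xi - xh for xi, xh in zip(x, x_hat)]
first_stage_f = (g1 / slope_se(z, v_hat)) ** 2

# Stage 2: regress Y on the fitted values.
b0, b1 = ols(x_hat, y)

# The naive SE uses residuals from the X-hat regression (incorrect);
# the corrected SE uses residuals computed with the actual X.
naive_resid = [yi - b0 - b1 * xh for yi, xh in zip(y, x_hat)]
correct_resid = [yi - b0 - b1 * xi for yi, xi in zip(y, x)]
se_naive = slope_se(x_hat, naive_resid)
se_correct = slope_se(x_hat, correct_resid)

_, b_ols = ols(x, y)
print(f"first-stage F: {first_stage_f:.1f}")
print(f"OLS: {b_ols:.3f}  2SLS: {b1:.3f}")
print(f"naive SE: {se_naive:.4f}  corrected SE: {se_correct:.4f}")
```

In this simulated setup OLS is biased upward by the confounder while the two-stage estimate is close to 2, and the two standard errors differ because only the corrected one uses residuals based on the observed X.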

Quick Code Templates

# Python (linearmodels)
from linearmodels.iv import IV2SLS

# Formula: dependent ~ exogenous [endogenous ~ instruments]
model = IV2SLS.from_formula(
    'y ~ 1 + w1 + w2 + [x_endog ~ z1 + z2]', data=df
)
result = model.fit(cov_type='robust')
print(result.summary)

# First-stage diagnostics
print(result.first_stage.diagnostics)
# Check: partial F-stat, Shea partial R²

# R (AER)
library(AER)
iv_model <- ivreg(y ~ x_endog + w1 + w2 | z1 + z2 + w1 + w2, data = df)
summary(iv_model, diagnostics = TRUE)
# Shows: weak instruments F-test, Wu-Hausman endogeneity test, Sargan overID test

* Stata
ivregress 2sls y w1 w2 (x_endog = z1 z2), robust first
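estat firststage      // First-stage diagnostics
estat endogenous      // Endogeneity test (Durbin and Wu-Hausman; robust score test under vce(robust))
estat overid          // Overidentification test (Sargan and Basmann; Wooldridge score test under vce(robust))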

Key Diagnostic Tests

Test	Null Hypothesis	Interpretation
First-stage F-stat	Excluded instruments have zero coefficients in the first stage	F > 10 → instruments are relevant by the usual rule of thumb
Wu-Hausman	X is exogenous (OLS consistent)	p < 0.05 → evidence of endogeneity; prefer IV
Sargan-Hansen	All instruments valid (overidentified models only)	p > 0.05 → no evidence against instrument validity
Anderson-Rubin	β₁ = b for a hypothesized value b	Inference stays valid with weak instruments; use when the F-stat is borderline
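
The Anderson-Rubin row can be made concrete by inverting the test over a grid. The sketch below assumes a single instrument, no controls, homoskedastic errors, and the large-sample chi-square(1) critical value of 3.84; the function names and the simulated data are hypothetical. For each candidate value b, it regresses Y − bX on Z and keeps b if the slope on Z is not significantly different from zero.

```python
import math
import random


def slope_t_stat(x, y):
    """t-statistic on the slope from OLS of y on a constant and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope / se


def anderson_rubin_set(y, x, z, grid, crit=3.84):
    """Grid values b that the AR test does not reject at the 5% level."""
    accepted = []
    for b in grid:
        adjusted = [yi - b * xi for yi, xi in zip(y, x)]
        if slope_t_stat(z, adjusted) ** 2 <= crit:
            accepted.append(b)
    return accepted


# Simulated data (hypothetical): true beta_1 = 2, first-stage coefficient 0.3
rng = random.Random(1)
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x = [0.3 * zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [2.0 * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

grid = [i / 10 for i in range(-50, 101)]  # -5.0 to 10.0 in steps of 0.1
ar_set = anderson_rubin_set(y, x, z, grid)
first_stage_f = slope_t_stat(z, x) ** 2

print(f"first-stage F: {first_stage_f:.1f}")
if ar_set:
    print(f"accepted grid values range from {min(ar_set):.1f} to {max(ar_set):.1f}")
```

With weaker instruments the accepted set widens and can become unbounded or disconnected, so reporting only its minimum and maximum is a simplification that suits the strong-instrument case simulated here.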

Propensity Score Matching (PSM)

Assumptions

Conditional independence (unconfoundedness): Treatment T ⊥ Y(0), Y(1) | X
Common support (overlap): 0 < P(T=1|X) < 1 for all X
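
One simple way to enforce the overlap condition in practice is to trim to the region where both groups have propensity scores. The sketch below uses the min-max rule, which is only one of several conventions; the function names and the toy scores are made up for illustration.

```python
def common_support_bounds(pscores, treatment):
    """Min-max overlap region of the propensity score across the two groups."""
    treated = [p for p, t in zip(pscores, treatment) if t == 1]
    control = [p for p, t in zip(pscores, treatment) if t == 0]
    lower = max(min(treated), min(control))
    upper = min(max(treated), max(control))
    return lower, upper


def on_support(pscores, treatment):
    """Mask that is True for units whose score lies inside the overlap region."""
    lower, upper = common_support_bounds(pscores, treatment)
    return [lower <= p <= upper for p in pscores]


# Toy scores (made up): first four units are controls, last four are treated
pscores = [0.05, 0.20, 0.35, 0.50, 0.40, 0.60, 0.80, 0.95]
treatment = [0, 0, 0, 0, 1, 1, 1, 1]

print(common_support_bounds(pscores, treatment))  # (0.4, 0.5)
print(on_support(pscores, treatment))
```

Here only two of the eight units fall inside the overlap region, which is the kind of result that should prompt a rethink of the matching design rather than a quiet drop of observations.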

PSM Procedure

# Python
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# Assumes df is a DataFrame with columns 'treatment' (0/1) and 'y',
# and covariates is a list of covariate column names.

# Step 1: Estimate propensity scores
# Note: LogisticRegression applies L2 regularization by default (C=1.0)
lr = LogisticRegression(max_iter=1000)
lr.fit(df[covariates], df['treatment'])
df['pscore'] = lr.predict_proba(df[covariates])[:, 1]

# Step 2: Check common support (overlaid histograms by treatment status)
df.groupby('treatment')['pscore'].plot.hist(alpha=0.5, bins=30)
plt.xlabel('Propensity score')
plt.show()

# Step 3: Match (nearest neighbor, 1:1 with replacement:
# a control unit can be the nearest neighbor of several treated units)
treated = df[df['treatment'] == 1].copy()
control = df[df['treatment'] == 0].copy()

nn = NearestNeighbors(n_neighbors=1)
nn.fit(control[['pscore']])
distances, indices = nn.kneighbors(treated[['pscore']])

matched_control = control.iloc[indices.flatten()].copy()
matched_df = pd.concat([treated, matched_control])  # keep for balance checks

# Step 4: Estimate ATT (treated mean minus matched-control mean)
att = treated['y'].mean() - matched_control['y'].mean()
print(f"ATT: {att:.4f}")

# R (MatchIt)
library(MatchIt)
library(lmtest)    # coeftest()
library(sandwich)  # vcovCL()
match_out <- matchit(treatment ~ x1 + x2 + x3, data = df,
                     method = "nearest", ratio = 1, replace = FALSE)
summary(match_out)

# Covariate balance
plot(match_out, type = "jitter")
plot(summary(match_out))

# Estimate ATT
matched_data <- match.data(match_out)
att_model <- lm(y ~ treatment, data = matched_data, weights = weights)
coeftest(att_model, vcov = vcovCL(att_model, ~subclass))

* Stata (psmatch2 from SSC)
psmatch2 treatment x1 x2 x3, outcome(y) neighbor(1) common
pstest x1 x2 x3
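
The R and Stata templates above report covariate balance. A comparable check in Python can be sketched with standardized mean differences. The helper name and the toy data are hypothetical, and the denominator used here (the square root of the average of the two group variances) is one of several conventions in use.

```python
import math
import statistics


def standardized_mean_difference(treated_values, control_values):
    """(mean_t - mean_c) / sqrt((var_t + var_c) / 2), using sample variances."""
    mean_diff = statistics.mean(treated_values) - statistics.mean(control_values)
    pooled_sd = math.sqrt(
        (statistics.variance(treated_values) + statistics.variance(control_values)) / 2
    )
    return mean_diff / pooled_sd


# Toy covariate values (made up): controls before and after matching
age_treated = [30, 35, 40, 45, 50]
age_control_all = [20, 25, 30, 35, 40]
age_control_matched = [31, 34, 41, 44, 50]

smd_before = standardized_mean_difference(age_treated, age_control_all)
smd_after = standardized_mean_difference(age_treated, age_control_matched)
print(f"SMD before: {smd_before:.3f}, after: {smd_after:.3f}")
```

An absolute SMD below roughly 0.1 is a commonly cited rule of thumb for adequate balance, though the threshold is a convention rather than a formal test.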

Reporting IV Results

Always show first-stage results with F-statistic
Report OLS alongside IV to illustrate endogeneity bias direction
State the exclusion restriction argument explicitly — this cannot be statistically tested
Interpret LATE not ATE: IV estimates are local to compliers (those induced by instrument)
Overidentification test: report Sargan p-value when instruments > endogenous regressors
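
The LATE point in the list above can be illustrated with a small simulation. Everything here is a made-up setup: a randomly assigned binary instrument, no defiers, compliers with a treatment effect of 1.0, and always-takers and never-takers with an effect of 3.0. The Wald ratio (reduced form divided by first stage) is the IV estimate in this single binary instrument case.

```python
import random

rng = random.Random(42)
n = 40000

# Hypothetical population: effect size depends on compliance type
EFFECT = {"complier": 1.0, "always": 3.0, "never": 3.0}
TYPES = ["complier", "always", "never"]

z_vals, d_vals, y_vals, effects = [], [], [], []
for _ in range(n):
    kind = rng.choices(TYPES, weights=[0.5, 0.25, 0.25])[0]
    z = rng.randint(0, 1)  # randomly assigned binary instrument
    if kind == "always":
        d = 1
    elif kind == "never":
        d = 0
    else:
        d = z  # compliers take treatment only when the instrument is on
    y = rng.gauss(0, 1) + EFFECT[kind] * d
    z_vals.append(z)
    d_vals.append(d)
    y_vals.append(y)
    effects.append(EFFECT[kind])


def mean_given_z(values, target):
    """Mean of values among units with instrument value equal to target."""
    selected = [v for v, z in zip(values, z_vals) if z == target]
    return sum(selected) / len(selected)


reduced_form = mean_given_z(y_vals, 1) - mean_given_z(y_vals, 0)
first_stage = mean_given_z(d_vals, 1) - mean_given_z(d_vals, 0)
wald = reduced_form / first_stage
ate = sum(effects) / n  # average effect over the whole population

print(f"Wald (IV) estimate: {wald:.3f}")
print(f"Population ATE:     {ate:.3f}")
```

The Wald estimate lands near the complier effect of 1.0 while the population ATE is near 2.0, so reporting the IV estimate as an ATE would understate the average effect in this setup.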

For weak-instrument robust inference (Anderson-Rubin confidence sets, LIML), control function approach, shift-share (Bartik) instruments, judge/examiner designs, and sensitivity analysis for PSM, see references/iv-reference.md.

Common Pitfalls

Using 2SLS with weak instruments without robust inference: When F < 10, use LIML or Anderson-Rubin confidence sets instead of 2SLS
Not arguing for exclusion restriction: The exclusion restriction cannot be tested statistically — you must make a convincing argument
Confusing LATE with ATE: IV estimates the local average treatment effect for compliers, not the population average
Clustering SE at the wrong level in Bartik IV: With shift-share instruments, inference should account for the exposure shares structure
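Over-identifying without caution: Adding more instruments improves efficiency only if all are valid, and many instruments increase the finite-sample bias of 2SLS toward OLS. A significant Sargan test indicates that at least one instrument is invalid (or that treatment effects differ across instruments)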
Using PSM without checking common support: If treated and control propensity score distributions barely overlap, matching is unreliable