Sensitivity registry for PolicyEngine microsim results — maps {program × deviation signature} to the most likely calibration target or imputed variable driving the mismatch. The knowledge base that powers the calibration-diagnostics agent.
Load when investigating why a microsim result differs from a prior score, or when reviewing whether a reform classification is calibration-sensitive.
Triggers: "why does my reform not match", "policyengine cost off", "calibration mismatch", "imputed variable", "takeup rate", "non-filer", "itemizer share", "small state variance", "ECPS calibration target", "policyengine-us-data target", "calibration dashboard", "diagnose mismatch", "deviation signature".
PolicyEngine Calibration Diagnostics
Converts tribal "I'd check the takeup rate first" knowledge into a structured sensitivity registry. When a /analyze-policy Stage 5 comparison fails, this skill provides the ranked candidate causes.
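A minimal sketch of what such a registry could look like as a data structure. The `Candidate` schema, the signature labels such as `cost_too_high`, and the `prior_rank` ordering are assumptions made for illustration and are not taken from the skill's actual implementation; the entries paraphrase rows from the sensitivity tables in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate cause for a mismatch (hypothetical schema)."""
    calibration_input: str
    direction: str
    prior_rank: int  # 1 = check first; the ranking itself is an assumption


# Hypothetical registry keyed by (program, deviation signature).
# The signature labels are invented for this sketch.
REGISTRY: dict[tuple[str, str], list[Candidate]] = {
    ("eitc", "cost_too_high"): [
        Candidate("Earnings distribution in $5K-$15K band",
                  "More density -> more beneficiaries", 2),
        Candidate("Childless-adult takeup rate",
                  "Higher -> more cost", 1),
    ],
    ("ctc", "poverty_effect_too_small"): [
        Candidate("Non-filer CTC takeup",
                  "Higher -> much more poverty reduction", 1),
    ],
}


def ranked_candidates(program: str, signature: str) -> list[Candidate]:
    """Return candidate causes ordered by prior_rank; empty list if unknown."""
    entries = REGISTRY.get((program.lower(), signature), [])
    return sorted(entries, key=lambda c: c.prior_rank)
```

Returning an empty list for an unknown pair, instead of raising, lets a caller treat "no registry coverage" as a result in its own right.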
When to use
Stage 6 of /analyze-policy — invoked by the calibration-diagnostics agent.
Code review of microsim PRs where the headline number differs from priors.
Designing a new policyengine-us-data calibration target.
Debugging why a state-level run looks volatile.
Top-level architecture
PolicyEngine microsim results depend on three layers:
1. Country model logic (policyengine-us, policyengine-uk, policyengine-canada): formulas, parameters.
2. Behavioral assumptions (takeup rates, labor-supply elasticities): usually parameters, but easy to overlook.
3. Calibrated microdata (policyengine-us-data, e.g. the Enhanced CPS): household weights, calibration targets, imputed variables.
A mismatch is almost always rooted in layer 2 or 3, not layer 1: layer 1 mismatches show up as outright simulation errors, not magnitude drift.
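The triage rule above can be sketched as a small lookup from symptom to the layers worth inspecting first. The `Symptom` enum and the function name are hypothetical; only the rule itself (outright errors point at layer 1, magnitude drift points at layers 2 and 3) comes from the text.

```python
from enum import Enum


class Symptom(Enum):
    """How the mismatch presents itself (hypothetical labels)."""
    SIMULATION_ERROR = "simulation_error"  # the run fails outright
    MAGNITUDE_DRIFT = "magnitude_drift"    # the run succeeds, the number is off


def layers_to_inspect(symptom: Symptom) -> list[int]:
    """Map a symptom to the layer numbers used in the list above."""
    if symptom is Symptom.SIMULATION_ERROR:
        return [1]
    # Magnitude drift: the model logic ran, so look past layer 1.
    return [2, 3]
```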
Sensitivity table — top programs
EITC
| Calibration input | Direction of effect | Notes |
| --- | --- | --- |
| Childless-adult takeup rate | Higher → more cost, more poverty reduction in deciles 1-3 | Historically ~65% childless vs ~80% with kids. If the model uses a uniform takeup rate, expansions are overstated. Verify against policyengine_us/parameters/gov/irs/credits/eitc/takeup.yaml if present. |
| Earnings distribution in $5K-$15K band | More density → more phase-in/plateau region beneficiaries | PUF→CPS imputation is sparse for childless filers here. |
| Tax-unit definition for cohabiting adults | More split units → more eligible filers | Look at tax_unit_count calibration vs SOI Table 1.1. |
| Age cohort 19-24 workers (newly eligible) | Newly eligible under ARPA-style age expansion | Not separately calibrated in ECPS; the marginal population shows up under the generic young-adult bucket. |
| Age cohort 65+ workers (newly eligible) | Eligible under ARPA age-cap removal | Note: this drives meaningful senior poverty reduction (e.g., -5% in live EITC test). The SKILL previously didn't surface this. Senior workers with earnings in the EITC phase-in range are NOT in the typical childless-EITC analytical lens, but they are in the ARPA reform's lens. Calibration: 65+ earners with $5-15K earned income are not a separately calibrated population; expect higher variance. |

Known issue: github.com/PolicyEngine/policyengine-us/issues/4276 (total EITC over-estimate ~9% vs CBO)
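To make the takeup note concrete, here is a rough arithmetic sketch of how much a uniform takeup rate could overstate a childless-only expansion. It assumes cost scales linearly with takeup and that the expansion reaches only childless filers; both are simplifications, and the function name is hypothetical. The two rates are the approximate historical figures quoted in the notes above.

```python
CHILDLESS_TAKEUP = 0.65   # approximate historical rate, childless filers
WITH_KIDS_TAKEUP = 0.80   # approximate historical rate, filers with children


def overstatement(assumed_takeup: float, actual_takeup: float) -> float:
    """Relative over-statement of cost when assumed_takeup replaces actual_takeup.

    Assumes cost = eligible benefit * takeup, so the eligible benefit cancels.
    """
    if actual_takeup <= 0:
        raise ValueError("actual_takeup must be positive")
    return assumed_takeup / actual_takeup - 1.0


# Applying the with-kids rate uniformly to a childless-only expansion:
gap = overstatement(WITH_KIDS_TAKEUP, CHILDLESS_TAKEUP)
print(f"{gap:.1%}")
```

Under these assumptions the gap is roughly 23%, which is large enough to account for a sizeable cost mismatch without any other input being off.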
EITC SKILL coverage note: This row is thinner than the CTC and SALT rows. The diagnostics agent should emit coverage_note: "EITC SKILL row is partial; hypotheses below cover takeup and age-cohort but do NOT cover joint-filer marriage-penalty mechanics, investment-income limit interactions, or self-employment-income imputation. If your deviation signature touches any of these, widen the confidence band."
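One way the agent could attach this note is sketched below. The output dictionary shape, the `widen_confidence_band` flag, and the function name are assumptions for illustration; only the note text and the three uncovered mechanics come from the paragraph above.

```python
# The three mechanics the EITC row does not cover, as listed above.
UNCOVERED_EITC_MECHANICS = (
    "joint-filer marriage-penalty mechanics",
    "investment-income limit interactions",
    "self-employment-income imputation",
)

EITC_COVERAGE_NOTE = (
    "EITC SKILL row is partial; hypotheses below cover takeup and age-cohort "
    "but do NOT cover joint-filer marriage-penalty mechanics, "
    "investment-income limit interactions, or self-employment-income "
    "imputation. If your deviation signature touches any of these, widen "
    "the confidence band."
)


def build_diagnosis(program: str, hypotheses: list[str],
                    touched_mechanics: list[str]) -> dict:
    """Assemble a diagnosis payload (hypothetical schema)."""
    result: dict = {"program": program, "hypotheses": list(hypotheses)}
    if program.lower() == "eitc":
        result["coverage_note"] = EITC_COVERAGE_NOTE
        # Flag the result when the signature touches an uncovered mechanic.
        result["widen_confidence_band"] = any(
            m in UNCOVERED_EITC_MECHANICS for m in touched_mechanics
        )
    return result
```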
CTC
| Calibration input | Direction of effect | Notes |
| --- | --- | --- |
| Non-filer CTC takeup | Higher → much more poverty reduction (largest lever) | ARPA achieved ~90%+ via IRS portal; non-ARPA defaults ~75%. |
| Imputed child age distribution (0-5 vs 6-17) | More 0-5 → more cost (the $3,600 tier) | ECPS imputes; compare to ACS published by year of age. |
| Non-filer share of CPS | Higher → more refundability benefit | CPS undercounts non-filers; addressed via PUF imputation. |
| EITC interaction (refundable ordering) | Incorrect ordering → double-count | Check irs_credits_ordering parameter. |
| SPM threshold + state benefit offsets | State-specific | Few states tax CTC; SSI supplements vary. |
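The child-age lever can be sized with a back-of-envelope sketch. It uses the ARPA maximum credit amounts ($3,600 for ages 0-5, $3,000 for ages 6-17) and ignores phase-outs, takeup, and eligibility rules, so it is an upper-bound illustration only. The function names and the child count in the usage line are hypothetical.

```python
YOUNG_CHILD_CREDIT = 3_600  # ARPA maximum, ages 0-5
OLDER_CHILD_CREDIT = 3_000  # ARPA maximum, ages 6-17


def max_credit_total(n_children: float, share_under_6: float) -> float:
    """Total maximum credit for n_children given the imputed share aged 0-5."""
    if not 0.0 <= share_under_6 <= 1.0:
        raise ValueError("share_under_6 must be between 0 and 1")
    per_child = (share_under_6 * YOUNG_CHILD_CREDIT
                 + (1.0 - share_under_6) * OLDER_CHILD_CREDIT)
    return n_children * per_child


def cost_shift(n_children: float, base_share: float, alt_share: float) -> float:
    """Change in total maximum credit when the 0-5 share moves from base to alt."""
    return (max_credit_total(n_children, alt_share)
            - max_credit_total(n_children, base_share))


# Hypothetical: 1 million children, imputed 0-5 share off by 2 points.
shift = cost_shift(1_000_000, 0.30, 0.32)
```

Each percentage point of imputed 0-5 share moves the maximum credit by $6 per child in the population, which is why comparing the imputed age distribution against ACS is worth doing before blaming takeup.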
SALT cap
| Calibration input | Direction of effect | Notes |
| --- | --- | --- |
| Itemizer share | Higher → cost balloons, benefit cascades down deciles | Post-TCJA only ~10% itemize. Targeted by salt_deduction = $21.247B in loss.py. |
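A quick way to check whether a baseline aggregate sits on its calibration target is a relative-error comparison, sketched below. The target value is the salt_deduction figure quoted above; the 5% tolerance and the function names are assumptions for this sketch and do not reflect how loss.py itself scores targets.

```python
SALT_DEDUCTION_TARGET = 21.247e9  # dollars, the target quoted above


def relative_error(simulated: float, target: float) -> float:
    """Signed relative deviation of a simulated aggregate from its target."""
    if target == 0:
        raise ValueError("target must be non-zero")
    return (simulated - target) / target


def is_off_target(simulated: float, target: float,
                  tolerance: float = 0.05) -> bool:
    """True when the deviation exceeds the (hypothetical) tolerance."""
    return abs(relative_error(simulated, target)) > tolerance
```

If the baseline aggregate is already off target, a SALT cap reform scored on that baseline inherits the error, so this check belongs before any comparison against a prior score.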