| name | ff-statistical-methods |
| description | Expert guidance on statistical analysis methodologies and Monte Carlo simulation for fantasy football. Use this skill when selecting regression approaches, designing simulations, performing variance analysis, or conducting hypothesis tests. Covers regression types (OLS, Ridge, Lasso, GAMs), Monte Carlo frameworks, regression-to-mean analysis, and statistical best practices for player performance modeling. |
Provide expert guidance on statistical methodologies and simulation techniques for fantasy football analytics. Apply appropriate regression methods, design Monte Carlo simulations, perform variance analysis, and conduct hypothesis tests using research-backed approaches.
Trigger this skill for queries involving:
Note: For ML model selection and feature engineering, use ff-ml-modeling. For dynasty strategy domain knowledge, use ff-dynasty-strategy.
Decision Framework:
Linear Regression (OLS): Baseline, interpretability, small samples
Ridge (L2): Multicollinearity, keep all features, shrink coefficients
Lasso (L1): High-dimensional data, automatic feature selection, sparse models
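Elastic Net: Reasonable default for fantasy data (combines the Ridge L2 and Lasso L1 penalties, so it tolerates correlated features while still zeroing out weak ones)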
GAMs: Non-linear relationships (aging curves), interpretable smooth functions
Reference: references/regression_methods.md for detailed comparisons and Python code.
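To make the OLS versus Ridge trade-off concrete, here is a minimal sketch using only NumPy. The feature names (`targets`, `air_yards`), the synthetic data, and the `alpha` value are illustrative assumptions, not values taken from the reference files. It fits both models to two nearly collinear features and shows that the L2 penalty shrinks the slope coefficients.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept, solved via lstsq."""
    Xb = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return coef  # [intercept, slope_1, slope_2, ...]

def fit_ridge(X, y, alpha=1.0):
    """Ridge (L2) fitted on centered data so the intercept is not penalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    n_features = X.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ yc)
    intercept = y_mean - x_mean @ beta
    return np.concatenate([[intercept], beta])

# Synthetic, hypothetical data: air_yards is almost a multiple of targets.
rng = np.random.default_rng(0)
n = 40
targets = rng.normal(8, 2, n)
air_yards = 10 * targets + rng.normal(0, 1, n)
X = np.column_stack([targets, air_yards])
y = 1.5 * targets + rng.normal(0, 2, n)

ols = fit_ols(X, y)
ridge = fit_ridge(X, y, alpha=10.0)

print("OLS slopes:  ", ols[1:])
print("Ridge slopes:", ridge[1:])
```

With collinear inputs the OLS slopes can swing to large offsetting values, while Ridge keeps every feature but pulls the slopes toward zero. In practice `alpha` would be chosen by cross-validation rather than fixed by hand.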
Applications:
Core Approach:
import numpy as np
# Simulate player week: Normal(projection, std_dev)
n_sims = 10_000
simulated_points = np.random.normal(projection, std_dev, size=n_sims)  # draw count is passed via `size`
simulated_points = np.maximum(simulated_points, 0)  # Floor at zero
Key Considerations:
Reference: references/simulation_design.md for frameworks, templates, and best practices.
Asset: assets/monte_carlo_template.py - Python templates for common simulations.
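The core approach above can be wrapped into a small runnable sketch. The function name `simulate_week`, the projection of 14 points, the standard deviation of 6, and the 20-point "boom" threshold are all hypothetical values chosen for illustration. This is not the content of assets/monte_carlo_template.py.

```python
import numpy as np

def simulate_week(projection, std_dev, n_sims=10_000, seed=None):
    """Draw one week of fantasy points from Normal(projection, std_dev), floored at zero."""
    rng = np.random.default_rng(seed)
    points = rng.normal(projection, std_dev, size=n_sims)
    return np.maximum(points, 0.0)

# Hypothetical player: 14-point projection with a 6-point standard deviation.
sims = simulate_week(projection=14.0, std_dev=6.0, seed=42)

# Summarize the full distribution rather than only the mean.
floor, median, ceiling = np.percentile(sims, [10, 50, 90])
p_boom = float(np.mean(sims >= 20.0))  # share of simulations at or above 20 points

print(f"10th/50th/90th percentiles: {floor:.1f} / {median:.1f} / {ceiling:.1f}")
print(f"P(points >= 20): {p_boom:.3f}")
```

Flooring at zero clips the left tail, which nudges the simulated mean slightly above the raw projection. The effect is small unless the standard deviation is large relative to the projection.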
Concept: Extreme values tend toward average in subsequent measurements
Fantasy Application:
Position-Specific Sample Sizes (50% regression):
Implementation:
regression_factor = sample_size / (sample_size + n_50[position])
regressed_estimate = (regression_factor * current_stat) + ((1 - regression_factor) * position_mean)
Reference: references/regression_methods.md section on regression to the mean.
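A minimal sketch of the implementation formula shows how the weight on the observed stat grows with sample size. The `n_50` value of 50, the position mean of 0.05, and the observed rate of 0.10 are hypothetical placeholders, not the position-specific values from the reference file.

```python
def regress_to_mean(current_stat, sample_size, n_50, position_mean):
    """Shrink an observed stat toward the position mean.

    n_50 is the sample size at which the observed stat and the
    position mean receive equal (50/50) weight.
    """
    if sample_size < 0 or n_50 <= 0:
        raise ValueError("sample_size must be >= 0 and n_50 must be > 0")
    regression_factor = sample_size / (sample_size + n_50)
    return regression_factor * current_stat + (1 - regression_factor) * position_mean

N_50 = 50             # hypothetical 50%-regression sample size
POSITION_MEAN = 0.05  # hypothetical position-average rate

# The same observed rate (0.10) at three different sample sizes.
estimates = {n: regress_to_mean(0.10, n, N_50, POSITION_MEAN) for n in (0, 50, 450)}
for n, est in estimates.items():
    print(f"sample_size={n:>3}: regressed estimate = {est:.4f}")
```

With no data the estimate equals the position mean, at `sample_size == n_50` it sits halfway between, and with a large sample it approaches the observed stat.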
Confidence Interval: Uncertainty in estimated mean (narrow)
Prediction Interval: Uncertainty for new observation (wider - use this for player projections!)
Why it matters: Individual player performance has more variability than average performance
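# Prediction interval accounts for both parameter uncertainty AND individual variance
# se_fit = standard error of the fitted mean at the new observation (parameter uncertainty)
margin = t_score * np.sqrt(residual_standard_error**2 + se_fit**2)
lower, upper = prediction - margin, prediction + margin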
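The difference between the two intervals can be seen in a short sketch for simple linear regression with one predictor. The helper name `intervals_at`, the synthetic targets-to-points data, and the 95% level are assumptions made for illustration. The formulas are the standard textbook ones for a single-predictor OLS fit.

```python
import numpy as np
from scipy import stats

def intervals_at(x, y, x_new, level=0.95):
    """Return (prediction, ci_margin, pi_margin) for a one-predictor OLS fit at x_new."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (intercept + slope * x)
    dof = n - 2
    s = np.sqrt(np.sum(residuals ** 2) / dof)  # residual standard error
    sxx = np.sum((x - x.mean()) ** 2)
    leverage = 1.0 / n + (x_new - x.mean()) ** 2 / sxx
    t_score = stats.t.ppf(0.5 + level / 2, dof)
    prediction = intercept + slope * x_new
    ci_margin = t_score * s * np.sqrt(leverage)        # uncertainty in the mean
    pi_margin = t_score * s * np.sqrt(1.0 + leverage)  # adds individual variance
    return prediction, ci_margin, pi_margin

# Synthetic, hypothetical data: fantasy points as a noisy function of targets.
rng = np.random.default_rng(1)
targets = rng.uniform(4, 12, 30)
points = 2 + 1.4 * targets + rng.normal(0, 3, 30)

prediction, ci_margin, pi_margin = intervals_at(targets, points, x_new=9.0)
print(f"prediction: {prediction:.1f}")
print(f"confidence interval: +/- {ci_margin:.1f}")
print(f"prediction interval: +/- {pi_margin:.1f}")
```

The prediction margin is always the larger of the two because of the extra `1.0` under the square root, which represents the variance of a single new observation around the fitted mean.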
When to use: Non-linear relationships like aging curves
How it works: Fit smooth spline for each feature: y = β₀ + f₁(age) + f₂(experience) + ...
Fantasy use cases:
Research finding: GAMs reveal QB peaks at 28-33, RB declines post-27
Python:
from pygam import LinearGAM, s, f
# s() = smooth (non-linear), f() = factor (categorical)
gam = LinearGAM(s(0) + s(1) + f(2)) # age, experience, position
gam.fit(X_train, y_train)
# Visualize smooth curves
gam.partial_dependence(term=0) # Age curve
Reference: references/regression_methods.md section on GAMs with Python and R code.
Step 1: Define Goal
Step 2: Check Data Characteristics
Step 3: Baseline
Step 4: Regularization
Step 5: Non-linearity
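One hedged way to carry out the baseline and non-linearity steps is to compare a straight-line fit with a curved fit on held-out data. The sketch below uses a quadratic polynomial as a simple stand-in for a GAM smooth, only to detect curvature. The synthetic aging data, the 200/100 split, and the 10% improvement threshold are all assumptions.

```python
import numpy as np

# Synthetic, hypothetical aging curve: production peaks near age 27.
rng = np.random.default_rng(7)
age = rng.uniform(21, 35, 300)
points = 15.0 - 0.15 * (age - 27.0) ** 2 + rng.normal(0, 1.5, 300)

# Simple holdout split: first 200 rows train, last 100 rows test.
train, test = slice(0, 200), slice(200, 300)

def holdout_rmse(degree):
    """Fit a polynomial of the given degree on train and score RMSE on test."""
    coefs = np.polyfit(age[train], points[train], degree)
    pred = np.polyval(coefs, age[test])
    return float(np.sqrt(np.mean((points[test] - pred) ** 2)))

rmse_linear = holdout_rmse(1)     # Step 3: linear baseline
rmse_quadratic = holdout_rmse(2)  # Step 5: allow curvature

# Flag non-linearity when the curved fit is at least 10% better out of sample.
nonlinearity_detected = rmse_quadratic < 0.9 * rmse_linear

print(f"linear RMSE:    {rmse_linear:.2f}")
print(f"quadratic RMSE: {rmse_quadratic:.2f}")
print("non-linearity detected:", nonlinearity_detected)
```

If the curved fit wins clearly on held-out data, that is a signal to move to a GAM, which can capture asymmetric shapes that a quadratic cannot.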
Step 1: Define Scenario
Step 2: Gather Inputs
Step 3: Build Simulation
Use assets/monte_carlo_template.py as a starting point
Step 4: Run Simulations
Step 5: Analyze Distribution
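The five steps can be sketched end to end for one scenario, a single weekly matchup. The function `simulate_matchup`, both rosters, and their (projection, std_dev) pairs are hypothetical. For brevity the sketch treats every player as independent, which is a simplifying assumption that a fuller simulation should relax for teammates.

```python
import numpy as np

def simulate_matchup(my_players, opp_players, n_sims=10_000, seed=None):
    """Simulate a weekly matchup; each player is a (projection, std_dev) tuple.

    Assumes independent Normal scores per player, floored at zero.
    """
    rng = np.random.default_rng(seed)

    def team_totals(players):
        proj = np.array([p for p, _ in players], dtype=float)
        sd = np.array([s for _, s in players], dtype=float)
        draws = rng.normal(proj, sd, size=(n_sims, len(players)))
        return np.maximum(draws, 0.0).sum(axis=1)

    mine, theirs = team_totals(my_players), team_totals(opp_players)
    p10, p50, p90 = np.percentile(mine, [10, 50, 90])
    return {
        "win_prob": float(np.mean(mine > theirs)),
        "my_p10": float(p10),
        "my_p50": float(p50),
        "my_p90": float(p90),
    }

# Steps 1-2: scenario and inputs (hypothetical projections and std devs).
my_roster = [(18.0, 7.0), (14.0, 6.0), (11.0, 5.5)]
opp_roster = [(16.0, 6.5), (13.0, 6.0), (10.0, 5.0)]

# Steps 3-5: build, run, and analyze the distribution.
result = simulate_matchup(my_roster, opp_roster, seed=0)
print(result)
```

The returned percentiles describe the spread of my team's total, while `win_prob` is the share of simulations in which that total beats the opponent's.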
Step 1: Identify Extreme Performers
Step 2: Check Sample Size
Step 3: Apply Regression Formula
regression_factor = n / (n + n_50)
regressed = (regression_factor * current) + ((1 - regression_factor) * mean)
Step 4: Identify Buy-Low / Sell-High
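As a sketch of the final step, the regressed estimate can be compared with the current stat to label each player. The function `classify_players`, the player records, the `n_50` of 60, the position mean of 0.05, and the `min_gap` threshold of 0.02 are all hypothetical, and a real threshold would depend on the stat and scoring format.

```python
def classify_players(players, position_mean, n_50, min_gap=0.02):
    """Label players by how far their stat is expected to move after regression.

    players: list of dicts with keys "name", "stat" (observed rate), "n" (sample size).
    Returns a dict mapping name -> (regressed_estimate, label).
    """
    labels = {}
    for p in players:
        regression_factor = p["n"] / (p["n"] + n_50)
        regressed = regression_factor * p["stat"] + (1 - regression_factor) * position_mean
        gap = regressed - p["stat"]  # positive: expected to rise; negative: expected to fall
        if gap >= min_gap:
            label = "buy-low"
        elif gap <= -min_gap:
            label = "sell-high"
        else:
            label = "hold"
        labels[p["name"]] = (regressed, label)
    return labels

# Hypothetical touchdown-per-target rates.
players = [
    {"name": "Player A", "stat": 0.12, "n": 40},   # running hot
    {"name": "Player B", "stat": 0.01, "n": 50},   # running cold
    {"name": "Player C", "stat": 0.055, "n": 80},  # near average
]

labels = classify_players(players, position_mean=0.05, n_50=60)
for name, (regressed, label) in labels.items():
    print(f"{name}: regressed={regressed:.3f} -> {label}")
```

The label reflects only the expected statistical drift. It does not account for role changes or injuries, which need separate judgment.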
For Regression Analysis:
For Monte Carlo Simulation:
For Variance Analysis:
Complement with ff-ml-modeling when:
Complement with ff-dynasty-strategy when:
Start Simple - OLS baseline before complex methods
Regularize for High Dimensions - Use Lasso/Elastic Net when features > samples
Use GAMs for Clear Non-Linearity - Aging curves, experience effects
Model Correlations in Simulations - QB and WRs are correlated (ρ ≈ 0.6)
Sufficient Iterations - 10,000 minimum for stable estimates
Analyze Full Distribution - Percentiles and probabilities, not just means
Validate Assumptions - Plot residuals, check for patterns
Regression to Mean is Powerful - TDs regress, volume is king
Ignoring non-linearity - Age curves aren't linear, use GAMs
Too few simulations - <1,000 gives unstable estimates
Independence assumptions - Teammates are correlated
Flaw of averages - Non-linear outcomes make "average" misleading
Over-interpreting small samples - NFL has only 17 games/season
Forgetting regression to mean - Extreme TDs will regress
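The independence pitfall can be illustrated by simulating a QB and WR pair twice, once with the correlation of roughly 0.6 mentioned above and once with zero correlation. The function `simulate_stack`, the projections, and the standard deviations are hypothetical. The bivariate normal is one convenient modeling choice and not the only way to induce correlation.

```python
import numpy as np

def simulate_stack(qb, wr, rho, n_sims=10_000, seed=None):
    """Draw correlated (QB, WR) scores from a bivariate normal, floored at zero.

    qb and wr are (projection, std_dev) tuples; rho is their correlation.
    """
    (qb_proj, qb_sd), (wr_proj, wr_sd) = qb, wr
    cov = [
        [qb_sd ** 2, rho * qb_sd * wr_sd],
        [rho * qb_sd * wr_sd, wr_sd ** 2],
    ]
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal([qb_proj, wr_proj], cov, size=n_sims)
    return np.maximum(draws, 0.0)

qb, wr = (20.0, 7.0), (14.0, 6.0)  # hypothetical projections and std devs

correlated = simulate_stack(qb, wr, rho=0.6, n_sims=20_000, seed=3)
independent = simulate_stack(qb, wr, rho=0.0, n_sims=20_000, seed=3)

stack_std = correlated.sum(axis=1).std()
indep_std = independent.sum(axis=1).std()
observed_rho = np.corrcoef(correlated[:, 0], correlated[:, 1])[0, 1]

print(f"std of combined score, rho=0.6: {stack_std:.2f}")
print(f"std of combined score, rho=0.0: {indep_std:.2f}")
print(f"observed correlation: {observed_rho:.2f}")
```

The means of the two runs are nearly identical, but the correlated pair has a wider combined distribution. Assuming independence therefore understates both the ceiling and the floor of a stack.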
# Regression
from sklearn.linear_model import Ridge, Lasso, ElasticNet, LinearRegression
import statsmodels.api as sm # For statistical inference
from pygam import LinearGAM, s, f # GAMs
# Simulation
import numpy as np
import scipy.stats
# Analysis
from sklearn.metrics import mean_absolute_error, mean_squared_error
import matplotlib.pyplot as plt
references/regression_methods.md - OLS, Ridge, Lasso, Elastic Net, GAMs, regression to mean, confidence/prediction intervals
references/simulation_design.md - Monte Carlo frameworks, championship probability, trade impact, path dependence, error correlation
assets/monte_carlo_template.py - Python templates for rest-of-season simulation, championship probability, and trade impact analysis