Week 9: Model Assumptions (IP §3b)

What this week covers

This is the most technically detailed week. Use it as a direct reference when completing Section 3b, the assumption-testing portion of every IP. You will learn to test five critical assumptions (covered in Week 5), plus the special case of cointegration for pairs trading. Every test, result, and implication goes in Section 3b.

Section 3b checklist

Copy this checklist into your IP. Complete every row. Report the test, the result, and what you'll do if it fails.

Assumption	Test	Python	Fail condition	If it fails
Normality of residuals	Jarque-Bera, Shapiro-Wilk	`scipy.stats.jarque_bera`	p < 0.05	Use robust methods; report fat-tail risk
Stationarity	ADF, KPSS	`statsmodels.tsa.stattools.adfuller`, `statsmodels.tsa.stattools.kpss`	ADF p > 0.05 or KPSS p < 0.05	Difference series; use returns not prices
No autocorrelation	Ljung-Box	`statsmodels.stats.diagnostic.acorr_ljungbox`	p < 0.05	Use Newey-West HAC standard errors
Homoscedasticity	Breusch-Pagan	`statsmodels.stats.diagnostic.het_breuschpagan`	p < 0.05	Use HC3/HC4 robust SE or GARCH model
No multicollinearity	VIF	`statsmodels.stats.outliers_influence.variance_inflation_factor`	VIF > 10	Drop or combine correlated features

Deep dive: Cointegration (pairs trading)

For pairs strategies, you're not testing stationarity of individual series — you're testing stationarity of the spread (the linear combination).

Engle-Granger cointegration test

Two non-stationary (integrated) series are cointegrated if a linear combination of them is stationary.

Method:

Regress y on x: \(y_t = \alpha + \beta x_t + \varepsilon_t\)
Extract residuals ε
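Test residuals for stationarity (ADF test; because β is estimated, the standard ADF critical values are too lenient, so use Engle-Granger critical values, which `coint` applies automatically)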
If residuals are stationary, the pair is cointegrated

Python:

from statsmodels.tsa.stattools import coint

score, p_value, critical_values = coint(series_a, series_b)
print(f"Cointegration test p-value: {p_value:.4f}")
# p < 0.05 → cointegrated at 5% significance

Johansen test (multiple assets)

For strategies with 3+ assets, use Johansen's cointegration test.

\[ \Delta \mathbf{y}_t = \Pi \mathbf{y}_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta \mathbf{y}_{t-i} + \varepsilon_t \]

Interpretation: The rank of Π determines the number of cointegrating relationships (linearly independent stationary combinations).

Section 3b template with real example

SECTION 3B: MODEL ASSUMPTIONS — Ridge Regression Commodity Strategy

Strategy uses a ridge regression to predict corn returns from 3 factors:

Factor 1: 30-day momentum
Factor 2: 60-day seasonality (deviation from 10-year average)
Factor 3: COT speculator positioning (normalized)

3b.1 Normality of residuals

Test: Jarque-Bera on regression residuals (2020–2023 in-sample)

Result: JB stat = 12.4, p-value = 0.002 → reject normality

Implication: Residuals have fat tails. Daily returns exhibit kurtosis = 4.2 (excess = 1.2). Documented.

Action: Report expected maximum drawdown assuming fat tails. Use robust standard errors (HC3).

3b.2 Stationarity of predictors

Test: ADF on each factor

Momentum (30-day): ADF stat = -8.7, p < 0.001 → stationary ✓
Seasonality (deviation): ADF stat = -9.2, p < 0.001 → stationary ✓
COT positioning (normalized): ADF stat = -6.4, p < 0.001 → stationary ✓

All factors are stationary. No issues.

3b.3 No autocorrelation in residuals

Test: Ljung-Box on residuals, lags 1–20

Result: All p-values > 0.05. No significant autocorrelation detected. ✓

3b.4 Homoscedasticity

Test: Breusch-Pagan heteroscedasticity test

Result: BP stat = 18.3, p = 0.0003 → heteroscedasticity present

Action: Volatility clusters (expected in commodity markets). Use HC3 robust standard errors for inference.

3b.5 No multicollinearity

Test: Variance Inflation Factor on 3 factors

Momentum: VIF = 1.3
Seasonality: VIF = 1.1
COT: VIF = 1.2

All VIF < 2. No multicollinearity concerns. ✓

Summary: Ridge regression suitable. Primary concerns: fat-tail risk (documented), heteroscedasticity (robust SE applied). Model assumptions documented and acceptable for live trading.

Assumption testing code patterns

Normality (Jarque-Bera)

from scipy import stats

jb_stat, jb_p = stats.jarque_bera(residuals)
print(f"Jarque-Bera: stat={jb_stat:.4f}, p={jb_p:.4f}")

if jb_p < 0.05:
    print("Residuals are NOT normally distributed (fat tails likely)")
else:
    print("Residuals are consistent with normality")

Stationarity (ADF)

from statsmodels.tsa.stattools import adfuller

result = adfuller(series, autolag='AIC')
print(f"ADF stat: {result[0]:.4f}, p-value: {result[1]:.4f}")

if result[1] < 0.05:
    print("Series is stationary (reject unit root)")
else:
    print("Series is non-stationary (unit root present)")

Multicollinearity (VIF)

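from statsmodels.stats.outliers_influence import variance_inflation_factor
import statsmodels.api as sm
import pandas as pd

# VIF expects an intercept in the design matrix; add one (assumes X has no
# constant column already), then skip it when reporting.
X_const = sm.add_constant(X)
vif_data = pd.DataFrame()
vif_data["Feature"] = X.columns
vif_data["VIF"] = [
    variance_inflation_factor(X_const.values, i)
    for i in range(1, X_const.shape[1])
]
print(vif_data)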

if (vif_data["VIF"] > 10).any():
    print("Multicollinearity detected. Consider dropping features.")

Common mistakes

Five assumption-testing failures

Skipping Section 3b entirely. It's mandatory. At minimum: test residuals for normality and autocorrelation, test series for stationarity. Report results.
Running ADF but not reporting it in Section 3b. If you ran the test, report it. The p-value, the interpretation, what you'll do if it fails.
Building a multi-factor model without checking VIF. Correlated factors make coefficient estimates unstable: weights can swing in size or flip sign on new data, so a model that works in backtest can fail in live trading. Check VIF before submitting.
Ignoring fat tails. "Close enough to normal" is not a valid analysis. If Jarque-Bera p < 0.05, document the deviation. Report expected extreme drawdown.
Using price levels in regression. Prices are non-stationary. Test with ADF first. If non-stationary, use returns or cointegration.
