Write a hypothesis that is economic, testable, and falsifiable. This IS Section 1 of your Investment Proposal.
A strong hypothesis has three components: (1) an economic mechanism explaining why the inefficiency exists, (2) the predicted relationship between signal and return, and (3) a falsifiability condition. This week teaches you to write all three, and walks through hypothesis types with real examples.
Why does this pattern exist? Not "I found it in the data." Why would market participants allow this inefficiency to persist?
Good mechanisms explain:
What signal predicts what return, in what direction, over what horizon?
Example: "When 30-day rolling soil moisture deviation is more than 1 standard deviation below the 20-year seasonal average, corn futures are expected to have positive returns over the next 5–10 trading days because low soil moisture predicts lower yields and therefore tighter supply, which the market will reprice when USDA yield reports are released."
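The signal in this example can be made concrete with a few lines of Python. This is a minimal sketch, not a prescribed implementation: the function name `moisture_signal`, its parameters, and the sample readings are all hypothetical, and it assumes you have already computed the 20-year seasonal mean and standard deviation for the same calendar window.

```python
from statistics import mean

def moisture_signal(daily_moisture, seasonal_mean, seasonal_std,
                    window=30, threshold=-1.0):
    """Return (z_score, triggered) for the most recent `window` days.

    seasonal_mean and seasonal_std are assumed to be the 20-year
    statistics for the same calendar window (hypothetical inputs).
    """
    if len(daily_moisture) < window:
        raise ValueError("need at least `window` daily observations")
    if seasonal_std <= 0:
        raise ValueError("seasonal_std must be positive")
    rolling = mean(daily_moisture[-window:])      # 30-day rolling average
    z = (rolling - seasonal_mean) / seasonal_std  # deviation in std units
    return z, z < threshold                       # fires when below -1 std

# Made-up readings, for illustration only
readings = [0.20] * 30
z, triggered = moisture_signal(readings, seasonal_mean=0.28, seasonal_std=0.05)
```

Writing the signal this way forces you to pin down the window, the baseline, and the threshold before you look at any returns.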
Specify:
What result would convince you the hypothesis is wrong? If you can't answer this, you don't have a hypothesis.
Example: "This hypothesis is falsified if (a) the Information Coefficient between soil moisture deviation and forward corn returns is less than 0.02 (no predictive power), or (b) the backtest Sharpe on a 2023 hold-out period is less than 0.5 (results don't hold out-of-sample)."
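A falsifiability condition like this one is specific enough to write as a check you commit to before running the backtest. The sketch below assumes the IC and hold-out Sharpe have already been computed elsewhere; the function name `is_falsified` and its default thresholds simply mirror the example above and are otherwise hypothetical.

```python
def is_falsified(ic, holdout_sharpe, min_ic=0.02, min_sharpe=0.5):
    """Apply the two rejection conditions from the example.

    Returns (falsified, reasons). Thresholds default to the values
    quoted in the example; choose your own before seeing results.
    """
    reasons = []
    if ic < min_ic:
        reasons.append("IC below threshold: no predictive power")
    if holdout_sharpe < min_sharpe:
        reasons.append("hold-out Sharpe below threshold: fails out-of-sample")
    return bool(reasons), reasons

# Made-up results, for illustration only
falsified, reasons = is_falsified(ic=0.01, holdout_sharpe=0.8)
```

Fixing the thresholds in code before testing makes it harder to move the goalposts afterwards.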
Good falsifiability conditions are:
Copy this template into Section 1 of your IP. Fill in each section with your hypothesis.
HYPOTHESIS TEMPLATE
Economic mechanism:
[Market participants systematically _____ because _____, which causes prices to _____ .]
Predicted relationship:
[When [signal] is [high/low/rising/falling], [asset] returns are expected to be [positive/negative] over [horizon], because [mechanism above].]
Falsifiability:
[This hypothesis would be rejected if [specific quantitative condition — e.g., Information Coefficient < 0.02, or Sharpe < 0.5 on hold-out period, or directional accuracy < 55% on out-of-sample data].]
Example: Post-earnings drift (momentum)
Economic mechanism: Investors systematically underreact to earnings news. The market reprices gradually over weeks or months, not immediately at announcement. Short-selling constraints and institutional limitations on volatility exposure prevent arbitrageurs from eliminating this drift immediately.
Predicted relationship: When earnings surprise is large and positive (actual EPS > consensus EPS by more than 1 standard deviation), the stock is expected to have positive abnormal returns for 3–6 months post-announcement.
Falsifiability: This hypothesis is rejected if the average post-announcement drift is not significantly different from zero (t-stat < 1.96) over a 3-month hold period, or if the drift reverses (negative returns) in the first week after announcement.
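The two rejection conditions above can be sketched as a simple test. This is illustrative only: the function names and sample numbers are hypothetical, the inputs are assumed to be per-event abnormal returns you have already computed, and 1.96 is a large-sample cutoff, so with few events a t-distribution critical value would be more appropriate.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(returns):
    """One-sample t-statistic of the mean return against zero."""
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two observations")
    s = stdev(returns)  # sample standard deviation
    if s == 0:
        raise ValueError("t-statistic is undefined for constant returns")
    return mean(returns) / (s / sqrt(n))

def drift_rejected(three_month_abnormal, first_week_abnormal, critical=1.96):
    """True if either rejection condition from the example holds."""
    insignificant = t_statistic(three_month_abnormal) < critical
    reversed_early = mean(first_week_abnormal) < 0
    return insignificant or reversed_early

# Made-up per-event abnormal returns, for illustration only
three_month = [0.02, 0.04, 0.03, 0.05, 0.01]
first_week = [0.010, 0.005, 0.000]
t = t_statistic(three_month)
rejected = drift_rejected(three_month, first_week)
```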
Example: Commodity roll yield (contango/backwardation)
Economic mechanism: Commodity futures positions are rolled before expiry. When the term structure is in backwardation (near-month contract trading at a premium to the far-month contract), rolling a long position captures positive yield. This is not an arbitrage; it reflects the physical convenience value of holding inventory. Hedgers (producers and consumers) are willing to pay this premium because holding the physical commodity has value. The premium is persistent because the underlying convenience value doesn't disappear.
Predicted relationship: When the front/second-month spread is in backwardation (front price > second-month price), a rolling long position in the front-month contract is expected to earn positive roll yield equal to the spread minus the cost of financing. Expected return is positive.
Falsifiability: This hypothesis is rejected if average roll yield is zero or negative over a 5-year period, or if roll yield does not persist after transaction costs.
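To test this, you first need a number for roll yield. One common way to express it, shown here as a sketch under stated assumptions rather than a definitive formula, is the front/second-month spread as a fraction of the second-month price, annualized by the time between expiries, minus a financing rate. The function name, the 365-day convention, and the sample prices are all assumptions made for illustration.

```python
def annualized_roll_yield(front_price, second_price,
                          days_between_expiries, financing_rate=0.0):
    """Approximate annualized roll yield net of financing.

    Positive in backwardation (front > second), negative in contango.
    The 365-day annualization is a convention chosen for this sketch.
    """
    if second_price <= 0 or days_between_expiries <= 0:
        raise ValueError("price and day count must be positive")
    spread = (front_price - second_price) / second_price
    return spread * 365.0 / days_between_expiries - financing_rate

# Made-up prices, for illustration only
net_yield = annualized_roll_yield(102.0, 100.0, 30, financing_rate=0.05)
in_backwardation = 102.0 > 100.0
```

Averaging this quantity over the 5-year test window, before and after transaction costs, gives you the figures the falsifiability condition refers to.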
Example: Satellite weather data predicting crop yields
Economic mechanism: NASA satellite data provides daily measurements of soil moisture and temperature at the field level. These measurements predict crop yields. The market prices crops based on USDA forecasts, which are survey-based and released once per month. There is an information gap: satellite data predicts yields before USDA reports them. The market does not instantaneously incorporate satellite data because (a) the data is not in standard market feeds, (b) processing it requires domain expertise, and (c) few market participants have the access or motivation to use it.
Predicted relationship: When growing-season soil moisture anomalies (satellite soil moisture deviation from the 20-year seasonal average) are large and negative, corn prices are expected to rise in the 5–10 trading days preceding the next USDA yield report, capturing the market's repricing toward lower expected yields as the report data becomes known.
Falsifiability: This hypothesis is rejected if the Information Coefficient between satellite moisture deviation and forward corn returns is < 0.02, or if the strategy has negative Sharpe on a 2023 hold-out period.
IC measures the correlation between your signal and forward returns. It's the core metric for evaluating whether your hypothesis has predictive power before you build the full backtest.
\[ IC = \text{Corr}(\text{signal}_t,\ r_{t+h}) \]
Where signal_t is your signal at time t and r_{t+h} is the return from t to t+h (your chosen horizon).
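The definition above can be sketched in plain Python. This is a minimal illustration, assuming a Pearson correlation and simple (not log) returns; the function names and sample data are hypothetical. The key detail is alignment: the signal at time t is paired with the return from t to t+h, so the last h signal values have no forward return and are dropped.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences of 2+ points")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

def forward_returns(prices, h):
    """Simple return from t to t+h for every t that has a t+h price."""
    return [prices[t + h] / prices[t] - 1 for t in range(len(prices) - h)]

def information_coefficient(signal, prices, h):
    """Corr(signal_t, r_{t+h}), with signal and prices on the same dates."""
    if len(signal) != len(prices):
        raise ValueError("signal and prices must cover the same dates")
    fwd = forward_returns(prices, h)
    return pearson(signal[:len(fwd)], fwd)  # drop signals with no forward return

# Made-up data in which the signal lines up with next-day returns
prices = [100.0, 110.0, 99.0, 108.9]
signal = [1.0, -1.0, 1.0, 0.0]
ic = information_coefficient(signal, prices, h=1)  # close to 1 by construction
```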
Interpretation:
IC Information Ratio (ICIR):
\[ ICIR = \frac{\overline{IC}}{\sigma_{IC}} \]
ICIR is the average IC divided by the standard deviation of IC. ICIR > 0.5 suggests a consistently predictive signal.
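Given a series of IC values, for example one per month or per year, ICIR is a one-line calculation. The sketch below assumes the sample standard deviation (n − 1 in the denominator), which the formula above does not specify; the function name and the IC values are made up for illustration.

```python
from statistics import mean, stdev

def icir(ic_series):
    """Mean IC divided by the sample standard deviation of IC."""
    if len(ic_series) < 2:
        raise ValueError("need at least two IC observations")
    s = stdev(ic_series)  # sample std; an assumption of this sketch
    if s == 0:
        raise ValueError("ICIR is undefined when IC never varies")
    return mean(ic_series) / s

# Made-up per-period ICs, for illustration only
period_ics = [0.04, 0.06, 0.02, 0.08, 0.05]
ratio = icir(period_ics)
consistent = ratio > 0.5
```

A high average IC that comes from one or two strong periods will show up here as a low ICIR, which is exactly what the ratio is meant to catch.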