Week 1: What Is a Research Edge?

IP Anchor: Foundation for all four IP sections. Every hypothesis must explain WHY, not just THAT. This week frames the entire research process.

What this week covers

An edge is a systematic, repeatable source of alpha grounded in an economic or behavioral reason. This week separates genuine edges from coincidences, distinguishes between two types of inefficiency, clarifies what QR owns (and doesn't), and explains why the base rate of real signals is so low that you need a hypothesis BEFORE you test.

What "edge" actually means

An edge is not a pattern. An edge is a reason the pattern exists.

If you backtest 100 trading signals that have no real predictive power, roughly 5 will still look significant at p < 0.05 by pure chance. Those are noise, not edges. The difference between noise and signal is whether you can write down the economic cause in one clear sentence.
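The arithmetic behind "roughly 5" can be written out as a quick sketch. It assumes the 100 tests are independent and that none of the signals has any real predictive power; both are simplifying assumptions, since real signals are usually correlated.

\[ E[\text{false positives}] = 100 \times 0.05 = 5, \qquad P(\text{at least one false positive}) = 1 - 0.95^{100} \approx 0.994 \]

Under those assumptions, seeing at least one "significant" result is close to certain, which is why a significant backtest on its own says very little.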

Consider three examples:

Not an edge: "Corn prices go up in July and August." This is an observation. It might be noise. Why does it happen?
An edge: "Growing-season weather anomalies predict corn yield surprises, which are not fully priced until USDA reports are released." This explains the mechanism.
Not an edge: "The S&P 500 has positive momentum in Q4." Observation, but not a mechanism.
An edge: "Investors systematically underreact to earnings news, causing prices to drift in the direction of the surprise over 3–12 months (underreaction effect). Short-selling constraints prevent arbitrageurs from eliminating this." This explains why the pattern exists and why it persists.

The test: If you can't explain in one sentence why the pattern exists, you don't have a hypothesis — you have a coincidence.

Two types of edge

Behavioral Inefficiency

Market participants systematically misbehave. They anchor to historical prices, they have a disposition effect (reluctance to realize losses), they herd. These behavioral biases are well-documented.

Example: Momentum edge. Investors underreact to earnings surprises. Prices drift in the direction of the surprise for months.

Why it degrades: As the pattern becomes well-known, arbitrageurs move in and compress the edge. Once everyone knows about momentum, it's smaller than it was. Behavioral edges are an arms race against the field learning the same thing you did.

Structural Inefficiency

Market structure itself forces suboptimal behavior. Producers must hedge crops every year (can't avoid it). Index funds must rebalance on scheduled dates. Futures contracts require rolling (and the roll carries a cost). These are not behavioral choices — they are structural constraints.

Example: Commodity contango roll yield. When futures are in contango (forward prices higher than spot), rolling a long position means selling the cheaper expiring contract and buying the more expensive next one, so the long position earns a negative roll yield. This is structurally persistent because supply and storage costs are real.
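A minimal sketch of the roll-yield arithmetic, for intuition only. The function name, the simple linear annualization, and the 365-day convention are assumptions made for illustration, not a prescribed desk convention.

```python
def annualized_roll_yield(near_price: float, far_price: float, days_between: int) -> float:
    """Approximate annualized roll yield for a long position rolled from the
    near contract into the far contract.

    Negative in contango (far > near), positive in backwardation (far < near).
    Uses simple linear annualization, which is an illustrative assumption.
    """
    if near_price <= 0 or days_between <= 0:
        raise ValueError("near_price and days_between must be positive")
    # Fraction of the near price gained or lost by moving to the far contract
    raw_yield = (near_price - far_price) / near_price
    return raw_yield * 365.0 / days_between


if __name__ == "__main__":
    # Hypothetical prices: near contract at 100, next contract at 102, 30 days apart
    print(annualized_roll_yield(100.0, 102.0, 30))
```

With the far contract priced above the near one, the result is negative: the long side pays to stay in the position, and that cost does not depend on anyone's behavioral bias.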

Why it persists: Structural inefficiencies are more durable because they don't disappear when the pattern becomes known. The underlying constraint doesn't go away.

Dimension	Behavioral Inefficiency	Structural Inefficiency
Cause	Systematic misbehavior (anchoring, herding, disposition effect)	Market structure constraints (hedging, rebalancing, storage, rolling)
Example	Momentum, post-earnings drift	Roll yield, index rebalancing effects, CTA positioning
Degradation	Compresses as pattern becomes known (arms race)	Persists; constraint doesn't disappear
Arbitrage resistance	Lower (competing hedge funds can attack it)	Higher (can't eliminate the underlying constraint)
Typical Sharpe*	0.8–1.5 (live)	1.0–2.0+ (live)

What QR does and doesn't own

QR Responsibility

Hypothesis: The economic mechanism and predicted relationship (Section 1)
Data: Identifying, sourcing, and cleaning the data; QD approval gate (Section 2)
Methodology: Signal construction, model assumptions, parameter definitions (Section 3)
Backtest: Running the test, reporting results, and validating assumptions (Section 4)

QT Responsibility

Risk sizing: How large a position given your edge and the portfolio? How much capital?
Position limits: What's the maximum per-position limit? Sector concentration?
Portfolio fit: Does this strategy correlate with existing strategies? Does it diversify or concentrate risk?
Liquidity: Can we actually execute this at scale without moving the market?

QD Responsibility

Implementation: Writing the code to execute your signal
Data engineering: Ingesting data, building pipelines, ensuring data quality
Live execution: Orders, fills, market impact, infrastructure

Why the separation? QR documents what should be true. QD implements it. QT manages risk and ensures it fits the portfolio. Written IP only — no code handoff between QR and QD.

Signal vs. noise

The core statistical problem: most patterns are noise. How do you tell the difference?

Fisher's Fundamental Problem (the multiple-comparisons problem): If you run 100 tests, about 5 will produce p-values < 0.05 by chance alone, even if all your hypotheses are false. This is not a mistake; it's mathematics.
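The effect is easy to reproduce. The sketch below generates signals that are pure noise by construction and counts how many pass a one-sided test at p < 0.05. The daily volatility of 1%, the one-year sample, and the normal approximation to the t-test are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def one_sided_p_value(returns):
    """Approximate one-sided p-value for 'mean return > 0' (normal approximation)."""
    n = len(returns)
    t_stat = statistics.mean(returns) / (statistics.stdev(returns) / math.sqrt(n))
    # Upper-tail probability of a standard normal at t_stat
    return 0.5 * math.erfc(t_stat / math.sqrt(2))


def count_false_positives(n_signals=100, n_days=252, alpha=0.05, seed=0):
    """Count how many zero-edge signals look significant purely by chance."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_signals):
        # True mean is exactly zero: there is no edge to find
        returns = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        if one_sided_p_value(returns) < alpha:
            hits += 1
    return hits


if __name__ == "__main__":
    print(count_false_positives())
```

Changing the seed changes which signals pass, but the count stays in the neighborhood of 5 out of 100. None of them is an edge.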

The solution is not to use a lower p-value threshold. The solution is to specify the hypothesis before running the test.

When you look at the data first and then form a hypothesis based on what you see, you are already looking at the result. The test cannot tell you whether this result is real or chance.

What does a real signal look like?

A signal that predicts returns should show up repeatedly across different time periods, subsets of the data, and slightly different model specifications. If your backtest result only appears with one specific set of parameters, on the exact data you optimized on, it's probably noise.
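One simple way to probe the "different time periods" part of this check is to split a strategy's return history into consecutive sub-periods and see whether the average return keeps the same sign. This is a rough sketch, not a complete robustness test; the function names and the default of four splits are hypothetical.

```python
import statistics


def subperiod_means(returns, n_splits=4):
    """Mean return in each of n_splits consecutive, equal-sized sub-periods.

    Observations left over after the last full sub-period are dropped.
    """
    size = len(returns) // n_splits
    if size < 1:
        raise ValueError("not enough observations for the requested splits")
    return [statistics.mean(returns[i * size:(i + 1) * size]) for i in range(n_splits)]


def fraction_positive(returns, n_splits=4):
    """Share of sub-periods whose mean return is positive."""
    means = subperiod_means(returns, n_splits)
    return sum(1 for m in means if m > 0) / len(means)


if __name__ == "__main__":
    # Hypothetical series: profitable in the first half, losing in the second
    series = [0.01, 0.02] * 10 + [-0.01, -0.02] * 10
    print(subperiod_means(series))
    print(fraction_positive(series))
```

A result that is positive in only one or two sub-periods deserves more suspicion than one that shows up, even weakly, in all of them. The same idea extends to data subsets and nearby parameter values.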

Math: The Sharpe Ratio

The Sharpe ratio is the standard metric for risk-adjusted return. It summarizes risk with volatility alone, which is a complete description only if excess returns are roughly normally distributed, and that is rarely true for real trading strategies.

\[ \text{Sharpe} = \frac{E[R_p] - R_f}{\sigma_p} \]

Where: E[R_p] = expected return of the portfolio, R_f = risk-free rate, σ_p = standard deviation (volatility) of the portfolio's returns. A Sharpe ratio computed from daily returns is multiplied by √252 to annualize it.
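As a sketch, the formula translates into a few lines of Python. The function name is hypothetical, and converting the annual risk-free rate to a daily rate by dividing by 252 is a simplifying assumption.

```python
import math
import statistics


def annualized_sharpe(daily_returns, annual_risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from a list of daily portfolio returns."""
    # Simple (non-compounded) conversion of the annual risk-free rate to daily
    daily_rf = annual_risk_free / periods_per_year
    excess = [r - daily_rf for r in daily_returns]
    vol = statistics.stdev(excess)  # sample standard deviation
    if vol == 0:
        raise ValueError("volatility is zero; Sharpe ratio is undefined")
    return statistics.mean(excess) / vol * math.sqrt(periods_per_year)


if __name__ == "__main__":
    # Hypothetical daily returns
    print(annualized_sharpe([0.01, -0.01, 0.02, 0.0]))
```

A handful of observations, as in the usage line, gives a wildly unstable estimate. The calculation is the same on a real history, but the number only means something with a long enough sample.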

The critical insight: A backtest Sharpe is almost always higher than live Sharpe. Why? Because the backtest uses historical data you already know the outcome of. You've implicitly optimized to that exact outcome. The Sharpe ratio you get in backtest is biased upward.
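The upward bias can be illustrated with a small simulation. The sketch below creates many strategies that are pure noise, picks the one with the best in-sample Sharpe, and then looks at that same strategy on fresh data. The strategy count, sample length, and volatility are arbitrary assumptions.

```python
import math
import random
import statistics


def annualized_sharpe(daily_returns, periods_per_year=252):
    """Annualized Sharpe of a daily return series (risk-free rate taken as 0)."""
    return (statistics.mean(daily_returns) / statistics.stdev(daily_returns)
            * math.sqrt(periods_per_year))


def best_of_noise(n_strategies=200, n_days=252, seed=1):
    """Pick the best in-sample Sharpe among zero-edge strategies.

    Returns (in_sample_sharpe, out_of_sample_sharpe) for the selected strategy.
    """
    rng = random.Random(seed)
    best_in_sample = -math.inf
    matching_out_of_sample = 0.0
    for _ in range(n_strategies):
        # Both samples come from the same zero-mean process: no edge exists
        in_sample = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        out_of_sample = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        s = annualized_sharpe(in_sample)
        if s > best_in_sample:
            best_in_sample = s
            matching_out_of_sample = annualized_sharpe(out_of_sample)
    return best_in_sample, matching_out_of_sample


if __name__ == "__main__":
    print(best_of_noise())
```

The in-sample number is the maximum of many noisy estimates, so it is biased upward by construction. The out-of-sample number is a single fresh draw from a strategy whose true Sharpe is zero.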

Chart: Three equity curves

What you're looking at: Three fictional strategies run on the same historical data. Blue is a true edge with consistent upward slope and controlled drawdowns. Orange looks even better — too smooth — likely overfitting. Green is pure noise — random walk with no edge. Watch how equity curves can deceive: the orange curve looks best but is the most dangerous.

Worked example: NasaPowerCouncil

A model hypothesis grounded in economic mechanism:

Economic mechanism: NASA POWER satellite data measures soil moisture and temperature at daily resolution for any location on Earth. Deviations of these measurements from seasonal norms predict crop yields. Corn prices are partially driven by expected yields. USDA yield reports lag the growing season by weeks. If satellite anomalies predict yield surprises before USDA releases them, there is an information edge.

Predicted relationship: When growing-season soil moisture is abnormally high relative to the 20-year seasonal average (weighted by crop development stage), corn futures returns should be negative over the next 5–10 trading days (higher expected yields imply lower prices). When abnormally dry, corn should appreciate.
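A predicted relationship like this can be turned into a signal definition before any backtest is run. The sketch below is one possible construction, not the actual NasaPowerCouncil methodology: the function names, the z-score normalization, and the single `stage_weight` parameter are all hypothetical.

```python
import statistics


def moisture_anomaly_z(current, history):
    """z-score of the current soil-moisture reading against historical readings
    for the same part of the season (for example, 20 prior years)."""
    mu = statistics.mean(history)
    sd = statistics.stdev(history)
    if sd == 0:
        return 0.0
    return (current - mu) / sd


def corn_signal(current, history, stage_weight=1.0):
    """Directional signal for corn futures.

    The negative sign encodes the predicted relationship: abnormally wet
    conditions imply negative expected returns, abnormally dry conditions
    imply positive expected returns. stage_weight is a hypothetical
    crop-development-stage weight.
    """
    return -stage_weight * moisture_anomaly_z(current, history)


if __name__ == "__main__":
    # Hypothetical soil-moisture readings for the same week in prior years
    past = [0.30, 0.32, 0.28, 0.31, 0.29]
    print(corn_signal(0.40, past))  # wetter than normal
    print(corn_signal(0.20, past))  # drier than normal
```

The point of writing it this way is that the sign, the normalization window, and the weighting are fixed by the hypothesis, not chosen after looking at which version backtests best.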

Falsifiability: This hypothesis is falsified if (a) satellite growing degree day (GDD) deviations have an information coefficient (IC) below 0.02 with forward corn returns, or (b) the backtest Sharpe ratio on hold-out 2023 data is below 0.5.
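Falsification criteria are most useful when they are written down as an explicit check. The sketch below assumes the IC is measured as a Spearman rank correlation between the signal and forward returns, which is a common convention but an assumption here. The function names are hypothetical; the thresholds are the ones stated above.

```python
import math


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def information_coefficient(signal, forward_returns):
    """Spearman rank correlation between signal values and forward returns."""
    return pearson(ranks(signal), ranks(forward_returns))


def is_falsified(ic, holdout_sharpe, min_ic=0.02, min_sharpe=0.5):
    """True if either pre-registered criterion fails."""
    return ic < min_ic or holdout_sharpe < min_sharpe


if __name__ == "__main__":
    # Hypothetical inputs
    print(is_falsified(ic=0.01, holdout_sharpe=1.0))
    print(is_falsified(ic=0.05, holdout_sharpe=0.8))
```

Because the thresholds are arguments with defaults fixed in advance, there is no room to quietly move the goalposts after the hold-out result is known.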

Note the difference: this is not "I tested satellite data and it predicts corn." That's result-first (data mining). This is "Here's the economic reason satellite data should predict corn. Here's the signal. Here's how I'll test it."

Common mistakes

Five ways to build noise instead of an edge

Writing the hypothesis after seeing the backtest. This guarantees overfitting. Specify the direction, threshold, and time horizon before you see the data.
Confusing correlation with causation. Just because two things move together doesn't mean one causes the other. The economic mechanism must explain why.
Claiming edge without explaining why it hasn't been arbitraged away. If the edge is obvious, where are the competitors? If they're not there, why not?
Thinking a high backtest Sharpe means you have an edge. A backtest Sharpe of 3.0 more often signals severe overfitting than a real edge. A Sharpe of 1.0–1.5 in hold-out validation is what matters.
Scope creep into QT territory. Don't write about position sizing strategies or how large to make the bet. That's QT's problem. Write the signal and the hypothesis.
