Expectancy and System Evaluation Metrics
Evaluate trading systems beyond expectancy with MAR, Calmar, Sortino, and profit factor, including threshold values for each metric and which metrics to combine for decisions.
Expectancy alone does not evaluate a system. Two systems with identical expectancy can have wildly different drawdowns, recovery profiles, and risk-adjusted returns. A complete evaluation combines expectancy with return-to-risk metrics that expose what expectancy hides.
Expectancy
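Expectancy is the average R-multiple per trade, where 1R is the initial risk on a trade and AvgWin and AvgLoss are both positive sizes measured in R: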
Expectancy = (Win% × AvgWin) − (Loss% × AvgLoss)
A positive expectancy means the system makes money per trade over the long run, but says nothing about how rough the path is. A system with 0.2R expectancy and 40% win rate can draw down severely between winners.
Require expectancy above 0.15R for a tradable system; below that, costs and slippage erase the edge live.
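As a minimal sketch of the formula above, the function below computes expectancy from a list of per-trade R-multiples. The function name `expectancy` and the sample `trades` list are hypothetical, and breakeven trades are counted as zero-size losses, which is one possible convention.

```python
from statistics import mean

def expectancy(r_multiples):
    """Expectancy = (Win% x AvgWin) - (Loss% x AvgLoss), all in R."""
    wins = [r for r in r_multiples if r > 0]
    # Loss sizes as positive numbers; breakeven trades count as 0R losses (assumption).
    losses = [-r for r in r_multiples if r <= 0]
    n = len(r_multiples)
    win_rate = len(wins) / n
    loss_rate = len(losses) / n
    avg_win = mean(wins) if wins else 0.0
    avg_loss = mean(losses) if losses else 0.0
    return win_rate * avg_win - loss_rate * avg_loss

# Hypothetical sample: 4 winners and 6 full 1R losers (40% win rate).
trades = [2.0, -1.0, -1.0, 3.0, -1.0, 1.5, -1.0, -1.0, 2.5, -1.0]
edge = expectancy(trades)
print(f"expectancy: {edge:.2f}R, tradable: {edge > 0.15}")
```

The result equals the plain average of the R-multiples; splitting it into win rate and average sizes just makes the two sides of the edge visible.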
Profit Factor
Profit Factor = Gross Profit / Gross Loss
- Below 1.0: losing system.
- 1.2-1.5: marginal, costs will hurt.
- 1.5-2.0: solid.
- Above 2.5: suspiciously good, suspect overfitting.
Profit factor above 2.0 with a small sample is a red flag, not a green one. Demand 200+ trades before trusting any number above 2.0.
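The ratio and the sample-size caution above can be sketched as follows. The names `profit_factor` and `needs_more_trades` are hypothetical, and the input can be per-trade P&L in currency or in R.

```python
def profit_factor(pnl):
    """Gross profit divided by gross loss (gross loss taken as a positive number)."""
    gross_profit = sum(p for p in pnl if p > 0)
    gross_loss = -sum(p for p in pnl if p < 0)
    if gross_loss == 0:
        return float("inf")  # no losing trades in the sample
    return gross_profit / gross_loss

def needs_more_trades(pf, trade_count, min_trades=200):
    """Flag a profit factor above 2.0 that rests on fewer than min_trades trades."""
    return pf > 2.0 and trade_count < min_trades

# Hypothetical per-trade results in R.
trades = [2.0, -1.0, -1.0, 3.0, -1.0, 1.5, -1.0, -1.0, 2.5, -1.0]
pf = profit_factor(trades)
print(pf, needs_more_trades(pf, len(trades)))
```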
Sharpe and Sortino
Sharpe = (Mean Return − Risk-Free Rate) / Std Dev of Returns
Sharpe treats upside and downside volatility alike, so it penalizes large winners even though they are not risk. Sortino fixes this by dividing by downside deviation only:
Sortino = (Mean Return − Risk-Free Rate) / Downside Std Dev
For asymmetric systems (trend-following with small losses, large winners), Sortino is the more honest measure. Thresholds:
- Sharpe below 0.5: weak.
- Sharpe 0.5-1.0: acceptable.
- Sharpe 1.0-1.5: strong.
- Sharpe above 2.0: question the backtest.
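A minimal sketch of both ratios on a series of per-period returns follows. It rests on several assumptions: the risk-free rate `rf` is given per period, results are annualized by the square root of `periods_per_year`, and downside deviation is taken as the root mean square of negative excess returns over all periods, which is one common convention among several.

```python
import math
from statistics import mean, stdev

def sharpe(returns, rf=0.0, periods_per_year=252):
    """Mean excess return over its sample standard deviation, annualized."""
    excess = [r - rf for r in returns]
    return mean(excess) / stdev(excess) * math.sqrt(periods_per_year)

def sortino(returns, rf=0.0, periods_per_year=252):
    """Mean excess return over downside deviation, annualized."""
    excess = [r - rf for r in returns]
    # Only negative excess returns contribute; positive ones count as zero.
    downside = math.sqrt(mean([min(0.0, e) ** 2 for e in excess]))
    if downside == 0:
        return float("inf")  # no losing periods in the sample
    return mean(excess) / downside * math.sqrt(periods_per_year)

# Hypothetical asymmetric return stream: small losses, large winners.
returns = [0.03, -0.01, 0.04, -0.01, 0.05, -0.01]
print(sharpe(returns), sortino(returns))
```

On a stream like this one, Sortino comes out well above Sharpe because the large winners inflate the standard deviation but not the downside deviation.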
MAR and Calmar
Both measure return per unit of maximum drawdown; they differ only in the lookback window:
MAR = Annualized Return / Max Drawdown (computed over the full track record)
Calmar = Annualized Return / Max Drawdown (computed over the trailing 36 months)
- Below 0.5: poor risk-adjusted return.
- 0.5-1.0: reasonable.
- Above 1.0: strong.
- Above 2.0: rare and worth scrutinizing.
MAR and Calmar are the metrics that matter for capital allocation, because drawdown is what blows up accounts, not volatility.
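The two ratios can be sketched from a month-end equity series as shown below. All function names are hypothetical, the annualized return is assumed to be a compound growth rate, and the input is assumed to hold one equity value per month.

```python
def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst

def annualized_return(equity, periods_per_year=12):
    """Compound annual growth rate over the whole series."""
    years = (len(equity) - 1) / periods_per_year
    return (equity[-1] / equity[0]) ** (1 / years) - 1

def mar(monthly_equity):
    """Annualized return over max drawdown, full history."""
    dd = max_drawdown(monthly_equity)
    if dd == 0:
        return float("inf")
    return annualized_return(monthly_equity) / dd

def calmar(monthly_equity, window_months=36):
    """Same ratio restricted to the trailing window (36 months by default)."""
    return mar(monthly_equity[-(window_months + 1):])

# Hypothetical 12 months of month-end equity (13 points including the start).
equity = [100, 110, 99, 105, 110, 115, 118, 120, 119, 122, 124, 125, 126]
print(mar(equity), calmar(equity))
```

With fewer than 36 months of history the two values coincide, since the trailing window covers the entire series.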
Maximum Drawdown and Recovery
Report max drawdown alongside recovery time: the number of trades or days needed to reach a new equity high. A 20% drawdown recovered in 30 trades is tolerable; one that takes 2 years to recover traps capital. Track the drawdown distribution from Monte Carlo, not just the single backtest drawdown.
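One way to sketch the recovery measurement and the Monte Carlo drawdown distribution is shown below. It assumes trades are resampled with replacement (a bootstrap) and that each trade risks a fixed 1% of current equity; both are illustrative choices, and all names are hypothetical.

```python
import random

def max_drawdown_pct(r_multiples, risk_per_trade=0.01):
    """Worst peak-to-trough decline when each trade risks a fixed equity fraction."""
    equity = peak = 1.0
    worst = 0.0
    for r in r_multiples:
        equity *= 1 + r * risk_per_trade
        peak = max(peak, equity)
        worst = max(worst, 1 - equity / peak)
    return worst

def longest_recovery(r_multiples, risk_per_trade=0.01):
    """Longest run of consecutive trades spent below the prior equity high."""
    equity = peak = 1.0
    current = longest = 0
    for r in r_multiples:
        equity *= 1 + r * risk_per_trade
        if equity >= peak:
            peak = equity
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest

def monte_carlo_drawdowns(r_multiples, runs=2000, seed=42):
    """Sorted max drawdowns from resampled trade sequences."""
    rng = random.Random(seed)  # fixed seed so results are reproducible
    results = []
    for _ in range(runs):
        sample = rng.choices(r_multiples, k=len(r_multiples))
        results.append(max_drawdown_pct(sample))
    return sorted(results)

def percentile(sorted_values, pct):
    """Nearest-rank style percentile from an already sorted list."""
    index = min(len(sorted_values) - 1, int(pct / 100 * len(sorted_values)))
    return sorted_values[index]

# Hypothetical backtest results in R.
trades = [2.0, -1.0, -1.0, 3.0, -1.0, 1.5, -1.0, -1.0, 2.5, -1.0]
drawdowns = monte_carlo_drawdowns(trades)
print("backtest max DD:", max_drawdown_pct(trades))
print("Monte Carlo 95th percentile DD:", percentile(drawdowns, 95))
```

The 95th-percentile figure is normally the one to size against, since the single backtest ordering is only one draw from the distribution.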
Combining Metrics for Decisions
No single metric decides; use a gate sequence:
- Expectancy > 0.15R and profit factor > 1.3; otherwise reject.
- Sharpe > 0.6 (or Sortino > 0.8 for asymmetric systems).
- MAR > 0.5 and Monte Carlo 95th-percentile drawdown below the maximum drawdown you can tolerate.
- Trade count > 200 and out-of-sample performance within 50% of in-sample.
A system passing all four gates is tradable. One failing any gate is suspect, regardless of how good another metric looks.
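The gate sequence can be sketched as a single check that reports which gates fail. The dictionary keys, the `max_tolerable_dd` default of 25%, and the reading of "within 50% of in-sample" as out-of-sample expectancy being at least half of in-sample expectancy are all assumptions made for illustration.

```python
def failed_gates(m, max_tolerable_dd=0.25, asymmetric=False):
    """Return the names of the gates a system fails; an empty list means tradable."""
    gates = {
        "edge": m["expectancy_r"] > 0.15 and m["profit_factor"] > 1.3,
        # Asymmetric systems are judged on Sortino instead of Sharpe.
        "risk_adjusted": m["sortino"] > 0.8 if asymmetric else m["sharpe"] > 0.6,
        "drawdown": m["mar"] > 0.5 and m["mc_dd_95"] < max_tolerable_dd,
        # Assumed reading: out-of-sample keeps at least half the in-sample edge.
        "robustness": m["trades"] > 200
        and m["oos_expectancy_r"] >= 0.5 * m["expectancy_r"],
    }
    return [name for name, passed in gates.items() if not passed]

# Hypothetical metrics for a candidate system.
candidate = {
    "expectancy_r": 0.25, "profit_factor": 1.6,
    "sharpe": 0.9, "sortino": 1.3,
    "mar": 0.8, "mc_dd_95": 0.18,
    "trades": 340, "oos_expectancy_r": 0.17,
}
print(failed_gates(candidate))                     # passes every gate
print(failed_gates({**candidate, "trades": 120}))  # fails on sample size
```

Returning the failed gate names rather than a single boolean keeps the reason for rejection visible, which matters when one strong metric tempts you to overlook a weak one.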
The Trap to Avoid
Optimizing for one metric in isolation is how systems get overfit; a system tuned to maximize Sharpe often parameter-fits to a low-volatility regime. Always evaluate the full metric set together on out-of-sample data, and treat any single exceptional number as a warning, not a selling point.