Trading System Development Lifecycle: Hypothesis to Deployment
A complete development lifecycle from hypothesis through testing, optimization, and live deployment with concrete gates and kill criteria at each stage.
Most failed systems share one cause: skipped stages. A system rushed from idea to live money bypasses the gates that filter out curve-fit garbage. The lifecycle below enforces those gates.
Stage 1: Hypothesis
State the edge in one sentence before touching data. "Trend-following on breakouts works because participants under-react to regime shifts." A hypothesis that cannot be stated plainly is not testable. Define the market, timeframe, entry trigger, exit trigger, and risk per trade on paper.
Stage 2: Backtest
Code the rules mechanically, with no discretion. Run on 8-10 years of data or more for the target market. Record:
- Net profit, max drawdown, Sharpe, and profit factor.
- Trade count: below 100 trades is statistically weak; demand 200+.
- Distribution of returns, not just averages.
Gate: if backtest Sharpe is below 0.5 or max drawdown exceeds 30% of equity, kill the idea here.
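The recorded statistics and the gate can be computed directly from the list of per-trade returns. The following is a minimal sketch, not a definitive implementation. It assumes returns are expressed as fractions of equity, a zero risk-free rate, and a Sharpe ratio annualized by the square root of trades per year. The function names and the `trades_per_year` parameter are illustrative.

```python
import math

def backtest_metrics(trade_returns, trades_per_year):
    """Summary statistics from per-trade returns (fractions of equity)."""
    n = len(trade_returns)
    if n < 2:
        raise ValueError("need at least two trades")
    mean = sum(trade_returns) / n
    var = sum((r - mean) ** 2 for r in trade_returns) / (n - 1)
    std = math.sqrt(var)
    # Assumption: annualize the per-trade Sharpe, risk-free rate taken as zero.
    sharpe = (mean / std) * math.sqrt(trades_per_year) if std > 0 else 0.0

    gross_profit = sum(r for r in trade_returns if r > 0)
    gross_loss = -sum(r for r in trade_returns if r < 0)
    profit_factor = gross_profit / gross_loss if gross_loss > 0 else math.inf

    # Max drawdown on the compounded equity curve, as a fraction of the peak.
    equity, peak, max_dd = 1.0, 1.0, 0.0
    for r in trade_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        max_dd = max(max_dd, (peak - equity) / peak)

    return {
        "trades": n,
        "weak_sample": n < 100,  # below 100 trades is statistically weak
        "sharpe": sharpe,
        "profit_factor": profit_factor,
        "max_drawdown": max_dd,
    }

def passes_backtest_gate(metrics, min_sharpe=0.5, max_drawdown=0.30):
    """Stage 2 gate: kill if Sharpe < 0.5 or max drawdown > 30% of equity."""
    return metrics["sharpe"] >= min_sharpe and metrics["max_drawdown"] <= max_drawdown
```

Calling `passes_backtest_gate(backtest_metrics(returns, trades_per_year=50))` yields a single pass/kill decision. The `weak_sample` flag is reported separately because the trade-count guideline is not part of the gate itself.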
Stage 3: Robustness
Before optimizing, stress-test the baseline:
- Parameter sensitivity: does Sharpe collapse if a parameter shifts 20%? If yes, the edge is fragile.
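- Monte Carlo trade-order shuffle: reshuffle the backtest's trades 1,000 times and recompute max drawdown for each sequence. If the 95th-percentile drawdown exceeds 2x the backtest drawdown, the result is order-dependent: the backtest drawdown relied on a favorable sequence and understates the risk.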
- Out-of-sample test: reserve the most recent 20% of data, never used in development. If out-of-sample Sharpe drops more than 40% from in-sample, the system is overfit.
Gate: kill if robustness tests fail. Optimization cannot rescue a fragile baseline.
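The trade-order shuffle and the out-of-sample comparison above can be sketched as follows. This is an illustration under stated assumptions: per-trade returns are fractions of equity, trades are treated as independent (which is what shuffling implies), and the in-sample Sharpe is positive. Function names are hypothetical. Parameter sensitivity is not covered here because it requires re-running the backtest itself.

```python
import random

def max_drawdown(trade_returns):
    """Largest peak-to-trough loss on the compounded equity curve."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in trade_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst

def shuffled_drawdown_percentile(trade_returns, runs=1000, pct=0.95, seed=0):
    """Percentile of max drawdown across randomly reordered trade sequences."""
    rng = random.Random(seed)  # fixed seed so the check is reproducible
    trades = list(trade_returns)
    drawdowns = []
    for _ in range(runs):
        rng.shuffle(trades)
        drawdowns.append(max_drawdown(trades))
    drawdowns.sort()
    return drawdowns[min(runs - 1, int(pct * runs))]

def robustness_flags(trade_returns, sharpe_in, sharpe_out, runs=1000):
    """True values mean the corresponding Stage 3 test failed."""
    baseline_dd = max_drawdown(trade_returns)
    mc_dd = shuffled_drawdown_percentile(trade_returns, runs=runs)
    return {
        # 95th-percentile shuffled drawdown above 2x the backtest drawdown
        "order_risk": mc_dd > 2.0 * baseline_dd,
        # out-of-sample Sharpe more than 40% below in-sample Sharpe
        "overfit": sharpe_out < 0.6 * sharpe_in,
    }
```

If either flag is true, the gate applies and the idea is killed before any optimization.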
Stage 4: Optimization
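Only now optimize parameters, and only across a narrow range. Use walk-forward analysis: optimize on a rolling in-sample window, test on the next out-of-sample window, roll forward. Require a walk-forward efficiency ratio (annualized out-of-sample profit / annualized in-sample profit) above 50%. Annualizing matters because the in-sample window is usually several times longer than the out-of-sample window, so raw profits are not comparable.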
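A rough sketch of the rolling window schedule and the efficiency ratio is shown below. It assumes data indexed by bar number and normalizes profit per bar so that windows of different length are comparable; that normalization, the function names, and the tuple layout are assumptions for illustration. The optimization step itself is left out.

```python
def walk_forward_windows(n_bars, in_sample, out_sample):
    """Yield (is_start, is_end, oos_end) index triples; ranges are half-open.

    The in-sample window is [is_start, is_end) and the out-of-sample
    window that follows it is [is_end, oos_end).
    """
    start = 0
    while start + in_sample + out_sample <= n_bars:
        is_end = start + in_sample
        yield start, is_end, is_end + out_sample
        start += out_sample  # roll forward by one out-of-sample window

def walk_forward_efficiency(results):
    """results: one (is_profit, is_bars, oos_profit, oos_bars) tuple per window.

    Profits are those of the parameter set chosen on the in-sample window.
    Returns the out-of-sample profit rate divided by the in-sample profit rate.
    """
    is_profit = sum(r[0] for r in results)
    is_bars = sum(r[1] for r in results)
    oos_profit = sum(r[2] for r in results)
    oos_bars = sum(r[3] for r in results)
    if is_profit <= 0 or is_bars == 0 or oos_bars == 0:
        return 0.0  # no in-sample edge to compare against
    return (oos_profit / oos_bars) / (is_profit / is_bars)
```

With 1,000 bars, a 500-bar in-sample window, and a 100-bar out-of-sample window, the schedule produces five windows. A returned ratio above 0.5 meets the requirement stated above.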
Stage 5: Forward Test (Paper)
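Run the system in real time on a demo account, or on a live account at minimum size, for 30-60 trades. Compare live signals to the signals the backtest produces for the same period. Track actual slippage and fills against the backtest's assumptions.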
Gate: kill if live signal generation diverges materially from backtest, indicating lookahead or unrealistic fill logic.
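One way to make the signal comparison concrete is to re-run the backtest over the forward-test period and diff the two signal logs. The sketch below assumes each log is a dictionary mapping an ISO-format timestamp string to a signal label such as "long", "short", or "flat"; that layout and the function names are hypothetical. What counts as a material divergence is left to the reader, since the text does not set a threshold.

```python
def signal_mismatches(live_signals, backtest_signals):
    """Timestamps where the two logs disagree or one log has no entry."""
    timestamps = set(live_signals) | set(backtest_signals)
    return sorted(
        t for t in timestamps
        if live_signals.get(t) != backtest_signals.get(t)
    )

def mismatch_rate(live_signals, backtest_signals):
    """Share of all observed timestamps with a disagreement."""
    timestamps = set(live_signals) | set(backtest_signals)
    if not timestamps:
        return 0.0
    return len(signal_mismatches(live_signals, backtest_signals)) / len(timestamps)
```

Each mismatched timestamp is worth inspecting by hand: a signal present in the backtest but absent live often points to data the backtest could not have had at that moment.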
Stage 6: Small Live
Deploy at 10-25% of intended risk. Run for 50-100 trades. Track live vs backtest Sharpe, win rate, and average R.
Gate: kill or pause if live results fall more than 1 standard deviation below backtest expectations over 50 trades.
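The gate does not say which statistic the standard deviation applies to. The sketch below takes one plausible reading, labeled here as an assumption: compare the live mean R per trade with the backtest mean R, and measure the shortfall in units of the standard deviation of an n-trade mean implied by the backtest (per-trade standard deviation divided by the square root of n). Names and return values are illustrative.

```python
import math

def small_live_gate(live_r, backtest_mean_r, backtest_std_r,
                    min_trades=50, z_limit=1.0):
    """Return "pause" if live mean R is more than z_limit deviations low.

    live_r: realized R multiples of live trades.
    backtest_mean_r, backtest_std_r: per-trade mean and standard
    deviation of R in the backtest.
    """
    if backtest_std_r <= 0:
        raise ValueError("backtest_std_r must be positive")
    n = len(live_r)
    if n < min_trades:
        return "continue"  # not enough trades to judge yet
    live_mean = sum(live_r) / n
    # Assumption: standard deviation of the mean of n independent trades.
    std_of_mean = backtest_std_r / math.sqrt(n)
    z = (live_mean - backtest_mean_r) / std_of_mean
    return "pause" if z < -z_limit else "continue"
```

Under this reading, a backtest with mean 0.3R and standard deviation 1.5R gives a 50-trade deviation of about 0.21R, so a live mean below roughly 0.09R after 50 trades would trip the gate.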
Stage 7: Full Deployment and Monitoring
Scale to full size. Monitor rolling 50-trade Sharpe and drawdown. Define decay triggers in advance:
- Rolling Sharpe drops below 50% of backtest Sharpe for 2 consecutive months.
- Live max drawdown exceeds the 95th-percentile Monte Carlo drawdown from Stage 3.
When a decay trigger fires, reduce size by 50% and investigate. Do not wait for a full drawdown to act.
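The two decay triggers and the size cut can be encoded as a small check run at each month end. This is a sketch under assumptions: the rolling 50-trade Sharpe is sampled once per month and passed in oldest first, drawdowns are fractions of equity, and the function and label names are hypothetical.

```python
def decay_action(monthly_rolling_sharpe, live_max_drawdown,
                 backtest_sharpe, mc_drawdown_95):
    """Return (size_multiplier, fired_triggers) for the current month end."""
    floor = 0.5 * backtest_sharpe
    last_two = monthly_rolling_sharpe[-2:]
    fired = []
    # Trigger 1: rolling Sharpe below 50% of backtest Sharpe for 2 straight months.
    if len(last_two) == 2 and all(s < floor for s in last_two):
        fired.append("sharpe_decay")
    # Trigger 2: live drawdown beyond the 95th-percentile Monte Carlo drawdown.
    if live_max_drawdown > mc_drawdown_95:
        fired.append("drawdown_breach")
    # Any fired trigger halves position size pending investigation.
    size_multiplier = 0.5 if fired else 1.0
    return size_multiplier, fired
```

Returning the list of fired triggers alongside the multiplier keeps a record of why size was cut, which helps the investigation that follows.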
The Discipline
Every stage has a kill criterion. Systems that pass all gates are rare; that is the point. The lifecycle exists to discard the 90% of ideas that do not survive honest testing, not to justify trading every idea you have.