Trading System Development Lifecycle: Hypothesis to Deployment
A complete development lifecycle from hypothesis through testing, optimization, and live deployment with concrete gates and kill criteria at each stage.
Most failed systems share a cause: skipping stages. A system rushed from idea to live money skips the gates that filter out curve-fit garbage. The lifecycle below enforces those gates.
Stage 1: Hypothesis
State the edge in one sentence before touching data. "Trend-following on breakouts works because participants under-react to regime shifts." A hypothesis that cannot be stated plainly is not testable. Define the market, timeframe, entry trigger, exit trigger, and risk per trade on paper.
Stage 2: Backtest
Code the rules mechanically, no discretion. Run on at least 8-10 years of data across the target market. Record:
- Net profit, max drawdown, Sharpe, and profit factor.
- Trade count: below 100 trades is statistically weak; demand 200+.
- Distribution of returns, not just averages.
Gate: if backtest Sharpe is below 0.5 or max drawdown exceeds 30% of equity, kill the idea here.
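The recorded metrics and the gate can be sketched in a few lines. This is a minimal illustration, not a full backtester. It assumes per-trade fractional returns as input, a zero risk-free rate, and Sharpe annualized by a caller-supplied `trades_per_year`. The function names, and the choice to fold the 100-trade minimum into the gate, are assumptions made for the example.

```python
import math

def backtest_metrics(trade_returns, trades_per_year):
    """Summarize per-trade fractional returns (0.01 means +1% of equity)."""
    n = len(trade_returns)
    if n < 2:
        raise ValueError("need at least two trades to estimate dispersion")
    mean = sum(trade_returns) / n
    std = math.sqrt(sum((r - mean) ** 2 for r in trade_returns) / (n - 1))
    # Assumption: risk-free rate of zero, annualized by trade frequency.
    sharpe = (mean / std) * math.sqrt(trades_per_year) if std > 0 else 0.0

    gross_profit = sum(r for r in trade_returns if r > 0)
    gross_loss = -sum(r for r in trade_returns if r < 0)
    profit_factor = gross_profit / gross_loss if gross_loss > 0 else math.inf

    # Max drawdown on the compounded equity curve, as a fraction of peak equity.
    equity, peak, max_dd = 1.0, 1.0, 0.0
    for r in trade_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        max_dd = max(max_dd, (peak - equity) / peak)

    return {
        "trades": n,
        "net_return": equity - 1.0,
        "sharpe": sharpe,
        "profit_factor": profit_factor,
        "max_drawdown": max_dd,
    }

def passes_backtest_gate(m, min_sharpe=0.5, max_dd=0.30, min_trades=100):
    """Kill the idea unless Sharpe, drawdown, and sample size all clear the bar."""
    return (
        m["sharpe"] >= min_sharpe
        and m["max_drawdown"] <= max_dd
        and m["trades"] >= min_trades
    )
```

The dictionary reports averages only. The distribution of returns still needs a separate look, for example a histogram of `trade_returns`.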
Stage 3: Robustness
Before optimizing, stress-test the baseline:
- Parameter sensitivity: does Sharpe collapse if a parameter shifts 20%? If yes, the edge is fragile.
- Monte Carlo trade-order shuffle: recompute drawdown across 1,000 shuffled sequences. If the 95th percentile drawdown exceeds 2x the backtest drawdown, the system is order-dependent and risky.
- Out-of-sample test: reserve the most recent 20% of data, never used in development. If out-of-sample Sharpe drops more than 40% from in-sample, the system is overfit.
Gate: kill if robustness tests fail. Optimization cannot rescue a fragile baseline.
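The trade-order shuffle and the out-of-sample comparison are simple enough to sketch directly. The code below is one possible reading of those two tests. It assumes per-trade fractional returns, compounded equity, and a nearest-rank style percentile. All function names are hypothetical.

```python
import random

def max_drawdown(trade_returns):
    """Largest peak-to-trough loss on the compounded equity curve."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in trade_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst

def shuffled_drawdown_percentile(trade_returns, runs=1000, pct=0.95, seed=0):
    """Drawdown at the given percentile across randomly reordered trade sequences."""
    rng = random.Random(seed)  # fixed seed so the test is repeatable
    trades = list(trade_returns)
    drawdowns = []
    for _ in range(runs):
        rng.shuffle(trades)
        drawdowns.append(max_drawdown(trades))
    drawdowns.sort()
    return drawdowns[min(runs - 1, int(pct * runs))]

def is_order_dependent(trade_returns, multiple=2.0, **kwargs):
    """True if the shuffled 95th-percentile drawdown exceeds 2x the backtest drawdown."""
    p95 = shuffled_drawdown_percentile(trade_returns, **kwargs)
    return p95 > multiple * max_drawdown(trade_returns)

def oos_degradation(in_sample_sharpe, out_sample_sharpe):
    """Fractional Sharpe drop from in-sample to out-of-sample (0.4 means a 40% drop)."""
    return 1.0 - out_sample_sharpe / in_sample_sharpe
```

Shuffling leaves the final equity unchanged and only rearranges the path, which is why it isolates how much of the backtest drawdown came from a lucky ordering of wins and losses. A system fails the out-of-sample test here when `oos_degradation(...)` is above 0.4.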
Stage 4: Optimization
Only now optimize parameters, and only across a narrow range. Use walk-forward analysis: optimize on a rolling in-sample window, test on the next out-of-sample window, then roll forward and repeat. Require the walk-forward efficiency ratio (out-of-sample profit / in-sample profit, with both annualized so that windows of different length are comparable) to stay above 50%.
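The rolling windows and the efficiency ratio can be sketched as follows. This is an illustrative layout only: window sizes are in bars, the step equals the out-of-sample length, and profit is normalized per bar rather than per year, which yields the same ratio. The names `walk_forward_windows` and `walk_forward_efficiency` are hypothetical.

```python
def walk_forward_windows(n_bars, in_sample, out_sample):
    """Yield ((is_start, is_end), (oos_start, oos_end)) index pairs, end-exclusive."""
    start = 0
    while start + in_sample + out_sample <= n_bars:
        split = start + in_sample
        yield (start, split), (split, split + out_sample)
        start += out_sample  # roll forward by one out-of-sample block

def walk_forward_efficiency(window_results):
    """Ratio of out-of-sample to in-sample profit per bar.

    window_results: list of (is_profit, is_bars, oos_profit, oos_bars),
    where is_profit is the profit of the optimized parameters in-sample.
    """
    is_profit = sum(w[0] for w in window_results)
    is_bars = sum(w[1] for w in window_results)
    oos_profit = sum(w[2] for w in window_results)
    oos_bars = sum(w[3] for w in window_results)
    if is_profit <= 0:
        return 0.0  # nothing to be efficient relative to
    return (oos_profit / oos_bars) / (is_profit / is_bars)
```

For each window, the optimizer sees only the in-sample range, and the chosen parameters are then run unchanged on the out-of-sample range. A result of 0.6 from `walk_forward_efficiency` would clear the 50% requirement.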
Stage 5: Forward Test (Paper)
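Run the system in real time on a demo account, or on a live account at minimum size, for 30-60 trades. Compare each live signal with what the backtest produces for the same bars, and track actual slippage and fills against the backtest's assumptions.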
Gate: kill if live signal generation diverges materially from backtest, indicating lookahead or unrealistic fill logic.
Stage 6: Small Live
Deploy at 10-25% of intended risk. Run for 50-100 trades. Track live vs backtest Sharpe, win rate, and average R.
Gate: kill or pause if, over 50 trades, live results fall more than 1 standard deviation below what the backtest predicts for a sample of that size.
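One way to make this gate mechanical is to compare average R per trade. The sketch below rests on an interpretation, not a rule from the text: "1 standard deviation" is taken to mean the standard error of the mean for a sample the size of the live run, estimated from backtest trades. The function names are hypothetical.

```python
import math
from statistics import mean, stdev

def live_shortfall_in_sd(backtest_r, live_r):
    """Live average R minus backtest average R, in standard-error units.

    Assumption: trades are roughly independent, so the mean of len(live_r)
    trades has standard error stdev(backtest_r) / sqrt(len(live_r)).
    """
    standard_error = stdev(backtest_r) / math.sqrt(len(live_r))
    return (mean(live_r) - mean(backtest_r)) / standard_error

def should_pause(backtest_r, live_r, threshold_sd=1.0, min_trades=50):
    """Fire the gate only once the live sample is large enough to judge."""
    if len(live_r) < min_trades:
        return False
    return live_shortfall_in_sd(backtest_r, live_r) < -threshold_sd
```

The same comparison can be repeated for win rate or Sharpe. Average R is used here because it is the simplest of the three tracked figures to compute per trade.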
Stage 7: Full Deployment and Monitoring
Scale to full size. Monitor rolling 50-trade Sharpe and drawdown. Define decay triggers in advance:
- Rolling Sharpe drops below 50% of backtest Sharpe for 2 consecutive months.
- Max drawdown exceeds backtest 95th-percentile Monte Carlo drawdown.
When a decay trigger fires, reduce size by 50% and investigate. Do not wait for a full drawdown to act.
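Because the triggers are defined in advance, they can be written down as code before the system goes live. This minimal sketch assumes the rolling 50-trade Sharpe is sampled once at each month end and passed in as a list, oldest first. The names and signature are illustrative.

```python
def decay_triggered(
    monthly_rolling_sharpe,
    backtest_sharpe,
    live_max_drawdown,
    mc_p95_drawdown,
    months=2,
    floor=0.5,
):
    """True if either predefined decay trigger has fired."""
    recent = monthly_rolling_sharpe[-months:]
    # Trigger 1: rolling Sharpe under 50% of backtest Sharpe for 2 consecutive months.
    sharpe_decay = len(recent) == months and all(
        s < floor * backtest_sharpe for s in recent
    )
    # Trigger 2: live drawdown beyond the Monte Carlo 95th-percentile drawdown.
    drawdown_breach = live_max_drawdown > mc_p95_drawdown
    return sharpe_decay or drawdown_breach

def next_risk_fraction(current_fraction, triggered, cut=0.5):
    """Halve position size when a trigger fires; otherwise leave it unchanged."""
    return current_fraction * cut if triggered else current_fraction
```

For example, `decay_triggered([1.0, 0.4, 0.3], 1.0, 0.05, 0.2)` fires on the Sharpe condition alone, and `next_risk_fraction(1.0, True)` then returns half size.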
The Discipline
Every stage has a kill criterion. Systems that pass all gates are rare; that is the point. The lifecycle exists to discard the 90% of ideas that do not survive honest testing, not to justify trading every idea you have.