Progressive Validator
Stop wasting 3 hours on a backtest that was doomed from the start. This skill implements a multi-stage validation pipeline that can reject a bad strategy in as little as 15 minutes instead of 3 hours.
When to use
- "Validate this strategy"
- "Run the progressive test pipeline"
- "Is this strategy worth a full backtest?"
- When planning the validation sequence for a new strategy variant
The Pipeline
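```
   15 min              30 min              1 hour              3 hours
+-----------+       +-----------+       +-----------+       +-----------+
|   Smoke   | pass  |  Stress   | pass  |  Medium   | pass  |   Full    |
| 3 months  |------>| 5 months  |------>| 18 months |------>|  3 years  |
| DD < 50%  |       | DD < 45%  |       | DD < 42%  |       | DD < 40%  |
+-----------+       +-----------+       +-----------+       +-----------+
      | fail              | fail              | fail              | fail
      v                   v                   v                   v
    REJECT              REJECT              REJECT              REJECT
 (lost 15 min)       (lost 45 min)       (lost 1.5 h)        (lost 3+ h)
```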
Without progressive validation: Every failed strategy costs 3 hours.
With progressive validation: Most failures caught in 15-45 minutes.
Validation Stages
Stage 1: Smoke Test
- Period: 2024-01-01 to 2024-03-31 (3 months)
- Time: ~15-20 minutes
- Threshold: Drawdown < 50%
- Purpose: Catch compilation errors, logic bugs, and catastrophic structural flaws
- What it covers: Q1 2024 (includes major tech rallies)
Stage 2: Stress Test
- Period: 2024-02-01 to 2024-06-30 (5 months)
- Time: ~25-30 minutes
- Threshold: Drawdown < 45%
- Purpose: Test survival during the hardest market conditions
- What it covers: 2024 H1 — historically the worst "meat grinder" period for options strategies
Stage 3: Medium
- Period: 2024-01-01 to 2025-06-30 (18 months)
- Time: ~45-60 minutes
- Threshold: Drawdown < 42%
- Purpose: Validate across bull/bear transitions and seasonal effects
- What it covers: Full 2024 volatility + 2025 early recovery
Stage 4: Full Period
- Period: 2023-01-01 to 2026-01-31 (3 years)
- Time: ~2-3 hours
- Threshold: Drawdown < 40%, Sharpe >= 2.0, Profit >= 300%
- Purpose: Final acceptance test — benchmark against proven strategies
- What it covers: Complete market cycle including 2023 AI rally, 2024 correction, 2025 recovery
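The per-stage thresholds above can be expressed as a small lookup plus a check. This is a minimal sketch, not the code in validator.py: the stage keys, the `STAGE_THRESHOLDS` table, and the `passes_stage` function are illustrative names, and profit is assumed to be a fraction (300% written as 3.0).

```python
from typing import Optional

# Thresholds copied from the stage descriptions; names are hypothetical.
STAGE_THRESHOLDS = {
    "smoke_test": {"max_dd": 0.50},
    "stress_test": {"max_dd": 0.45},
    "medium": {"max_dd": 0.42},
    # Assumption: "Profit >= 300%" is stored as the fraction 3.0.
    "full": {"max_dd": 0.40, "min_sharpe": 2.0, "min_profit": 3.0},
}


def passes_stage(stage: str, drawdown: float,
                 sharpe: Optional[float] = None,
                 profit: Optional[float] = None) -> bool:
    """Return True only if every threshold defined for the stage is met."""
    limits = STAGE_THRESHOLDS[stage]
    # Drawdown must be strictly below the stage cap.
    if drawdown >= limits["max_dd"]:
        return False
    # Sharpe and profit are only checked where the stage defines them;
    # a missing metric counts as a failure for that stage.
    if "min_sharpe" in limits and (sharpe is None or sharpe < limits["min_sharpe"]):
        return False
    if "min_profit" in limits and (profit is None or profit < limits["min_profit"]):
        return False
    return True


print(passes_stage("smoke_test", drawdown=0.32))
print(passes_stage("full", drawdown=0.30, sharpe=1.8, profit=3.2))
```

The first call passes because only drawdown is checked at Smoke; the second fails because the Full stage also requires Sharpe >= 2.0.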
Usage
Configure windows
Define your validation windows in config:
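```python
BACKTEST_WINDOWS = {
    "smoke_test": {
        "start": "2024-01-01",
        "end": "2024-03-31",
        "max_dd": 0.50,
        "expected_time": "15-20 min",
        "purpose": "Quickly eliminate junk strategies",
    },
    "stress_test": {
        "start": "2024-02-01",
        "end": "2024-06-30",
        "max_dd": 0.45,
        "expected_time": "25-30 min",
        "purpose": "Survive the harshest conditions",
    },
    "medium": {
        "start": "2024-01-01",
        "end": "2025-06-30",
        "max_dd": 0.42,
        "expected_time": "45-60 min",
        "purpose": "Stability across bull/bear transitions",
    },
    "full": {
        "start": "2023-01-01",
        "end": "2026-01-31",
        "max_dd": 0.40,
        "expected_time": "2-3 hours",
        "purpose": "Final benchmark acceptance",
    },
}
```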
Run each stage
Prerequisite: This skill coordinates validation stages. Actual backtest submission
is handled by the backtest-poller skill (cli.py). Ensure that skill is installed
and available on your path before running these commands.
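```bash
# Stage 1: Smoke
# (uses the backtest-poller skill's cli.py)
python3 ../backtest-poller/cli.py submit \
  --main-file strategy.py --name M31_smoke

# Check what to run next:
python3 validator.py next M31 strategy.py

# Record the result once Smoke completes:
python3 validator.py record M31 smoke_test --status passed --drawdown 0.32 --sharpe 2.1

# Stage 2: Stress (only if Smoke passed)
python3 ../backtest-poller/cli.py submit \
  --main-file strategy.py --name M31_stress
python3 validator.py record M31 stress_test --status passed --drawdown 0.38 --sharpe 2.0

# Stage 3: Medium (only if Stress passed)
python3 ../backtest-poller/cli.py submit \
  --main-file strategy.py --name M31_medium
python3 validator.py record M31 medium --status passed --drawdown 0.35 --sharpe 2.3

# Stage 4: Full (only if Medium passed)
python3 ../backtest-poller/cli.py submit \
  --main-file strategy.py --name M31_full
python3 validator.py record M31 full --status passed --drawdown 0.30 --sharpe 2.5 --profit 3.2
```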
Skip Rules
Not every change needs to start from Smoke:
| Change Type | Start From |
|---|---|
| Entry logic changed | Smoke (Stage 1) |
| Structural change (position sizing, survival) | Smoke (Stage 1) |
| Profit management only | Medium (Stage 3) |
| Date/parameter tweak | Same stage as before |
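The table can be read as a lookup from change type to starting stage. The sketch below is one way to encode it; the change-type labels (`entry_logic`, `structural`, `profit_management`, `parameter_tweak`), the stage keys, and the `start_stage` function are hypothetical names chosen for illustration, and falling back to Smoke for anything unrecognized is an assumption based on the table listing the only valid exceptions.

```python
# Stage keys in pipeline order; names are illustrative.
STAGE_ORDER = ["smoke_test", "stress_test", "medium", "full"]


def start_stage(change_type: str, last_stage: str = "smoke_test") -> str:
    """Pick the stage a modified strategy should restart from."""
    if change_type == "profit_management":
        # Profit-management-only changes may skip ahead to Medium.
        return "medium"
    if change_type == "parameter_tweak":
        # Date/parameter tweaks rerun the same stage as before.
        if last_stage not in STAGE_ORDER:
            raise ValueError(f"unknown stage: {last_stage!r}")
        return last_stage
    # Entry-logic and structural changes, and (by assumption) any
    # unrecognized change type, start over from Smoke.
    return "smoke_test"


print(start_stage("entry_logic"))
print(start_stage("profit_management"))
print(start_stage("parameter_tweak", last_stage="stress_test"))
```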
Early-Stop Integration
This skill works alongside the backtest-poller skill (a separate package). The
backtest-poller's early-stop feature monitors drawdown in real time and deletes the
backtest run if the threshold is exceeded after 20% progress — no need to wait for
full completion of a doomed run. This validator tracks which stages passed or failed
locally, so you always know where to resume.
Dependency: Install the backtest-poller skill to enable submit/early-stop
functionality. This validator does not submit backtests itself.
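The early-stop rule described above reduces to a two-part condition. This is a sketch of that condition only, not the backtest-poller's implementation: `should_early_stop` and its parameters are hypothetical, and it assumes drawdown readings before 20% progress are ignored and that "exceeded" means strictly greater than the threshold.

```python
def should_early_stop(progress: float, drawdown: float, max_dd: float,
                      min_progress: float = 0.20) -> bool:
    """Return True when a run should be abandoned.

    progress and drawdown are fractions (0.25 means 25%).
    """
    # Assumption: the check only applies once min_progress is reached.
    return progress >= min_progress and drawdown > max_dd


# Simulated (progress, drawdown) readings for a Smoke run with max_dd = 0.50.
samples = [(0.05, 0.55), (0.15, 0.52), (0.25, 0.51), (0.40, 0.30)]
stop_at = next(
    (p for p, dd in samples if should_early_stop(p, dd, max_dd=0.50)),
    None,
)
print(stop_at)  # 0.25: the first reading past 20% progress that exceeds the cap
```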
Time Savings Example
Testing 5 strategy variants, 3 of which are bad:
| Approach | Time |
|---|---|
| Full backtest only | 5 x 3h = 15 hours |
| Progressive validation | 3 x 15min + 1 x 45min + 1 x 3h = ~4.5 hours |
Savings: ~70% of compute time.
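The arithmetic behind the table can be checked directly. The snippet takes the table's per-variant costs at face value (three variants rejected after 15 minutes, one after 45 minutes, one running the 3-hour full backtest); the variable names are illustrative.

```python
# Hours spent when every variant gets a full backtest.
full_only_hours = 5 * 3.0

# Hours spent with progressive validation, using the table's figures.
progressive_hours = 3 * 0.25 + 1 * 0.75 + 1 * 3.0

savings = 1 - progressive_hours / full_only_hours
print(f"{progressive_hours:.1f} h vs {full_only_hours:.1f} h, savings {savings:.0%}")
# 4.5 h vs 15.0 h, savings 70%
```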
Rules
- Never skip stages without justification. The skip rules table above defines the only valid exceptions. If entry logic or survival structure changed, you must start from Smoke.
- A strategy must pass a stage before advancing. Do not promote a strategy to the next stage if the current stage resulted in early-stop or failure.
- Do not modify stage thresholds mid-validation. Changing `max_dd` between stages invalidates the progressive guarantee. Decide thresholds before starting.
- One strategy variant per validation run. Do not change the strategy code between stages; the point is to validate the same code across increasingly demanding windows.
- Record every result, even failures. Use `python3 validator.py record <strategy> <stage> --status passed|failed` to persist outcomes. Unrecorded results break the `next` and `status` commands.
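To see why unrecorded results matter, consider how a next-stage decision could be derived from recorded outcomes. This is a hypothetical sketch, not validator.py's logic: `next_stage`, the `records` mapping, and the stage keys are assumed names.

```python
from typing import Dict, Optional

# Stage keys in pipeline order; names are illustrative.
STAGE_ORDER = ["smoke_test", "stress_test", "medium", "full"]


def next_stage(records: Dict[str, str],
               start: str = "smoke_test") -> Optional[str]:
    """Return the next stage to run, or None if there is nothing to run.

    records maps a stage key to "passed" or "failed". None means either
    every stage from `start` onward passed, or a stage failed and the
    strategy must not advance.
    """
    for stage in STAGE_ORDER[STAGE_ORDER.index(start):]:
        status = records.get(stage)
        if status == "passed":
            continue  # stage cleared, look at the following one
        if status == "failed":
            return None  # rejected: do not promote past a failure
        return stage  # first stage with no recorded outcome
    return None


print(next_stage({"smoke_test": "passed"}))
print(next_stage({"smoke_test": "passed", "stress_test": "failed"}))
print(next_stage({}, start="medium"))
```

With this scheme, a stage that ran but was never recorded looks identical to one that never ran, so the function would suggest repeating it.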
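Progressive Validator
Stop wasting 3 hours on a backtest that was doomed from the start. This skill implements a multi-stage validation pipeline that can eliminate a bad strategy in 15 minutes rather than 3 hours.
When to use
- "Validate this strategy"
- "Run the progressive test pipeline"
- "Is this strategy worth a full backtest?"
- When planning the validation sequence for a new strategy variant
The Pipeline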
```
   15 min              30 min              1 hour              3 hours
+-----------+       +-----------+       +-----------+       +-----------+
|   Smoke   | pass  |  Stress   | pass  |  Medium   | pass  |   Full    |
| 3 months  |------>| 5 months  |------>| 18 months |------>|  3 years  |
| DD < 50%  |       | DD < 45%  |       | DD < 42%  |       | DD < 40%  |
+-----------+       +-----------+       +-----------+       +-----------+
      | fail              | fail              | fail              | fail
      v                   v                   v                   v
    REJECT              REJECT              REJECT              REJECT
 (lost 15 min)       (lost 45 min)       (lost 1.5 h)        (lost 3+ h)
```
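Without progressive validation: Every failed strategy costs 3 hours.
With progressive validation: Most failures are caught in 15-45 minutes.
Validation Stages
Stage 1: Smoke Test
- Period: 2024-01-01 to 2024-03-31 (3 months)
- Time: ~15-20 minutes
- Threshold: Drawdown < 50%
- Purpose: Catch compilation errors, logic bugs, and catastrophic structural flaws
- What it covers: Q1 2024 (includes major tech rallies)
Stage 2: Stress Test
- Period: 2024-02-01 to 2024-06-30 (5 months)
- Time: ~25-30 minutes
- Threshold: Drawdown < 45%
- Purpose: Test survival during the hardest market conditions
- What it covers: 2024 H1, historically the worst "meat grinder" period for options strategies
Stage 3: Medium
- Period: 2024-01-01 to 2025-06-30 (18 months)
- Time: ~45-60 minutes
- Threshold: Drawdown < 42%
- Purpose: Validate behavior across bull/bear transitions and seasonal effects
- What it covers: Full 2024 volatility + 2025 early recovery
Stage 4: Full Period
- Period: 2023-01-01 to 2026-01-31 (3 years)
- Time: ~2-3 hours
- Threshold: Drawdown < 40%, Sharpe >= 2.0, Profit >= 300%
- Purpose: Final acceptance test, benchmarked against proven strategies
- What it covers: Complete market cycle including the 2023 AI rally, 2024 correction, and 2025 recovery
Usage
Configure windows
Define your validation windows in config: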
```python
BACKTEST_WINDOWS = {
    "smoke_test": {
        "start": "2024-01-01",
        "end": "2024-03-31",
        "max_dd": 0.50,
        "expected_time": "15-20 min",
        "purpose": "Quickly eliminate junk strategies",
    },
    "stress_test": {
        "start": "2024-02-01",
        "end": "2024-06-30",
        "max_dd": 0.45,
        "expected_time": "25-30 min",
        "purpose": "Survive the harshest conditions",
    },
    "medium": {
        "start": "2024-01-01",
        "end": "2025-06-30",
        "max_dd": 0.42,
        "expected_time": "45-60 min",
        "purpose": "Stability across bull/bear transitions",
    },
    "full": {
        "start": "2023-01-01",
        "end": "2026-01-31",
        "max_dd": 0.40,
        "expected_time": "2-3 hours",
        "purpose": "Final benchmark acceptance",
    },
}
```
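Run each stage
Prerequisite: This skill coordinates validation stages. Actual backtest submission is handled by the backtest-poller skill (cli.py). Ensure that skill is installed and available on your path before running these commands.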
```bash
# Stage 1: Smoke
# (uses the backtest-poller skill's cli.py)
python3 ../backtest-poller/cli.py submit \
  --main-file strategy.py --name M31_smoke

# Check what to run next:
python3 validator.py next M31 strategy.py

# Record the result once Smoke completes:
python3 validator.py record M31 smoke_test --status passed --drawdown 0.32 --sharpe 2.1

# Stage 2: Stress (only if Smoke passed)
python3 ../backtest-poller/cli.py submit \
  --main-file strategy.py --name M31_stress
python3 validator.py record M31 stress_test --status passed --drawdown 0.38 --sharpe 2.0

# Stage 3: Medium (only if Stress passed)
python3 ../backtest-poller/cli.py submit \
  --main-file strategy.py --name M31_medium
python3 validator.py record M31 medium --status passed --drawdown 0.35 --sharpe 2.3

# Stage 4: Full (only if Medium passed)
python3 ../backtest-poller/cli.py submit \
  --main-file strategy.py --name M31_full
python3 validator.py record M31 full --status passed --drawdown 0.30 --sharpe 2.5 --profit 3.2
```
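Skip Rules
Not every change needs to start from Smoke:
| Change Type | Start From |
|---|---|
| Entry logic changed | Smoke (Stage 1) |
| Structural change (position sizing, survival) | Smoke (Stage 1) |
| Profit management only | Medium (Stage 3) |
| Date/parameter tweak | Same stage as before |
Early-Stop Integration
This skill works alongside the backtest-poller skill (a separate package). The backtest-poller's early-stop feature monitors drawdown in real time and deletes the backtest run if the threshold is exceeded after 20% progress, so there is no need to wait for a doomed run to finish. This validator tracks locally which stages passed or failed, so you always know where to resume.
Dependency: Install the backtest-poller skill to enable submit/early-stop functionality. This validator does not submit backtests itself.
Time Savings Example
Testing 5 strategy variants, 3 of which are bad:
| Approach | Time |
|---|---|
| Full backtest only | 5 x 3h = 15 hours |
| Progressive validation | 3 x 15min + 1 x 45min + 1 x 3h = ~4.5 hours |
Savings: ~70% of compute time.
Rules
- Never skip stages without justification. The skip rules table above defines the only valid exceptions. If entry logic or survival structure changed, you must start from Smoke.
- A strategy must pass the current stage before advancing. If the current stage ended in early-stop or failure, do not promote the strategy to the next stage.
- Do not modify stage thresholds mid-validation. Changing `max_dd` between stages breaks the progressive guarantee. Decide thresholds before starting.
- One strategy variant per validation run. Do not change the strategy code between stages; the point is to validate the same code across increasingly demanding windows.
- Record every result, including failures. Use `python3 validator.py record <strategy> <stage> --status passed|failed` to persist outcomes. Unrecorded results break the `next` and `status` commands.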