feat: add orchestrated backtest pipeline (sweep -> walk-forward -> monte carlo) #177
Conversation
…nte carlo)

When sweep_params are provided, the backtest tool now runs a full analysis pipeline by default (pipeline=true): sweep -> significance gate -> walk-forward validation -> OOS data gate -> monte carlo risk simulation. Each stage reports a StageStatus (completed/skipped/failed) with reasons, designed for frontend rendering of pass/fail gate cards. Users can opt out with pipeline=false to get just the sweep result.

Key changes:
- Extract execute_from_returns() from monte_carlo.rs for in-process reuse
- Add PipelineResponse and StageInfo types for frontend-renderable stages
- Add pipeline.rs orchestrator with significance and OOS data gates
- Add Pipeline variant to BacktestToolResponse enum
- Add MIN_RETURNS_FOR_BOOTSTRAP constant (30) to constants.rs

https://claude.ai/code/session_01DEHwjSk7Y38DhefWeGZCLu
Pull request overview
Adds an orchestrated “backtest pipeline” mode to the backtest tool so that, when sweeps are requested, the server can automatically chain follow-on analysis stages (walk-forward + Monte Carlo) and report per-stage statuses for frontend gate rendering.
Changes:
- Introduces `PipelineResponse`/`StageInfo`/`StageStatus` response types and wires them into `BacktestToolResponse` as a new `pipeline` variant.
- Adds a new `src/tools/pipeline.rs` orchestrator that runs sweep → significance gate → walk-forward → OOS data gate → Monte Carlo.
- Refactors Monte Carlo to support `execute_from_returns()` for reuse by the pipeline; adds a `MIN_RETURNS_FOR_BOOTSTRAP` constant.
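The PR names the response types (`PipelineResponse`, `StageInfo`, `StageStatus`) and a few fields (`name`, `status`, `reason`, `duration_ms`, `suggested_next_steps`) but not their full definitions. A minimal sketch of what frontend-renderable stage reporting could look like, with all other details assumed:

```rust
// Hypothetical sketch of the pipeline response types described in the PR.
// Type and field names come from the PR text; everything else (exact
// types, derives) is an assumption, not the actual src/tools/response_types code.

/// Outcome of one pipeline stage. Per the PR, a gate that fails marks
/// downstream stages Skipped, while an execution error is Failed.
#[derive(Debug, Clone, PartialEq)]
enum StageStatus {
    Completed,
    Skipped,
    Failed,
}

#[derive(Debug, Clone)]
struct StageInfo {
    name: String,
    status: StageStatus,
    /// Why the stage was skipped or failed, for pass/fail gate cards.
    reason: Option<String>,
    duration_ms: u64,
}

#[derive(Debug, Clone)]
struct PipelineResponse {
    stages: Vec<StageInfo>,
    suggested_next_steps: Vec<String>,
}

fn main() {
    let response = PipelineResponse {
        stages: vec![
            StageInfo {
                name: "sweep".to_string(),
                status: StageStatus::Completed,
                reason: None,
                duration_ms: 1_250,
            },
            StageInfo {
                name: "significance_gate".to_string(),
                status: StageStatus::Skipped,
                reason: Some("no combo passed the permutation test".to_string()),
                duration_ms: 0,
            },
        ],
        suggested_next_steps: vec!["loosen sweep ranges and re-run".to_string()],
    };
    for stage in &response.stages {
        println!("{}: {:?} {:?}", stage.name, stage.status, stage.reason);
    }
    println!("next: {:?}", response.suggested_next_steps);
}
```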
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/tools/response_types/pipeline.rs | New pipeline response schema for stage/gate rendering. |
| src/tools/response_types/mod.rs | Exposes the new pipeline response types. |
| src/tools/pipeline.rs | Implements sweep-following orchestration and gating logic. |
| src/tools/monte_carlo.rs | Extracts execute_from_returns() so Monte Carlo can run on derived returns. |
| src/tools/mod.rs | Registers the new pipeline tool module. |
| src/tools/backtest.rs | Adds pipeline param (default true for sweeps) and returns a new pipeline response variant. |
| src/server/mod.rs | Updates tool docs to mention the new pipeline behavior. |
| src/constants.rs | Adds MIN_RETURNS_FOR_BOOTSTRAP constant for Monte Carlo suitability gating. |
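The `MIN_RETURNS_FOR_BOOTSTRAP` constant gates whether the return series is long enough for Monte Carlo bootstrapping. A sketch of how such a suitability gate might look; the constant name and value (30) are from the PR, but the helper function and its signature are illustrative assumptions:

```rust
// Hypothetical sketch of the Monte Carlo suitability gate. Only the
// constant name and value come from the PR (src/constants.rs); the
// helper check_bootstrap_suitability is an assumed illustration.

/// Minimum number of per-bar returns required for bootstrap resampling
/// to be statistically meaningful.
const MIN_RETURNS_FOR_BOOTSTRAP: usize = 30;

/// Returns Err with a renderable reason when the return series is too
/// short to bootstrap, so the pipeline can mark the stage as skipped.
fn check_bootstrap_suitability(returns: &[f64]) -> Result<(), String> {
    if returns.len() < MIN_RETURNS_FOR_BOOTSTRAP {
        Err(format!(
            "only {} returns available; Monte Carlo bootstrap requires at least {}",
            returns.len(),
            MIN_RETURNS_FOR_BOOTSTRAP
        ))
    } else {
        Ok(())
    }
}

fn main() {
    let short_series = vec![0.01_f64; 10];
    match check_bootstrap_suitability(&short_series) {
        Ok(()) => println!("gate passed"),
        Err(reason) => println!("gate failed: {reason}"),
    }
}
```

Using a shared constant instead of a hardcoded `30` keeps the gate and the Monte Carlo tool in agreement, which is exactly the fix a later commit in this PR applies.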
Add POST /runs/pipeline (sync) and POST /tasks/pipeline (async) endpoints that mirror the MCP pipeline tool. Pipeline variant added to TaskKind for task manager tracking. https://claude.ai/code/session_01DEHwjSk7Y38DhefWeGZCLu
Add 4 integration tests covering:
- Significance gate fails → downstream stages skipped
- No permutation test → top combos pass significance gate
- Full pipeline end-to-end with NVDA fixture data
- Pipeline preserves sweep metadata (sweep_id, run_ids)

Also thread script_source and base_params through tools::walk_forward::execute so the pipeline resolves strategy source from the DB store instead of requiring filesystem access.

https://claude.ai/code/session_01DEHwjSk7Y38DhefWeGZCLu
Walk-forward validation is always run as part of the backtest pipeline, not as a standalone MCP tool. REST endpoints (/walk-forward, /tasks/walk-forward) remain available for direct access. https://claude.ai/code/session_01DEHwjSk7Y38DhefWeGZCLu
- Add suggested_next_steps to PipelineResponse (consistent with other tools)
- Fix OOS data gate off-by-one: gate on returns.len() not equity.len()
- Fix has_permutation detection: use multiple_comparisons.is_some()
- Fix JoinError: distinguish panic vs cancellation in error messages
- Use MIN_RETURNS_FOR_BOOTSTRAP constant in monte_carlo.rs (was hardcoded 30)
- Update StageStatus docs to clarify gate-failed vs execution-error semantics
- Update backtest tool doc to mention oos_data_gate stage
- Validate non-empty sweep_params in REST pipeline handlers (400 not 500)
- Propagate DSL transpilation errors instead of swallowing them

https://claude.ai/code/session_01DEHwjSk7Y38DhefWeGZCLu
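The OOS off-by-one fix rests on a simple invariant: an equity curve of N points yields only N-1 returns, so gating on `equity.len()` over-counts by one. A minimal sketch (the function name is illustrative, not the PR's actual helper):

```rust
// Illustrative sketch of why the OOS data gate must count returns, not
// equity points. The helper name returns_from_equity is an assumption.

/// Per-bar simple returns derived from an equity curve: N equity points
/// yield exactly N-1 returns.
fn returns_from_equity(equity: &[f64]) -> Vec<f64> {
    equity
        .windows(2)
        .map(|pair| pair[1] / pair[0] - 1.0)
        .collect()
}

fn main() {
    let equity = vec![100.0, 101.0, 99.0, 102.0];
    let returns = returns_from_equity(&equity);
    // 4 equity points -> 3 returns: a gate checking equity.len() >= 30
    // would pass series that are one return short of the bootstrap minimum.
    println!("{} equity points -> {} returns", equity.len(), returns.len());
}
```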
- Update sweep-only suggested_next_steps to reference backtest(pipeline=true) instead of removed walk_forward tool
- Use objective-aware metric in pipeline summary (not hardcoded Sharpe)
- Uppercase mc_label for consistent symbol casing in MonteCarloResponse

https://claude.ai/code/session_01DEHwjSk7Y38DhefWeGZCLu
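"Objective-aware metric" here means the pipeline summary labels the ranking metric by whatever objective drove the sweep, rather than always saying "Sharpe". One possible shape, with the objective names and helper entirely assumed for illustration:

```rust
// Hypothetical sketch of objective-aware metric labeling for the
// pipeline summary. The objective strings and this helper are
// assumptions; the PR only states that the hardcoded Sharpe label
// was replaced with an objective-derived one.
fn summary_metric_label(objective: &str) -> String {
    match objective {
        "sharpe" => "Sharpe ratio".to_string(),
        "sortino" => "Sortino ratio".to_string(),
        "total_return" => "total return".to_string(),
        "max_drawdown" => "max drawdown".to_string(),
        // Fall back to a readable version of an unknown objective key.
        other => other.replace('_', " "),
    }
}

fn main() {
    println!("best combo by {}", summary_metric_label("sortino"));
}
```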
Previously, walk-forward base_params were derived from SweepResult.params (swept combo values only), which dropped non-swept params like CAPITAL, symbol, profiles, etc. Now run_pipeline accepts the original sweep base_params and passes them through to walk-forward. Test fixtures updated: SweepResult.params now only contains swept keys, with base params passed separately — matching real sweep behavior. https://claude.ai/code/session_01DEHwjSk7Y38DhefWeGZCLu
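The fix described above is a merge with the right precedence: start from the original base params (which carry CAPITAL, symbol, profiles, and other non-swept keys) and overlay only the winning combo's swept values. A simplified sketch using string maps (the real params are presumably JSON values, and the helper name is an assumption):

```rust
// Illustrative sketch of the base_params fix: swept combo values
// override, non-swept keys survive. Uses String maps for simplicity;
// the PR's actual param type and this helper's name are assumptions.
use std::collections::BTreeMap;

fn merge_walk_forward_params(
    base_params: &BTreeMap<String, String>,
    combo_params: &BTreeMap<String, String>,
) -> BTreeMap<String, String> {
    // Clone the full base, then let the swept combo values win.
    let mut merged = base_params.clone();
    for (key, value) in combo_params {
        merged.insert(key.clone(), value.clone());
    }
    merged
}

fn main() {
    let mut base = BTreeMap::new();
    base.insert("CAPITAL".to_string(), "100000".to_string());
    base.insert("FAST".to_string(), "10".to_string());

    // SweepResult.params now only contains swept keys, per the PR.
    let mut combo = BTreeMap::new();
    combo.insert("FAST".to_string(), "20".to_string());

    let merged = merge_walk_forward_params(&base, &combo);
    println!("{merged:?}");
}
```

Deriving walk-forward params from the combo alone (the old behavior) is the degenerate case where `base_params` is empty, which is exactly how CAPITAL and symbol were being dropped.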
Pipeline is now opt-in: set pipeline=true to run the full validation chain (sweep -> walk-forward -> monte carlo). Default behavior is sweep-only, matching the previous behavior before the pipeline feature. https://claude.ai/code/session_01DEHwjSk7Y38DhefWeGZCLu
…est-pipeline-7nYFC

# Conflicts:
#   src/server/handlers/mod.rs
#   src/server/router.rs
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@copilot apply changes based on the comments in this thread
```rust
// Stage 1: Sweep (already completed)
stages.push(StageInfo {
    name: "sweep".to_string(),
    status: StageStatus::Completed,
    reason: None,
    duration_ms: sweep_response.execution_time_ms,
```
stages["sweep"].duration_ms is populated from sweep_response.execution_time_ms, but when num_permutations > 0 the permutation-gate work happens after the sweep completes (in execute_sweep_raw via spawn_blocking) and does not update execution_time_ms. This can materially under-report the sweep stage duration in the pipeline UI when permutations are enabled. Consider returning the total sweep+permutation duration from execute_sweep_raw (or measuring elapsed time around sweep+permutation) and using that for the sweep stage duration, or splitting permutation testing into its own stage with its own duration.
Suggested change — replace:

```rust
// Stage 1: Sweep (already completed)
stages.push(StageInfo {
    name: "sweep".to_string(),
    status: StageStatus::Completed,
    reason: None,
    duration_ms: sweep_response.execution_time_ms,
```

with:

```rust
let num_permutations = base_params
    .get("num_permutations")
    .and_then(Value::as_u64)
    .unwrap_or(0);
let sweep_response_value = serde_json::to_value(&sweep_response).ok();
let total_sweep_duration_ms = sweep_response_value.as_ref().and_then(|value| {
    value
        .get("total_execution_time_ms")
        .and_then(Value::as_u64)
        .and_then(|duration_ms| duration_ms.try_into().ok())
        .or_else(|| {
            value
                .get("permutation_execution_time_ms")
                .and_then(Value::as_u64)
                .and_then(|duration_ms| duration_ms.try_into().ok())
                .map(|permutation_duration_ms| {
                    sweep_response
                        .execution_time_ms
                        .saturating_add(permutation_duration_ms)
                })
        })
});
let sweep_stage_duration_ms =
    total_sweep_duration_ms.unwrap_or(sweep_response.execution_time_ms);
let sweep_stage_reason = if num_permutations > 0 && total_sweep_duration_ms.is_none() {
    Some(
        "Permutation testing was enabled, but the sweep response did not include total sweep+permutation timing; displaying the reported sweep execution time only."
            .to_string(),
    )
} else {
    None
};

// Stage 1: Sweep (already completed)
stages.push(StageInfo {
    name: "sweep".to_string(),
    status: StageStatus::Completed,
    reason: sweep_stage_reason,
    duration_ms: sweep_stage_duration_ms,
```
When sweep_params are provided and pipeline=true, the backtest tool runs
a full analysis pipeline: sweep -> significance gate ->
walk-forward validation -> OOS data gate -> monte carlo risk simulation.
Each stage reports a StageStatus (completed/skipped/failed) with reasons,
designed for frontend rendering of pass/fail gate cards. The default
(pipeline=false) returns just the sweep result, matching the pre-pipeline behavior.
Key changes:
https://claude.ai/code/session_01DEHwjSk7Y38DhefWeGZCLu