Commit d1c3688

Copilot and michaelchu authored
docs: add Architecture & Data Flow guide for strategy exploration sessions (#6)
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: michaelchu <540510+michaelchu@users.noreply.github.com>
1 parent 669792b commit d1c3688

2 files changed

Lines changed: 267 additions & 4 deletions

README.md

Lines changed: 247 additions & 0 deletions
@@ -102,6 +102,253 @@ Once connected via MCP:
3. Screen: `evaluate_strategy({ strategy: "iron_condor", leg_deltas: [...], max_entry_dte: 45, exit_dte: 14, dte_interval: 7, delta_interval: 0.05, slippage: { type: "Mid" } })`
4. Validate: `run_backtest({ strategy: "iron_condor", ..., capital: 100000, quantity: 1, max_positions: 5 })`

## Architecture & Data Flow

This section traces how data moves through the system during a strategy exploration session.

### System Layers

```
┌──────────────────────────────────────────────────────────────┐
│  MCP Client (Claude Desktop, etc.)                           │
│  sends JSON-RPC tool calls via stdio or HTTP                 │
└───────────────────────────┬──────────────────────────────────┘
                            │
                            ▼
┌──────────────────────────────────────────────────────────────┐
│  OptopsyServer (server.rs)                                   │
│  routes tool calls · holds shared DataFrame in RwLock        │
└──────┬──────────┬────────────────┬───────────────┬───────────┘
       │          │                │               │
  load_data  list_strategies  evaluate_strategy  run_backtest /
  (tools/)   (tools/)         (tools/)           compare_strategies
       │                           │             (tools/)
       ▼                           └───────┬───────┘
┌─────────────┐                            ▼
│  data/      │             ┌──────────────────────────────┐
│  cache.rs   │             │  engine/core.rs              │
│  parquet.rs │             │  orchestrates the pipeline   │
└──────┬──────┘             └──┬───────────────────────────┘
       │                       │
  local Parquet          ┌─────┴────────────────────────────┐
  S3 fetch-on-miss       │  strategies/  find_strategy()    │
                         │  engine/filters.rs               │
                         │  engine/evaluation.rs            │
                         │  engine/event_sim.rs             │
                         │  engine/pricing.rs               │
                         │  engine/metrics.rs               │
                         └─────┬────────────────────────────┘
                               │
                               ▼
                      tools/ai_format.rs
                      (enriches result with summary,
                       key findings & suggested next steps)
                               │
                               ▼
                      JSON response → MCP client
```
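
The shared-state arrangement in the diagram (one tool writes the DataFrame, the others read it) can be sketched with the standard library alone. This is a minimal illustration, not the code in `server.rs`: `Frame` is a stand-in for a Polars `DataFrame`, and `SharedData`, `store`, and `snapshot` are hypothetical names.

```rust
use std::sync::{Arc, RwLock};

/// Stand-in for a Polars `DataFrame`; only the row count matters here.
#[derive(Debug, Clone, PartialEq)]
pub struct Frame {
    pub rows: usize,
}

/// Hypothetical handle with the `Arc<RwLock<Option<...>>>` shape.
/// Cloning the handle shares the same underlying slot.
#[derive(Clone, Default)]
pub struct SharedData {
    inner: Arc<RwLock<Option<Frame>>>,
}

impl SharedData {
    /// `load_data` path: take the write lock and replace the stored frame.
    pub fn store(&self, frame: Frame) {
        *self.inner.write().expect("lock poisoned") = Some(frame);
    }

    /// Analysis path: take a read lock and clone the frame out, or report
    /// that nothing has been loaded yet.
    pub fn snapshot(&self) -> Result<Frame, String> {
        self.inner
            .read()
            .expect("lock poisoned")
            .clone()
            .ok_or_else(|| "no data loaded; call load_data first".to_string())
    }
}
```

The `Option` is what lets the analysis tools distinguish "nothing loaded yet" from an empty dataset, and the read lock lets several of them run against the same data concurrently.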

### Step-by-Step: Strategy Exploration Session

#### Step 1: Load Data (`load_data`)

```
Client → load_data({ symbol: "SPY", start_date?, end_date? })
  → CachedStore.load_options("SPY")
      → check ~/.optopsy/cache/options/SPY.parquet
      → if missing and S3 configured: download & cache locally
  → parquet.rs reads Parquet and normalises the date column
      (accepts quote_date / data_date / quote_datetime as Date,
       Datetime, or String; all normalised to quote_datetime)
  → optional date-range filter applied
  → resulting DataFrame stored in server's shared Arc<RwLock<Option<DataFrame>>>
  → returns LoadDataResponse: row count, symbols, date range,
      column list, suggested next steps
```
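
The cache lookup and fetch-on-miss decision above reduces to a small piece of branching. The sketch below assumes the cache layout `<cache_root>/options/<SYMBOL>.parquet` shown in the flow; `LoadPlan`, `cache_path`, and `plan_load` are hypothetical names, and the `cached` flag is passed in (a real caller would likely test `path.exists()`) so the logic stays free of filesystem access.

```rust
use std::path::{Path, PathBuf};

/// What the loader should do for a given symbol (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum LoadPlan {
    /// The Parquet file is already in the local cache.
    ReadLocal(PathBuf),
    /// Not cached, but S3 is configured: download to this path, then read.
    FetchThenRead(PathBuf),
    /// Not cached and no remote source to fall back on.
    Missing,
}

/// Build `<cache_root>/options/<SYMBOL>.parquet`.
pub fn cache_path(cache_root: &Path, symbol: &str) -> PathBuf {
    cache_root.join("options").join(format!("{symbol}.parquet"))
}

/// Decide between a local read, a fetch-on-miss, and an error.
pub fn plan_load(cache_root: &Path, symbol: &str, cached: bool, s3_configured: bool) -> LoadPlan {
    let path = cache_path(cache_root, symbol);
    match (cached, s3_configured) {
        (true, _) => LoadPlan::ReadLocal(path),
        (false, true) => LoadPlan::FetchThenRead(path),
        (false, false) => LoadPlan::Missing,
    }
}
```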

#### Step 2: Browse Strategies (`list_strategies`)

```
Client → list_strategies()
  → strategies::all_strategies() → Vec<StrategyDef>
      each StrategyDef: name, category, description
      each LegDef: side (Long/Short), option_type (Call/Put), qty
  → grouped by category (singles, spreads, butterflies, condors,
      iron, calendars)
  → returns StrategiesResponse with suggested next steps
```
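
As a rough illustration of the grouping step, the sketch below models `StrategyDef` and `LegDef` with only the fields listed above and groups strategy names by category. The field types and the `group_by_category` helper are assumptions made for this example, not the definitions in `engine/types.rs`.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Side {
    Long,
    Short,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum OptionType {
    Call,
    Put,
}

/// One leg of a strategy (field types are assumed).
#[derive(Debug, Clone, PartialEq)]
pub struct LegDef {
    pub side: Side,
    pub option_type: OptionType,
    pub qty: u32,
}

/// Strategy blueprint with the fields named in the flow above.
#[derive(Debug, Clone, PartialEq)]
pub struct StrategyDef {
    pub name: &'static str,
    pub category: &'static str,
    pub description: &'static str,
    pub legs: Vec<LegDef>,
}

/// Group strategy names by category. A BTreeMap keeps the categories in a
/// stable (alphabetical) order; names keep their input order within a group.
pub fn group_by_category(defs: &[StrategyDef]) -> BTreeMap<&'static str, Vec<&'static str>> {
    let mut groups = BTreeMap::new();
    for def in defs {
        groups.entry(def.category).or_insert_with(Vec::new).push(def.name);
    }
    groups
}
```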

#### Step 3: Statistical Screen (`evaluate_strategy`)

This path evaluates *historical* P&L across DTE × delta buckets. It is fast and data-driven, with no capital simulation involved.

```
Client → evaluate_strategy({ strategy, leg_deltas, max_entry_dte,
                             exit_dte, dte_interval, delta_interval,
                             slippage, commission? })

engine/core::evaluate_strategy(df, params):

  1. strategies::find_strategy(name) → StrategyDef

  2. Per leg (repeated for every leg in the strategy):
       a. filters::filter_option_type(df, "call"|"put")
          → keep only rows matching this leg's option type
       b. filters::compute_dte(df)
          → add dte = expiration − quote_datetime (integer days)
       c. filters::filter_dte_range(df, max_entry_dte, exit_dte)
          → keep rows with exit_dte ≤ dte ≤ max_entry_dte
       d. filters::filter_valid_quotes(df)
          → drop rows with zero bid or ask
       e. filters::select_closest_delta(df, target)
          → group by (quote_datetime, expiration)
          → pick the strike whose |delta| is closest to target,
            within [target.min, target.max]
       f. evaluation::match_entry_exit(entries, all_data, exit_dte)
          → for each entry row, find the exit row with the same
            (expiration, strike, option_type) whose quote_datetime
            is closest to (expiration − exit_dte)
          → returns joined DataFrame with entry & exit prices

  3. Join all leg DataFrames on (quote_datetime, expiration)
     → one row per trade opportunity that has all legs filled

  4. rules::filter_strike_order(df, num_legs, strict)
     → enforce ascending strike order across legs
       (skipped for straddles / iron butterflies)

  5. pricing::leg_pnl(...) per row, per leg
     → entry_price = mid | ask | liquidity-adjusted | fixed-per-leg
       (based on chosen Slippage model)
     → exit_price = mid | bid | liquidity-adjusted | fixed-per-leg
     → pnl = (exit_price − entry_price) × side × qty × multiplier
     → commission subtracted (entry + exit)

  6. output::bin_and_aggregate(df, dte_interval, delta_interval)
     → create DTE buckets e.g. [30,37), [37,44) …
     → create delta buckets e.g. [0.15,0.20), [0.20,0.25) …
     → per bucket: mean, std, min, q25, median, q75, max,
       win_rate, profit_factor, count

  → ai_format::format_evaluate()
     → identify best/worst bucket, highest win-rate bucket
     → generate natural-language summary & suggested next steps
  → returns EvaluateResponse with Vec<GroupStats>
```
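
Three of the steps above (2e, 5, and 6) are small enough to sketch on plain structs instead of a DataFrame. Everything here is illustrative: `Quote`, `DeltaTarget`, and the function signatures are assumptions, `side` is modelled as `+1.0` for long and `-1.0` for short, commission is treated as a flat per-leg fee charged once at entry and once at exit, and the bucket `origin` parameter is a guess at how bucket edges such as `[30,37)` are anchored.

```rust
/// One option quote for a single (quote_datetime, expiration) group.
#[derive(Debug, Clone, PartialEq)]
pub struct Quote {
    pub strike: f64,
    pub bid: f64,
    pub ask: f64,
    pub delta: f64,
}

/// Target delta with its allowed range (absolute values).
pub struct DeltaTarget {
    pub target: f64,
    pub min: f64,
    pub max: f64,
}

/// Step 2e: among quotes whose |delta| lies in [min, max], pick the one
/// whose |delta| is closest to the target.
pub fn select_closest_delta<'a>(quotes: &'a [Quote], t: &DeltaTarget) -> Option<&'a Quote> {
    quotes
        .iter()
        .filter(|q| (t.min..=t.max).contains(&q.delta.abs()))
        .min_by(|a, b| {
            let da = (a.delta.abs() - t.target).abs();
            let db = (b.delta.abs() - t.target).abs();
            da.total_cmp(&db)
        })
}

/// Mid price, i.e. the fill assumed by the "Mid" slippage model.
pub fn mid(q: &Quote) -> f64 {
    (q.bid + q.ask) / 2.0
}

/// Step 5: pnl = (exit − entry) × side × qty × multiplier, minus commission
/// for the entry and the exit.
pub fn leg_pnl(entry: f64, exit: f64, side: f64, qty: f64, multiplier: f64, commission: f64) -> f64 {
    (exit - entry) * side * qty * multiplier - 2.0 * commission
}

/// Step 6: half-open DTE bucket [lo, lo + interval) containing `dte`,
/// with bucket edges anchored at `origin`.
pub fn dte_bucket(dte: i32, origin: i32, interval: i32) -> (i32, i32) {
    let lo = origin + (dte - origin).div_euclid(interval) * interval;
    (lo, lo + interval)
}
```

For a short leg sold at 2.00 and bought back at 0.50 with a 100 multiplier and a 0.65 commission, `leg_pnl(2.0, 0.5, -1.0, 1.0, 100.0, 0.65)` gives roughly 148.70: the 150.00 premium decay minus two commissions.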

#### Step 4: Full Simulation (`run_backtest`)

This path runs a realistic, capital-constrained, event-driven backtest.

```
Client → run_backtest({ strategy, leg_deltas, max_entry_dte,
                        exit_dte, slippage, commission?,
                        stop_loss?, take_profit?, max_hold_days?,
                        capital, quantity, multiplier?, max_positions,
                        selector? })

engine/core::run_backtest(df, params):

  1. strategies::find_strategy(name) → StrategyDef

  2. event_sim::build_price_table(df)
     → iterates every row of the DataFrame once
     → builds HashMap<(date, expiration, strike, OptionType),
                      QuoteSnapshot{bid, ask, delta}>
     → also collects sorted Vec<NaiveDate> of all trading days

  3. event_sim::find_entry_candidates(df, strategy_def, params)
     → applies the same per-leg filter chain as evaluate_strategy
       (filter_option_type → compute_dte → filter_dte_range →
        filter_valid_quotes → select_closest_delta)
     → joins legs, enforces strike order, computes net_premium
     → returns Vec<EntryCandidate> (one per entry date × expiration)

  4. event_sim::run_event_loop(price_table, candidates,
                               trading_days, params, strategy_def)
     → iterates day-by-day over trading_days:

       OPEN PHASE:
         • find candidates with entry_date == today
         • skip if positions ≥ max_positions
         • apply TradeSelector (Nearest DTE, HighestPremium,
           LowestPremium, or First)
         • create Position from EntryCandidate; charge entry cost

       CLOSE CHECK (for every open position):
         • look up today's price in PriceTable for each leg
         • compute current_value = Σ leg current prices × side × qty
         • check exit conditions in priority order:
             – DTE exit: dte ≤ exit_dte → ExitType::DteExit
             – Stop loss: loss > stop_loss × |entry_cost|
                 → ExitType::StopLoss
             – Take profit: gain > take_profit × |entry_cost|
                 → ExitType::TakeProfit
             – Max hold: days_held ≥ max_hold_days
                 → ExitType::MaxHold
             – Expiration: today ≥ expiration → ExitType::Expiration

       EQUITY UPDATE (every day):
         • realized_pnl = sum of all closed trades
         • unrealized_pnl = Σ (current_value − entry_cost) for open positions
         • equity = capital + realized_pnl + unrealized_pnl
         • appended to equity_curve as EquityPoint{datetime, equity}

     → returns (Vec<TradeRecord>, Vec<EquityPoint>)

  5. metrics::calculate_metrics(equity_curve, trade_log, capital)
     → daily returns series from equity_curve
     → Sharpe ratio (annualised, rf=0)
     → Sortino ratio (downside deviation only)
     → max drawdown (peak-to-trough)
     → Calmar ratio (CAGR / max drawdown)
     → VaR 95% (5th percentile of daily returns)
     → CAGR (compound annual growth rate)
     → win rate, profit factor
     → avg P&L, avg winner, avg loser, avg days held
     → max consecutive losses, expectancy

  → ai_format::format_backtest()
     → trade summary (exit breakdown, best/worst trade)
     → equity curve summary (start/end equity, peak, trough)
     → sampled equity curve (≤50 points for compact transmission)
     → natural-language assessment of Sharpe quality
     → key findings & suggested next steps
  → returns BacktestResponse
```
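
The close check is essentially a first-match-wins chain, which the sketch below writes out in the priority order listed above, together with the daily equity formula. `ExitRules`, `PositionState`, `check_exit`, and `equity` are hypothetical names; the example also assumes `entry_cost` and `current_value` share a sign convention so that `current_value - entry_cost` is the position's P&L, and that expiration can be expressed as `dte <= 0`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExitType {
    DteExit,
    StopLoss,
    TakeProfit,
    MaxHold,
    Expiration,
}

/// Exit parameters; the optional ones mirror `stop_loss?`, `take_profit?`
/// and `max_hold_days?` in the tool call.
pub struct ExitRules {
    pub exit_dte: i64,
    pub stop_loss: Option<f64>,
    pub take_profit: Option<f64>,
    pub max_hold_days: Option<i64>,
}

/// Per-position values the event loop would compute for the current day.
pub struct PositionState {
    pub dte: i64,
    pub days_held: i64,
    pub entry_cost: f64,
    pub current_value: f64,
}

/// Return the first exit condition that fires, in priority order.
pub fn check_exit(p: &PositionState, r: &ExitRules) -> Option<ExitType> {
    let pnl = p.current_value - p.entry_cost;
    let basis = p.entry_cost.abs();
    if p.dte <= r.exit_dte {
        return Some(ExitType::DteExit);
    }
    if matches!(r.stop_loss, Some(sl) if -pnl > sl * basis) {
        return Some(ExitType::StopLoss);
    }
    if matches!(r.take_profit, Some(tp) if pnl > tp * basis) {
        return Some(ExitType::TakeProfit);
    }
    if matches!(r.max_hold_days, Some(max) if p.days_held >= max) {
        return Some(ExitType::MaxHold);
    }
    if p.dte <= 0 {
        return Some(ExitType::Expiration);
    }
    None
}

/// Daily equity: starting capital plus realized and unrealized P&L.
pub fn equity(capital: f64, realized_pnl: f64, unrealized_pnl: f64) -> f64 {
    capital + realized_pnl + unrealized_pnl
}
```

Because the DTE exit is checked first, a position that is both past its exit DTE and through its stop is recorded as a `DteExit`, which matters when reading the exit breakdown in the trade summary.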

#### Step 5: Strategy Comparison (`compare_strategies`)

```
Client → compare_strategies({ strategies: [CompareEntry, ...],
                              sim_params })
  → for each CompareEntry:
      → assembles BacktestParams (entry params + shared sim_params)
      → calls run_backtest() (full pipeline above)
      → collects CompareResult: strategy, trades, pnl, sharpe,
        sortino, max_dd, win_rate, profit_factor, calmar,
        total_return_pct
  → ai_format::format_compare()
      → ranks strategies by Sharpe, then by total PnL
      → identifies overall best performer
  → returns CompareResponse with suggested next steps
```
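
The ranking rule (Sharpe first, total PnL as the tie-breaker) is a two-key descending sort. The sketch below uses a cut-down `CompareResult` with only the fields the ranking needs; the `rank` function is hypothetical and assumes both metrics are finite.

```rust
/// Cut-down comparison row: only the fields used for ranking.
#[derive(Debug, Clone, PartialEq)]
pub struct CompareResult {
    pub strategy: &'static str,
    pub sharpe: f64,
    pub pnl: f64,
}

/// Sort best-first: higher Sharpe wins, ties are broken by higher total PnL.
/// `f64::total_cmp` gives a total order, so the comparator never panics.
pub fn rank(mut results: Vec<CompareResult>) -> Vec<CompareResult> {
    results.sort_by(|a, b| {
        b.sharpe
            .total_cmp(&a.sharpe)
            .then_with(|| b.pnl.total_cmp(&a.pnl))
    });
    results
}
```

After sorting, the first element is the overall best performer under this rule.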

### Key Data Structures

| Structure | Where defined | Role |
|-----------|---------------|------|
| `DataFrame` (Polars) | `data/` | Raw options chain: column-oriented, immutable once loaded |
| `StrategyDef` | `engine/types.rs` | Blueprint: name, category, legs, strike ordering flag |
| `LegDef` | `engine/types.rs` | Per-leg config: side, option_type, delta target, qty |
| `EntryCandidate` | `engine/types.rs` | Fully-matched option combo ready to open as a position |
| `PriceTable` | `engine/types.rs` | `HashMap<(date, exp, strike, type) → QuoteSnapshot>` for O(1) daily lookup |
| `Position` | `engine/types.rs` | Live position: legs, entry cost, status, quantity |
| `TradeRecord` | `engine/types.rs` | Closed trade: entry/exit datetime, P&L, days held, exit reason |
| `EquityPoint` | `engine/types.rs` | Daily equity snapshot (realized + unrealized) |
| `GroupStats` | `engine/types.rs` | Aggregate stats for one DTE × delta bucket |
| `PerformanceMetrics` | `engine/types.rs` | Portfolio-level risk/return metrics |
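
To make the `PriceTable` row concrete, here is a minimal single-pass build of such a lookup table using only the standard library. It is a sketch under several assumptions: dates are plain day numbers instead of `NaiveDate`, the strike is keyed in integer cents because `f64` implements neither `Eq` nor `Hash`, and `Row`, `Day`, and `PriceKey` are hypothetical names.

```rust
use std::collections::{BTreeSet, HashMap};

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum OptionType {
    Call,
    Put,
}

/// Day number since an arbitrary epoch (stand-in for a calendar date).
pub type Day = i32;

/// (quote date, expiration, strike in cents, option type).
pub type PriceKey = (Day, Day, i64, OptionType);

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct QuoteSnapshot {
    pub bid: f64,
    pub ask: f64,
    pub delta: f64,
}

/// One row of the options chain, reduced to the columns the table needs.
pub struct Row {
    pub date: Day,
    pub expiration: Day,
    pub strike: f64,
    pub option_type: OptionType,
    pub bid: f64,
    pub ask: f64,
    pub delta: f64,
}

/// Visit every row once, building the lookup table and the sorted,
/// de-duplicated list of trading days.
pub fn build_price_table(rows: &[Row]) -> (HashMap<PriceKey, QuoteSnapshot>, Vec<Day>) {
    let mut table = HashMap::with_capacity(rows.len());
    let mut days = BTreeSet::new();
    for r in rows {
        let strike_cents = (r.strike * 100.0).round() as i64;
        let key = (r.date, r.expiration, strike_cents, r.option_type);
        table.insert(key, QuoteSnapshot { bid: r.bid, ask: r.ask, delta: r.delta });
        days.insert(r.date);
    }
    (table, days.into_iter().collect())
}
```

With the table built up front, the event loop can price each open leg with one hash lookup per day rather than re-filtering the DataFrame.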

## Tech Stack

- [Polars](https://pola.rs/) — DataFrame engine

src/server.rs

Lines changed: 20 additions & 4 deletions
@@ -232,10 +232,26 @@ impl ServerHandler for OptopsyServer {
                 website_url: None,
             },
             instructions: Some(
-                "Options backtesting engine. Load data first with load_data, \
-                 then use list_strategies to see available strategies, \
-                 evaluate_strategy for statistical analysis, \
-                 or run_backtest for simulation."
+                "Options backtesting engine. \
+                 \n\nRecommended exploration workflow:\
+                 \n1. load_data({ symbol }) — load (or auto-fetch) a symbol's options chain. \
+                 All subsequent tools operate on the in-memory DataFrame loaded here.\
+                 \n2. list_strategies() — browse all built-in strategies grouped by category \
+                 (singles, spreads, butterflies, condors, iron, calendars).\
+                 \n3. evaluate_strategy({ strategy, leg_deltas, max_entry_dte, exit_dte, \
+                 dte_interval, delta_interval, slippage }) — fast statistical screen that \
+                 groups historical trades into DTE × delta buckets and returns mean P&L, \
+                 win rate, profit factor, and distribution stats per bucket. \
+                 Use this to identify promising parameter ranges before committing to a full simulation.\
+                 \n4. run_backtest({ strategy, leg_deltas, ..., capital, quantity, max_positions }) \
+                 — event-driven day-by-day simulation with position management (stop loss, take profit, \
+                 max hold, DTE exit), equity curve, and full performance metrics \
+                 (Sharpe, Sortino, Calmar, VaR, CAGR, expectancy).\
+                 \n5. compare_strategies({ strategies: [...], sim_params }) — run the same backtest \
+                 pipeline for multiple strategies in parallel and rank them by Sharpe and total P&L.\
+                 \n\nData flow summary: raw Parquet → DataFrame → per-leg filter/delta-select → \
+                 leg join → strike-order validation → P&L calculation → bucket aggregation \
+                 (evaluate) or event-loop simulation (backtest) → AI-enriched JSON response."
                     .into(),
             ),
         }
