Codex's Session is a monolithic "god object" that owns everything needed to run the agent in a single process: conversation state, model client, tool execution, MCP connections, auth, analytics, sandbox, file watching, network proxy, agent lifecycle, and more.
The Temporal harness only needs the state management portion. Session::new_minimal() constructs a full Session with most service fields stubbed to defaults/no-ops. This works and is the chosen approach — it avoids splitting Session into separate types, which would require moving ~20 methods, updating all callers, and touching 16+ files for marginal benefit.
Rather than extracting a SessionCore base type, we keep a single Session struct and use Session::new_minimal() to construct it for the Temporal workflow. The stub fields carry no meaningful runtime cost and the constructor is straightforward.
Session (used by both codex CLI and Temporal workflow)
├── conversation_id
├── state (Mutex<SessionState>) ← conversation history, token info
├── features
├── active_turn
├── event_sink (dyn EventSink) ← ChannelEventSink (codex) / BufferEventSink (Temporal)
├── storage (dyn StorageBackend) ← RolloutFileStorage (codex) / InMemoryStorage (Temporal)
├── next_internal_sub_id
├── tx_event
├── agent_status
├── pending_mcp_server_refresh_config
├── js_repl
└── services: SessionServices
├── model_client ← stubbed in Temporal (unused — activities do model calls)
├── mcp_connection_manager ← stubbed (default empty)
├── auth_manager ← stubbed (API key "harness")
├── otel_manager ← no-op (no telemetry in workflow)
├── ... (24 fields total) ← all stubbed to defaults/no-ops
└── show_raw_agent_reasoning ← false
event_sinkandstorageare injected via traits — Temporal provides its own implementations.EntropyProvidersare scoped per-turn via task-localENTROPY— Temporal injects deterministic implementations.ModelStreamerandToolCallHandlerare generic parameters ontry_run_sampling_request— Temporal provides its own activity-backed implementations.- The 24 stubbed service fields are never accessed in the Temporal workflow path. They're dead weight (~few KB) but cause no correctness issues.
We explored extracting a SessionCore struct (containing only the fields try_run_sampling_request needs) and having Session wrap it via Deref. This required:
- Moving ~20 methods from
impl Sessiontoimpl SessionCore - Updating every call site that accesses
services.otel_manager(16+ files, 40+ references) - Changing
try_run_sampling_requestsignature fromArc<Session>toArc<SessionCore> - Updating all plan-mode helpers,
HandleOutputCtx,drain_in_flight, etc.
The refactoring touched too many files for too little benefit. The single-Session approach with stubs is simpler, works today, and can be revisited if the stub overhead ever becomes a real problem.
Temporal decomposes Session's responsibilities across its own primitives. Session::new_minimal() runs in the workflow; SessionServices functionality is replaced by activities:
┌─────────────────────────────────────────────────────────┐
│ Temporal Workflow (deterministic) │
│ │
│ Session (via new_minimal) │
│ ├── conversation history (workflow state) │
│ ├── event sink (BufferEventSink) │
│ ├── storage (InMemoryStorage) │
│ ├── config (Config::for_harness) │
│ └── services (all stubbed) │
│ │
│ try_run_sampling_request(sess, streamer, handler) │
│ │ │ │ │
│ │ ┌──────┘ ┌──────┘ │
│ ▼ ▼ ▼ │
│ Session ModelStreamer ToolCallHandler │
│ (state + stubs) (trait) (trait) │
└───────────────────────┬───────────────┬─────────────────┘
│ │
┌─────▼─────┐ ┌─────▼─────┐
│ Activity │ │ Activity │
│model_call │ │ tool_exec │
│ │ │ │
│ModelClient│ │shell/patch│
│(codex) │ │/read/grep │
└───────────┘ └───────────┘
| Responsibility | Codex (single process) | Temporal |
|---|---|---|
| Conversation history | Session.state |
Session::new_minimal() in workflow |
| Prompt building | Session methods |
Same Session methods in workflow |
| Model calls | Session.services.model_client |
model_call activity (uses ModelClient) |
| Tool execution | Session.services (shell, sandbox) |
tool_exec activity (subprocess/file I/O) |
| Event delivery | Session.event_sink (channel) |
BufferEventSink + query polling |
| Persistence | Session.services.rollout (files) |
Temporal workflow history (built-in) |
| Fault tolerance | None (process dies = state lost) | Workflow replay (automatic) |
| Auth | Session.services.auth_manager |
Worker env var (OPENAI_API_KEY) |
| Entropy | System random + clock | Deterministic TemporalRandomSource / TemporalClock |
No SessionCore extraction needed. The following traits/abstractions are already in place:
EventSinktrait — pluggable event delivery (ChannelEventSink/BufferEventSink)StorageBackendtrait — pluggable rollout persistence (RolloutFileStorage/InMemoryStorage)ModelStreamertrait — pluggable model calls (generic ontry_run_sampling_request)ToolCallHandlertrait — pluggable tool execution (generic ontry_run_sampling_request)EntropyProviders— pluggable random/clock via task-localENTROPYAgentSessiontrait — pluggable agent backend for TUISession::new_minimal()— zero-service constructor for external harnessesConfig::for_harness()— zero-I/O config constructor
- Uses
Session::new_minimal()to hold conversation state - Approval decisions via
ExecApprovalRequestsignal +wait_condition
model_call— uses codexModelClient/ModelClientSessiondispatch_tool— builds aToolRouterfromToolsConfigand dispatches any tool call through the standard codex routing pipeline (ToolRouter::dispatch()), so all built-in tools (shell, apply_patch, read_file, list_dir, grep, etc.) are supported without per-tool handler code
MCP server connections are managed by HarnessMcpManager (in mcp.rs), which holds persistent RmcpClient connections to user-configured MCP servers. It supports tool discovery (returning qualified tool names like mcp__server__tool) and tool execution within activities. Elicitation requests from MCP servers are captured and surfaced to the workflow for approval.
Some codex features require long-lived processes that outlive individual activity calls: PTY sessions, JS REPL kernels. Activities are stateless one-shot functions — they can't hold a subprocess or connection open between invocations.
The planned solution is worker-level state: the worker process would own persistent resources (PTY sessions, JS REPL kernels, sandbox config), and activities would access them by reference. This would require sticky task routing — assigning each workflow to a specific worker — which is the natural model for a coding agent since the worker runs on the machine with the code.
This is not yet implemented. Currently, MCP connections are the only persistent resource, managed at the activity level by HarnessMcpManager.
See REFACTORING.md for known code quality improvements.