
fix(agent): stream replay hardening #2771

Open
jordan-umusu wants to merge 5 commits into feat/agent-interruptions from fix/agent-stream-replay

Conversation

jordan-umusu (Collaborator) commented May 27, 2026

Summary by cubic

Hardens chat stream replay and reconnects. Adds cheap status polling, Temporal-driven turn state, stable assistant bubble ids, composite SSE ids, and safe gap handling so clients reattach mid-turn without synthetic bubbles and observers see the user prompt.

  • New Features

    • Added GET /status returning turn_status/curr_run_id/prompt. Frontend useSessionStatus polls and useVercelChat pre-inserts the active user prompt via upsertActivePromptMessage, then attaches to the live turn.
    • Reconnect/resume: browser-owned Last-Event-ID (frontend scans id: lines from the SSE body and sets the header on reconnect); stable assistant bubble id session_id:run_id; Vercel adapter emits id: lines with composite frame ids (<redis-id>:<frame-index>), skips frames at resume_from, and omits message.start until a message_id is known.
    • Router/stream: passes message_id and resume_from; gap detection via Redis min_entry_id (stale cursor → replay from 0-0 if running, else emit a finishing stream); terminal reconnects preserve the bubble id from the latest stamped run; readers never write stream cursors and only expire buffers on completion; resume may include the cursor entry; waiting/idle without a cursor returns 204.
    • Live state and visibility: turn status now uses Temporal workflow phase over the DB projection; history rows stamped with curr_run_id; service hides active-run rows while running; adds get_active_run_prompt and get_latest_history_run_id.
  • Migration

    • Run Alembic: add curr_run_id to agent_session_history.
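
The reconnect rules in the Router/stream bullet can be read as a small decision function. The following is a minimal Python sketch under stated assumptions, not the router's code: `plan_reconnect`, `Plan`, and `Action` are hypothetical names, status strings are assumed to be lowercase, an empty buffer (no `min_entry_id`) is assumed to count as a stale cursor, and the choices for a missing cursor on a running or terminal turn are assumptions because the summary only states the waiting/idle case.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NO_CONTENT = "no_content"  # respond 204, nothing to stream
    REPLAY = "replay"          # read the buffer from the very start
    RESUME = "resume"          # read from the cursor entry (inclusive)
    FINISH = "finish"          # emit a short stream that only finishes


@dataclass(frozen=True)
class Plan:
    action: Action
    start_id: str | None = None


def redis_key(entry_id: str) -> tuple[int, int]:
    """Turn a Redis stream id '<ms>-<seq>' into a comparable tuple."""
    ms, _, seq = entry_id.partition("-")
    return int(ms), int(seq or 0)


def plan_reconnect(
    turn_status: str,
    last_event_id: str | None,
    min_entry_id: str | None,
) -> Plan:
    """Hypothetical helper: pick how to serve a (re)connecting reader."""
    running = turn_status == "running"
    if last_event_id is None:
        # Assumption: a running turn without a cursor is replayed in full;
        # every other status without a cursor gets a 204.
        return Plan(Action.REPLAY, "0-0") if running else Plan(Action.NO_CONTENT)

    # Composite SSE ids look like '<redis-id>:<frame-index>'.
    redis_id = last_event_id.split(":", 1)[0]
    # Assumption: an empty buffer is treated like a trimmed one.
    stale = min_entry_id is None or redis_key(redis_id) < redis_key(min_entry_id)
    if stale:
        return Plan(Action.REPLAY, "0-0") if running else Plan(Action.FINISH)
    return Plan(Action.RESUME, redis_id)
```

For example, `plan_reconnect("running", "5-0:2", "10-0")` falls back to a replay from `0-0` because entry `5-0` is older than the oldest buffered entry, while the same cursor on an idle session yields a finishing stream.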

Written for commit cf6d67d. Summary will update on new commits.


jordan-umusu changed the base branch from main to feat/agent-interruptions May 27, 2026 21:40
jordan-umusu changed the title from "Fix/agent stream replay" to "fix(agent): stream replay hardening" May 27, 2026
jordan-umusu force-pushed the feat/agent-interruptions branch from 611153a to d0d353d May 27, 2026 21:43
jordan-umusu force-pushed the fix/agent-stream-replay branch from 2710c28 to 4bf4fb4 May 27, 2026 21:44
jordan-umusu marked this pull request as ready for review May 27, 2026 21:45

Collaborator Author

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite. Learn more about stacking.

jordan-umusu added labels May 27, 2026: ui (Improvements or additions to UI/UX), fix (Bug fix), priority:medium (Medium priority ticket), agents (LLM agents)

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4bf4fb48b6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread frontend/src/hooks/use-chat.ts Outdated
zeropath-ai Bot commented May 27, 2026

No security or compliance issues detected. Reviewed everything up to cf6d67d.

Security Overview

Detected Code Changes

Change type: Enhancement
  ► alembic/versions/243a597b6a3a_add_curr_run_id_to_agent_session_history.py
      Add curr_run_id column to agent_session_history table and index it
  ► frontend/src/client/schemas.gen.ts
      Define AgentSessionStatusRead schema including curr_run_id and prompt
  ► frontend/src/client/services.gen.ts
      Add agentSessionsGetSessionStatus API endpoint
  ► frontend/src/client/types.gen.ts
      Define AgentSessionStatusRead type and AgentSessionsGetSessionStatusData/Response types
  ► frontend/src/components/chat/chat-session-pane.tsx
      Integrate useSessionStatus hook to get and display session status
  ► frontend/src/hooks/use-chat.ts
      Implement useSessionStatus hook for polling session status
      Add logic to upsert active prompt message for observer tabs
      Implement scanSseIds function to track SSE event IDs
      Add logic to attach to live turns based on session status
  ► packages/tracecat-ee/tracecat_ee/agent/workflows/durable.py
      Add get_turn_status query to DurableWorkflow
      Update status transitions to include RUNNING, FAILED, STOPPED, WAITING_FOR_APPROVAL, IDLE
  ► tests/unit/test_agent_executor_loopback.py
      Test that persisted session lines stamp curr_run_id
  ► tests/unit/test_agent_session_messages.py
      Test that list_messages hides active run rows when session is RUNNING
      Test that list_messages keeps rows when session is WAITING_FOR_APPROVAL
  ► tests/unit/test_agent_session_router.py
      Add get_session_status endpoint
      Implement logic in get_session_status to include prompt for active turns
      Add tests for streaming session events, including handling of different session statuses and Last-Event-ID header

Change type: Bug Fix
  ► frontend/src/hooks/use-chat.ts
      Fix resume logic to correctly attach to live turns using session status
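
The scanSseIds entry above refers to tracking SSE event ids on the client so that the browser can send Last-Event-ID on reconnect. The PR's version lives in frontend/src/hooks/use-chat.ts and is not shown on this page; the following is only a Python sketch of the general idea, with `SseIdScanner` as a hypothetical name. It assumes the body arrives as decoded text chunks that may split a line in the middle, and it handles only "\n" and "\r\n" line endings.

```python
from __future__ import annotations


class SseIdScanner:
    """Hypothetical helper: remember the most recent `id:` field seen."""

    def __init__(self) -> None:
        self.last_event_id: str | None = None
        self._pending = ""  # trailing partial line carried between chunks

    def feed(self, chunk: str) -> None:
        self._pending += chunk
        # Everything before the final "\n" is complete; keep the remainder.
        *lines, self._pending = self._pending.split("\n")
        for line in lines:
            line = line.rstrip("\r")
            if not line.startswith("id:"):
                continue
            value = line[3:]
            if value.startswith(" "):
                value = value[1:]  # SSE strips a single leading space
            if "\0" not in value:  # ids containing NUL are ignored by SSE
                self.last_event_id = value
```

A reader would call `feed` for every chunk it forwards to the UI and, when the connection drops, send `last_event_id` as the Last-Event-ID header on the next request.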

jordan-umusu force-pushed the fix/agent-stream-replay branch from 4bf4fb4 to f357691 May 27, 2026 21:51

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f3576918a7

Comment on lines +567 to +568
start_id = requested
resume_from = last_event_id if cursor else None

P2: Preserve stream context when resuming mid-part

When a browser reconnects with a Last-Event-ID from a later Redis entry inside an already-open text/reasoning/tool part, starting the Redis read at requested rebuilds sse_vercel with a fresh VercelStreamContext that never saw the earlier start event. The next deltas for that part are then treated as unknown and emit nothing, so a transient disconnect can drop the rest of the assistant output until a new part starts. For Vercel streams, replay from the buffer start while suppressing frames up to the cursor, or otherwise restore/deterministically derive the part state before consuming future deltas.
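
The first remedy suggested here (replay from the buffer start while suppressing frames up to the cursor) can be illustrated with a toy model. This is a sketch under stated assumptions rather than the adapter's code: `PartContext` is a hypothetical stand-in for VercelStreamContext that only remembers which parts are open, the event dictionaries are invented for the example, and frame ids follow the `<redis-id>:<frame-index>` shape from the PR summary.

```python
from __future__ import annotations

from collections.abc import Iterable, Iterator


def redis_key(entry_id: str) -> tuple[int, int]:
    ms, _, seq = entry_id.partition("-")
    return int(ms), int(seq or 0)


def frame_key(frame_id: str) -> tuple[tuple[int, int], int]:
    """Split '<redis-id>:<frame-index>' into a comparable key."""
    redis_id, _, index = frame_id.partition(":")
    return redis_key(redis_id), int(index or 0)


class PartContext:
    """Toy per-connection state: deltas for unknown parts emit nothing."""

    def __init__(self) -> None:
        self.open_parts: set[str] = set()

    def frames_for(self, event: dict[str, str]) -> list[str]:
        kind, part = event["type"], event["part"]
        if kind == "start":
            self.open_parts.add(part)
            return [f"start {part}"]
        if part not in self.open_parts:
            return []  # the failure mode described in the review comment
        if kind == "end":
            self.open_parts.discard(part)
            return [f"end {part}"]
        return [f"delta {part} {event['text']}"]


def replay(
    entries: Iterable[tuple[str, dict[str, str]]],
    cursor: str | None,
) -> Iterator[tuple[str, str]]:
    """Feed every entry through a fresh context, but only yield frames
    that come strictly after the cursor."""
    ctx = PartContext()
    after = frame_key(cursor) if cursor else None
    for entry_id, event in entries:
        for index, frame in enumerate(ctx.frames_for(event)):
            if after is not None and (redis_key(entry_id), index) <= after:
                continue  # state was rebuilt, output suppressed
            yield f"{entry_id}:{index}", frame


entries = [
    ("1-0", {"type": "start", "part": "t1"}),
    ("2-0", {"type": "delta", "part": "t1", "text": "Hel"}),
    ("3-0", {"type": "delta", "part": "t1", "text": "lo"}),
]
```

With these entries, `list(replay(entries, "2-0:0"))` yields only the `3-0:0` delta, whereas starting the read at the later entry with a fresh context, `list(replay(entries[2:], None))`, yields nothing because the part was never opened.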


jordan-umusu force-pushed the fix/agent-stream-replay branch from f357691 to 7a5dbe0 May 27, 2026 21:57

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7a5dbe0472

Comment thread frontend/src/components/chat/chat-session-pane.tsx

cubic-dev-ai Bot (Contributor) left a comment

1 issue found

Confidence score: 4/5

  • This PR looks safe to merge overall: the reported issue is low-to-moderate severity (4/10) with moderate confidence, suggesting limited blast radius rather than a likely blocker.
  • In tracecat/agent/session/router.py, synthesizing a Vercel messageId from the session UUID when no current run exists can cause a stale reconnect to show a new empty bubble instead of ending the stream cleanly.
  • Pay close attention to tracecat/agent/session/router.py - reconnect/message finalization logic should avoid generating synthetic IDs when there is no active run.
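
One way to read this suggestion, as a hedged Python sketch: derive the bubble id only from a real run id and emit no start frame when there is none. `assistant_message_id` and `start_frames` are hypothetical helpers, not functions from tracecat/agent/session/router.py, and the frame dictionary shape is illustrative only. The `session_id:run_id` format comes from the PR summary.

```python
from __future__ import annotations


def assistant_message_id(session_id: str, run_id: str | None) -> str | None:
    """Return the stable bubble id, or None when no run is known.

    Deliberately no fallback to the bare session id: a synthetic id
    would make the client render a new, empty bubble.
    """
    if run_id is None:
        return None
    return f"{session_id}:{run_id}"


def start_frames(message_id: str | None) -> list[dict[str, str]]:
    """Emit a message start frame only once a message id is known."""
    if message_id is None:
        return []
    return [{"type": "start", "messageId": message_id}]
```

A stale reconnect with no current run then produces `start_frames(None) == []`, so the stream can end without introducing a bubble.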


Comment thread tracecat/agent/session/router.py
jordan-umusu force-pushed the feat/agent-interruptions branch 2 times, most recently from 4377150 to d7054ea May 28, 2026 19:30
jordan-umusu force-pushed the fix/agent-stream-replay branch from 7a5dbe0 to 6481d9b May 28, 2026 19:30

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6481d9b4ef

Comment thread frontend/src/hooks/use-chat.ts Outdated
jordan-umusu force-pushed the fix/agent-stream-replay branch from 6481d9b to 9c7dca6 May 28, 2026 19:55
jordan-umusu force-pushed the fix/agent-stream-replay branch from 9c7dca6 to cf6d67d May 28, 2026 21:21
jordan-umusu force-pushed the feat/agent-interruptions branch from 9c95128 to dba556b May 28, 2026 22:45