Correlate scratchpad completion with `run_id` by manzt · Pull Request #9350 · marimo-team/marimo

manzt · 2026-04-23T21:55:15Z

Closes #9302
Fixes #9255, and flips the 4 xfail integration tests from #9342 to passing.

The ScratchCellListener used to fire its done sentinel on the scratch cell's idle status (+ 50ms grace for flushing stderr/stdout). Anything broadcast after that grace was silently dropped from the SSE stream.

These changes add an optional run_id to ExecuteScratchpadCommand and CompletedRunNotification. The API endpoint and MCP mint a UUID and pass it to both the command and the ScratchCellListener; the listener now fires its sentinel only on the matching CompletedRunNotification.

Note: We couldn't just observe CompletedRunNotification directly (#9302) because the CompletedRunNotification from session.instantiate, which belongs to a separate run, trips the listener early.
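
The correlation described above can be sketched in a few lines. This is a toy model under stated assumptions, not marimo's ScratchCellListener: `CompletedRun`, `RunIdListener`, and `on_notification` are hypothetical names, and the real notification carries more than a `run_id`.

```python
import asyncio
import uuid
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CompletedRun:
    # Hypothetical stand-in for CompletedRunNotification.
    run_id: Optional[str] = None


class RunIdListener:
    """Toy listener that finishes only on the completion carrying its run_id."""

    def __init__(self, run_id: str) -> None:
        self.run_id = run_id
        self.done = asyncio.Event()
        self.ignored = 0

    def on_notification(self, note: CompletedRun) -> None:
        if note.run_id == self.run_id:
            self.done.set()
        else:
            # For example, the completion of an earlier, unrelated run.
            self.ignored += 1


async def demo() -> Tuple[bool, int]:
    run_id = str(uuid.uuid4())
    listener = RunIdListener(run_id)
    listener.on_notification(CompletedRun())  # unrelated run, no id
    fired_early = listener.done.is_set()
    listener.on_notification(CompletedRun(run_id=run_id))
    await asyncio.wait_for(listener.done.wait(), timeout=1)
    return fired_early, listener.ignored
```

Running `asyncio.run(demo())` shows the unrelated completion being counted and ignored, with the sentinel firing only once the matching `run_id` arrives.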

Summary by cubic

Correlate each scratchpad run with a run_id and wait for the matching CompletedRunNotification before finishing. This prevents early success in /api/kernel/execute, streams compile-time errors, and simplifies the done SSE.

Bug Fixes
- Add optional run_id to ExecuteScratchpadCommand and CompletedRunNotification; /api/kernel/execute and the MCP code server mint a UUID and ScratchCellListener waits for the matching completion, ignoring others.
- Always emit CompletedRunNotification in a finally so listeners don’t hang if run_scratchpad raises.
- Stream compile-time errors to stderr and keep console output, so SyntaxError diagnostics are visible before done.
- Add integration tests for ctx.create_cell validation and the skip_validation=True path to cover early RuntimeError reporting and graph-state failures.

Migration
- done SSE now returns { success, output } only; the error field was removed. On failure output is { mimetype: "text/plain", data: "" }.
- OpenAPI and generated TS types updated: CompletedRunNotification.run_id?, ExecuteScratchpadCommand.runId?.
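
For callers adapting to the new shape, a consumer might collect error detail from earlier events and read only `success` and `output` from `done`. The sketch below assumes the stream has already been parsed into (event name, JSON string) pairs and that `stderr` payloads carry their text under a `data` key; both are assumptions for illustration, and `summarize` is a hypothetical helper.

```python
import json
from typing import Iterable, Tuple


def summarize(events: Iterable[Tuple[str, str]]) -> Tuple[bool, str, str]:
    """Return (success, output data, accumulated stderr) for one run."""
    stderr_parts = []
    for name, raw in events:
        payload = json.loads(raw)
        if name == "stderr":
            # Assumed payload shape: {"data": "<text>"}.
            stderr_parts.append(payload["data"])
        elif name == "done":
            # New shape: {success, output} only, with no error field.
            return (
                payload["success"],
                payload["output"]["data"],
                "".join(stderr_parts),
            )
    raise RuntimeError("stream ended without a done event")
```

On failure the second element is the empty string, so any diagnostics have to come from the accumulated stderr text.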

Written for commit a01e152. Summary will update on new commits.

vercel · 2026-04-23T21:55:20Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
marimo-docs	Ready	Preview, Comment	May 11, 2026 9:47pm

cubic-dev-ai

No issues found across 11 files

Copilot

Pull request overview

This PR improves scratchpad execution streaming by correlating “completion” with a specific scratchpad run via a run_id, so the SSE listener doesn’t terminate early and drop downstream reactive errors (fixing #9255 and enabling previously-xfail integration scenarios to pass).

Changes:

- Add optional run_id correlation to ExecuteScratchpadCommand and CompletedRunNotification, and plumb it through the HTTP execute endpoint + MCP execute_code.
- Update ScratchCellListener to treat a matching CompletedRunNotification(run_id=...) as the completion sentinel (instead of scratch-cell idle), and ensure completion is always broadcast via finally.
- Standardize the terminal SSE done payload to {success, output} and update unit/integration tests + OpenAPI schema accordingly.
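
The "always broadcast via finally" point can be illustrated with a minimal sketch. The names `handle_scratchpad`, `run`, and `broadcast` are hypothetical and the completion is modeled as a plain tuple; this is not the code in `marimo/_runtime/runtime.py`.

```python
from typing import Callable, Tuple


def handle_scratchpad(
    run: Callable[[], None],
    broadcast: Callable[[Tuple[str, str]], None],
    run_id: str,
) -> None:
    try:
        run()
    finally:
        # Runs whether or not run() raised, so a listener waiting on this
        # run_id is always released.
        broadcast(("completed", run_id))
```

The exception still propagates to the caller; the `finally` only guarantees that the completion goes out first.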

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
`marimo/_server/scratchpad.py`	Switch listener completion sentinel to `CompletedRunNotification` filtered by `run_id`; change `done` event shape to `{success, output}`.
`marimo/_runtime/runtime.py`	Always broadcast `CompletedRunNotification(run_id=...)` for scratchpad via `try/finally` to prevent listeners from blocking forever.
`marimo/_runtime/commands.py`	Add `run_id: str
`marimo/_messaging/notification.py`	Add optional `run_id` field to `CompletedRunNotification`.
`marimo/_server/api/endpoints/execution.py`	Mint a UUID `run_id` per `/api/kernel/execute` call and pass it to both the command and listener.
`marimo/_mcp/code_server/main.py`	Mint/pass `run_id` for MCP `execute_code` so it waits for the correct completion event.
`tests/_server/test_scratchpad.py`	Update listener construction (requires `run_id`) and update expectations for the new `done` payload shape + completion sentinel.
`tests/_server/test_scratchpad_integration.py`	Flip previously-xfail scenarios to passing and update SSE snapshots to the new behavior.
`packages/openapi/api.yaml`	Document/add `run_id` on `CompletedRunNotification` and `runId` on `ExecuteScratchpad*` schemas.
`packages/openapi/src/api.ts`	Regenerated TS types to include the new `run_id` / `runId` fields.
`marimo/_schemas/generated/notifications.yaml`	Regenerated notification schema reflecting `CompletedRunNotification.run_id` (and related schema updates).

Copilot · 2026-04-23T22:01:48Z

+    ``success`` is false when the scratch cell itself errored OR any
+    downstream cell captured by the listener errored. The actual error
+    detail was already streamed via ``stderr`` events earlier in the
+    response — ``done`` carries only the success bit plus the scratch
+    cell's rendered output on success (empty on failure).


build_done_event no longer includes any error payload for failures and assumes the traceback/detail was already streamed via preceding stderr events. That assumption doesn't hold for scratchpad compile errors (e.g. MarimoSyntaxError from _try_compiling_cell), which are broadcast via CellNotificationUtils.broadcast_error as output=MARIMO_ERROR without emitting any stderr console output. With the current SSE shape, /api/kernel/execute callers may only see {success: false, output: {…}} with no error detail. Consider emitting a synthetic stderr SSE event when the scratch cell output channel is MARIMO_ERROR (or reintroducing a minimal error/errors field in the done payload) so failures always include actionable diagnostics.
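
The concern is easy to reproduce outside marimo: a compile-time failure happens before any user code runs, so nothing reaches stderr unless the host writes it there. The sketch below uses only the standard library; `compile_and_report` is a hypothetical helper and does not reflect how `_try_compiling_cell` or `broadcast_error` work.

```python
import io
import traceback
from types import CodeType
from typing import Optional, TextIO


def compile_and_report(source: str, stderr: TextIO) -> Optional[CodeType]:
    """Compile source; on SyntaxError, write the diagnostic to stderr."""
    try:
        return compile(source, "<scratch>", "exec")
    except SyntaxError as exc:
        # Without this explicit write, the caller would only learn that
        # compilation failed, not why.
        stderr.write("".join(traceback.format_exception_only(type(exc), exc)))
        return None
```

Calling it with `"x ="` and an `io.StringIO()` buffer leaves the SyntaxError text in the buffer, which is the kind of detail a `stderr` SSE event would need to carry before `done`.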

Copilot · 2026-04-23T22:01:48Z

+``{success: false, output: {mimetype: "text/plain", data: ""}}``;
+error detail arrives earlier in the stream as ``stderr`` SSE events.


The updated module docstring states that failure cases always deliver error detail earlier via stderr SSE events and that the terminal done event is uniformly {success, output}. There isn't an integration test covering a scratchpad compile-time failure (e.g. a SyntaxError / MarimoSyntaxError from _try_compiling_cell), which historically may not emit console stderr. Adding a scenario like session.execute("x =") would validate the new contract and guard against silent failures.

Suggested change

-``{success: false, output: {mimetype: "text/plain", data: ""}}``;
-error detail arrives earlier in the stream as ``stderr`` SSE events.
+``{success: false, output: {mimetype: "text/plain", data: ""}}``.
+When the kernel emits error detail before ``done``, these snapshots
+assert it through earlier SSE events such as ``stderr``.

codecov · 2026-04-24T02:49:31Z

Bundle Report

Changes will increase total bundle size by 129 bytes (0.0%) ⬆️. This is within the configured threshold ✅

Detailed changes

Bundle name	Size	Change
marimo-esm	25.14MB	129 bytes (0.0%) ⬆️

Affected Assets, Files, and Routes:

view changes for bundle: marimo-esm

Assets Changed:

Asset Name	Size Change	Total Size	Change (%)
`assets/terminal-*.js`	105 bytes	458.83kB	0.02%
`assets/config-*.js`	24 bytes	6.09kB	0.4%

Fixes #9255.

The ``ScratchCellListener`` used to fire its done sentinel on the scratch cell's ``idle`` status, relying on a 50ms grace for trailing output. Anything broadcast after that grace (most commonly an ``mo.state`` setter whose reactive descendants are flushed *after* the scratch runner returns, per ``Kernel.run_scratchpad``) was silently dropped from the SSE stream: ``/api/kernel/execute`` would return ``success: true`` while a downstream cell was in an exception state.

``ExecuteScratchpadCommand`` and ``CompletedRunNotification`` gain an optional ``run_id``. ``/api/kernel/execute`` and the MCP code server mint a UUID and pass it to both the command and the ``ScratchCellListener``; the listener now fires its sentinel only on the matching ``CompletedRunNotification``. Unrelated completions (from the ``session.instantiate`` call the endpoint makes first, or from concurrent browser activity) are ignored instead of tripping the listener early.

``handle_execute_scratchpad`` broadcasts its completion in a ``try/finally`` so a raising ``run_scratchpad`` can't leave the listener blocked indefinitely.

The ``done`` SSE event is reshaped to a single ``{success, output}`` form. The ``error`` field is removed: the traceback is already in preceding ``stderr`` events, so duplicating it on ``done`` was redundant. On failure, ``output`` is ``{mimetype: "text/plain", data: ""}``. ``execute-code.sh`` is unaffected: it reads ``.output.data // empty`` for the success path and ``.success`` for the exit code.

The four xfail integration tests from #9342 flip to passing.

Adds two integration tests that cover the `ctx.create_cell` validation surface: the default dry-run compile (raises early on multiply-defined names with the `skip_validation` hint) and the `skip_validation=True` bypass.

Two traceback-formatting differences slipped past `_normalize` and broke `test_ctx_create_cell_multiply_defined` on 3.10-3.12. First, the existing pointer regex required at least one `~`, so PEP 657 pure-caret underlines (`^^^^^^^^^^^^^^^^`, emitted by 3.11+ for expression spans) were never stripped; the new alternation matches those while keeping single-caret SyntaxError pointers intact (the classic ` ^`-style marker is present on every version, including 3.10, and the compile-error test asserts on it). Second, 3.13's collapsed-frames view of a multi-line `raise Foo(...)` keeps the trailing `)` on its own line after `_COLLAPSED_FRAMES_RE` elides the middle, whereas 3.10-3.12 don't show that line at all; the new `_RAISE_CLOSING_PAREN_RE` drops the lone closer so both worlds normalize to the same shape. The one affected snapshot is rewritten to match.
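
To make the two normalizations concrete, here is a rough sketch. The patterns are hypothetical and cruder than the PR's (for instance, this one drops every line holding only `)`, not just the one left behind by collapsed frames); `POINTER_LINE`, `LONE_PAREN`, and `normalize` are illustrative names, not the test suite's `_normalize`.

```python
import re

# A pointer line is either a tilde/caret mix with at least one "~",
# or a run of two or more carets (PEP 657 pure-caret underline).
# A single "^" is left alone so SyntaxError pointers survive.
POINTER_LINE = re.compile(
    r"^[ \t]*(?:[\^~]*~[\^~]*|\^{2,})[ \t]*\n", re.MULTILINE
)

# A line that holds nothing but a closing parenthesis.
LONE_PAREN = re.compile(r"^[ \t]*\)[ \t]*\n", re.MULTILINE)


def normalize(tb: str) -> str:
    """Strip version-specific underline and lone-paren lines."""
    tb = POINTER_LINE.sub("", tb)
    return LONE_PAREN.sub("", tb)
```

Because the caret alternative requires at least two characters, the classic single-caret marker under a syntax error is not matched and stays in the snapshot.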

`_MARIMO_SRC_RE` only matched Unix prefixes, and ran AFTER `_PATH_RE`. On Windows, `_PATH_RE` claimed the library path first and replaced it with `<tmp>`; `_INTERNAL_FRAME_RE` then stripped the now-anonymous frames, dropping the library traceback entirely from the snapshot. Match both separators, slash-normalize the captured tail, and run before `_PATH_RE` so Windows and Unix snapshots compare equal.
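
A sketch of the separator-agnostic match follows. The pattern and the `<marimo>` placeholder are hypothetical, chosen for illustration rather than copied from `_MARIMO_SRC_RE`; the idea is only that the captured tail is slash-normalized so both platforms produce the same string.

```python
import re

# Optional drive letter, then any path ending in a "marimo" directory,
# using either "/" or "\" as separator. Group 1 is the tail after it.
MARIMO_SRC = re.compile(
    r'(?:[A-Za-z]:)?[\\/][^\s"]*?[\\/]marimo[\\/]([^\s"]+)'
)


def normalize_src(line: str) -> str:
    """Rewrite a library path to a platform-independent placeholder."""
    return MARIMO_SRC.sub(
        lambda m: "<marimo>/" + m.group(1).replace("\\", "/"), line
    )
```

In a pipeline like the one described, this step would need to run before any generic path replacement, otherwise the library path would already have been rewritten by the time this pattern sees it.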

for more information, see https://pre-commit.ci

Copilot AI review requested due to automatic review settings April 23, 2026 21:55

github-actions Bot added the bash-focus Area to focus on during release bug bash label Apr 23, 2026

Copilot started reviewing on behalf of manzt April 23, 2026 21:55 View session

manzt added the enhancement New feature or request label Apr 23, 2026

cubic-dev-ai Bot reviewed Apr 23, 2026

manzt requested a review from mscolnick April 23, 2026 21:58

manzt force-pushed the fix/scratchpad-run-id-correlation branch from d5ebe35 to 85be69a Compare April 23, 2026 21:59

vercel Bot deployed to Preview April 23, 2026 22:00 View deployment

Copilot AI reviewed Apr 23, 2026

manzt mentioned this pull request Apr 23, 2026

fix(scratchpad): wait for CompletedRunNotification to capture downstream reactive errors #9302

Closed

manzt force-pushed the fix/scratchpad-run-id-correlation branch from 85be69a to d5cf740 Compare April 23, 2026 22:08

vercel Bot deployed to Preview April 23, 2026 22:09 View deployment

manzt force-pushed the fix/scratchpad-run-id-correlation branch from d5cf740 to 690916b Compare April 23, 2026 22:19

vercel Bot deployed to Preview April 23, 2026 22:20 View deployment

manzt force-pushed the fix/scratchpad-run-id-correlation branch from 6165fd9 to a01e152 Compare April 23, 2026 22:40

manzt commented Apr 23, 2026

Comment thread tests/_server/test_scratchpad_integration.py

vercel Bot deployed to Preview April 23, 2026 22:42 View deployment

mscolnick previously approved these changes Apr 27, 2026

manzt force-pushed the fix/scratchpad-run-id-correlation branch from a01e152 to 798274f Compare May 11, 2026 17:56

vercel Bot deployed to Preview May 11, 2026 17:57 View deployment

manzt dismissed mscolnick’s stale review via 5ce5be8 May 11, 2026 18:35

vercel Bot deployed to Preview May 11, 2026 18:36 View deployment

manzt added 3 commits May 11, 2026 17:16

Test code_mode create_cell validation paths

a5a12a8

Adds two integration tests that cover the `ctx.create_cell` validation surface: the default dry-run compile (raises early on multiply-defined names with the `skip_validation` hint) and the `skip_validation=True` bypass.

manzt force-pushed the fix/scratchpad-run-id-correlation branch from 5ce5be8 to 4c927f2 Compare May 11, 2026 21:16

vercel Bot deployed to Preview May 11, 2026 21:17 View deployment

[pre-commit.ci] auto fixes from pre-commit.com hooks

4b7a6af

for more information, see https://pre-commit.ci

vercel Bot deployed to Preview May 11, 2026 21:47 View deployment

manzt wants to merge 5 commits into main from fix/scratchpad-run-id-correlation

3 participants
