Correlate scratchpad completion with run_id#9350
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
d5ebe35 to
85be69a
Compare
There was a problem hiding this comment.
Pull request overview
This PR improves scratchpad execution streaming by correlating “completion” with a specific scratchpad run via a run_id, so the SSE listener doesn’t terminate early and drop downstream reactive errors (fixing #9255 and enabling previously-xfail integration scenarios to pass).
Changes:
- Add optional
run_idcorrelation toExecuteScratchpadCommandandCompletedRunNotification, and plumb it through the HTTP execute endpoint + MCPexecute_code. - Update
ScratchCellListenerto treat a matchingCompletedRunNotification(run_id=...)as the completion sentinel (instead of scratch-cellidle), and ensure completion is always broadcast viafinally. - Standardize the terminal SSE
donepayload to{success, output}and update unit/integration tests + OpenAPI schema accordingly.
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
marimo/_server/scratchpad.py |
Switch listener completion sentinel to CompletedRunNotification filtered by run_id; change done event shape to {success, output}. |
marimo/_runtime/runtime.py |
Always broadcast CompletedRunNotification(run_id=...) for scratchpad via try/finally to prevent listeners from blocking forever. |
marimo/_runtime/commands.py |
Add `run_id: str |
marimo/_messaging/notification.py |
Add optional run_id field to CompletedRunNotification. |
marimo/_server/api/endpoints/execution.py |
Mint a UUID run_id per /api/kernel/execute call and pass it to both the command and listener. |
marimo/_mcp/code_server/main.py |
Mint/pass run_id for MCP execute_code so it waits for the correct completion event. |
tests/_server/test_scratchpad.py |
Update listener construction (requires run_id) and update expectations for the new done payload shape + completion sentinel. |
tests/_server/test_scratchpad_integration.py |
Flip previously-xfail scenarios to passing and update SSE snapshots to the new behavior. |
packages/openapi/api.yaml |
Document/add run_id on CompletedRunNotification and runId on ExecuteScratchpad* schemas. |
packages/openapi/src/api.ts |
Regenerated TS types to include the new run_id / runId fields. |
marimo/_schemas/generated/notifications.yaml |
Regenerated notification schema reflecting CompletedRunNotification.run_id (and related schema updates). |
| ``success`` is false when the scratch cell itself errored OR any | ||
| downstream cell captured by the listener errored. The actual error | ||
| detail was already streamed via ``stderr`` events earlier in the | ||
| response — ``done`` carries only the success bit plus the scratch | ||
| cell's rendered output on success (empty on failure). |
There was a problem hiding this comment.
build_done_event no longer includes any error payload for failures and assumes the traceback/detail was already streamed via preceding stderr events. That assumption doesn't hold for scratchpad compile errors (e.g. MarimoSyntaxError from _try_compiling_cell), which are broadcast via CellNotificationUtils.broadcast_error as output=MARIMO_ERROR without emitting any stderr console output. With the current SSE shape, /api/kernel/execute callers may only see {success: false, output: {…}} with no error detail. Consider emitting a synthetic stderr SSE event when the scratch cell output channel is MARIMO_ERROR (or reintroducing a minimal error/errors field in the done payload) so failures always include actionable diagnostics.
| ``{success: false, output: {mimetype: "text/plain", data: ""}}``; | ||
| error detail arrives earlier in the stream as ``stderr`` SSE events. |
There was a problem hiding this comment.
The updated module docstring states that failure cases always deliver error detail earlier via stderr SSE events and that the terminal done event is uniformly {success, output}. There isn't an integration test covering a scratchpad compile-time failure (e.g. a SyntaxError / MarimoSyntaxError from _try_compiling_cell), which historically may not emit console stderr. Adding a scenario like session.execute("x =") would validate the new contract and guard against silent failures.
| ``{success: false, output: {mimetype: "text/plain", data: ""}}``; | |
| error detail arrives earlier in the stream as ``stderr`` SSE events. | |
| ``{success: false, output: {mimetype: "text/plain", data: ""}}``. | |
| When the kernel emits error detail before ``done``, these snapshots | |
| assert it through earlier SSE events such as ``stderr``. |
85be69a to
d5cf740
Compare
d5cf740 to
690916b
Compare
6165fd9 to
a01e152
Compare
Bundle ReportChanges will increase total bundle size by 129 bytes (0.0%) ⬆️. This is within the configured threshold ✅ Detailed changes
Affected Assets, Files, and Routes:view changes for bundle: marimo-esmAssets Changed:
|
a01e152 to
798274f
Compare
Fixes #9255. The ``ScratchCellListener`` used to fire its done sentinel on the scratch cell's ``idle`` status, relying on a 50ms grace for trailing output. Anything broadcast after that grace (most commonly an ``mo.state`` setter whose reactive descendants are flushed *after* the scratch runner returns, per ``Kernel.run_scratchpad``) was silently dropped from the SSE stream — ``/api/kernel/execute`` would return ``success: true`` while a downstream cell was in an exception state. ``ExecuteScratchpadCommand`` and ``CompletedRunNotification`` gain an optional ``run_id``. ``/api/kernel/execute`` and the MCP code server mint a UUID and pass it to both the command and the ``ScratchCellListener``; the listener now fires its sentinel only on the matching ``CompletedRunNotification``. Unrelated completions (from the ``session.instantiate`` call the endpoint makes first, or from concurrent browser activity) are ignored instead of tripping the listener early. ``handle_execute_scratchpad`` broadcasts its completion in a ``try/finally`` so a raising ``run_scratchpad`` can't leave the listener blocked indefinitely. The ``done`` SSE event is reshaped to a single ``{success, output}`` form. The ``error`` field is removed — the traceback is already in preceding ``stderr`` events, so duplicating it on ``done`` was redundant. On failure, ``output`` is ``{mimetype: "text/plain", data: ""}``. ``execute-code.sh`` is unaffected: it reads ``.output.data // empty`` for the success path and ``.success`` for the exit code. The four xfail integration tests from #9342 flip to passing.
Adds two integration tests that cover the `ctx.create_cell` validation surface: the default dry-run compile (raises early on multiply-defined names with the `skip_validation` hint) and the `skip_validation=True` bypass.
Two traceback-formatting differences slipped past `_normalize` and broke `test_ctx_create_cell_multiply_defined` on 3.10-3.12. First, the existing pointer regex required at least one `~`, so PEP 657 pure-caret underlines (`^^^^^^^^^^^^^^^^`, emitted by 3.11+ for expression spans) were never stripped; the new alternation matches those while keeping single-caret SyntaxError pointers intact (the classic ` ^`-style marker is present on every version, including 3.10, and the compile-error test asserts on it). Second, 3.13's collapsed-frames view of a multi-line `raise Foo(...)` keeps the trailing `)` on its own line after `_COLLAPSED_FRAMES_RE` elides the middle, whereas 3.10-3.12 don't show that line at all; the new `_RAISE_CLOSING_PAREN_RE` drops the lone closer so both worlds normalize to the same shape. The one affected snapshot is rewritten to match.
5ce5be8 to
4c927f2
Compare
`_MARIMO_SRC_RE` only matched Unix prefixes, and ran AFTER `_PATH_RE`. On Windows, `_PATH_RE` claimed the library path first and replaced it with `<tmp>`; `_INTERNAL_FRAME_RE` then stripped the now-anonymous frames, dropping the library traceback entirely from the snapshot. Match both separators, slash-normalize the captured tail, and run before `_PATH_RE` so Windows and Unix snapshots compare equal.
for more information, see https://pre-commit.ci
Closes #9302
Fixes #9255, and flips the 4 xfail integration tests from #9342 flip to passing.
The
ScratchCellListenerused to fire its done sentinel on the scratch cell'sidlestatus (+ 50ms grace for flushing stderr/stdout). Anything broadcast after that grace was silently dropped from the SSE stream.These changes add an optional
run_idtoExecuteScratchpadCommandandCompletedRunNotification. The api endpoint and MCP mint a UUID and pass it to both the command and theScratchCellListener; the listener now fires its sentinel only on the matchingCompletedRunNotification.Note: The reason we couldn't just observe
CompletedRunNotificationdirectly (#9302) is because theCompletedRunNotificationnfromsession.instantiatetrips up the listener early (separate run).Summary by cubic
Correlate each scratchpad run with a
run_idand wait for the matchingCompletedRunNotificationbefore finishing. This prevents early success in/api/kernel/execute, streams compile-time errors, and simplifies thedoneSSE.Bug Fixes
run_idtoExecuteScratchpadCommandandCompletedRunNotification;/api/kernel/executeand the MCP code server mint a UUID andScratchCellListenerwaits for the matching completion, ignoring others.CompletedRunNotificationin afinallyso listeners don’t hang ifrun_scratchpadraises.stderrand keep console output, soSyntaxErrordiagnostics are visible beforedone.ctx.create_cellvalidation and theskip_validation=Truepath to cover earlyRuntimeErrorreporting and graph-state failures.Migration
doneSSE now returns{ success, output }only; theerrorfield was removed. On failureoutputis{ mimetype: "text/plain", data: "" }.CompletedRunNotification.run_id?,ExecuteScratchpadCommand.runId?.Written for commit a01e152. Summary will update on new commits.