delete workflow in batches #256
📝 Walkthrough
This PR refactors workflow cleanup to be resilient for large workflows by implementing batched, transaction-budget-aware deletion.
Sequence Diagram
sequenceDiagram
participant cleanup as cleanup()
participant deleteStepsFrom as deleteStepsFrom()
participant deleteStepBatch as deleteStepBatch()
participant budget as transactionBudgetMostlyConsumed()
participant cleanupContinue as cleanupContinue()
cleanup->>deleteStepsFrom: deleteStepsFrom(ctx, workflowId, 0)
loop Process batches
deleteStepsFrom->>deleteStepBatch: fetch and delete next batch
deleteStepBatch->>budget: check transaction budget
alt Budget OK
deleteStepBatch->>deleteStepBatch: delete steps, delete events, enqueue nested cleanup batch
deleteStepBatch-->>deleteStepsFrom: continue to next batch
else Budget exceeded
deleteStepBatch->>deleteStepsFrom: return offset for resumption
deleteStepsFrom->>cleanupContinue: schedule cleanupContinue(ctx, workflowId, offset)
Note over cleanupContinue: Scheduled for next execution
cleanupContinue->>deleteStepsFrom: resume deleteStepsFrom(ctx, workflowId, offset)
end
end
deleteStepsFrom-->>cleanup: all steps deleted
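The flow in the diagram can be sketched as an in-memory model. This is a minimal illustration under stated assumptions, not the PR's implementation: `FakeDb`, `BATCH_SIZE`, `BUDGET`, and `runScheduled` are hypothetical names, the "budget" is just a count of deletes, and scheduling is modeled as a queue of callbacks rather than the Convex scheduler.

```typescript
// In-memory model only. FakeDb, BATCH_SIZE, BUDGET and runScheduled are
// hypothetical names for illustration; none of this is the Convex API.
type Step = { workflowId: string; stepNumber: number };
type FakeDb = {
  steps: Step[];
  pending: Array<() => void>; // stand-in for scheduled continuations
  transactions: number; // how many simulated mutations have run
};

const BATCH_SIZE = 256; // batch size quoted in the review, assumed here
const BUDGET = 480; // hypothetical delete count at which the budget is "mostly consumed"

function deleteStepsFrom(db: FakeDb, workflowId: string, fromStepNumber: number): void {
  db.transactions += 1;
  let used = 0;
  let cursor = fromStepNumber;
  for (;;) {
    // Fetch the next batch in step order, starting at the cursor.
    const batch = db.steps
      .filter((s) => s.workflowId === workflowId && s.stepNumber >= cursor)
      .sort((a, b) => a.stepNumber - b.stepNumber)
      .slice(0, BATCH_SIZE);
    if (batch.length === 0) return; // all steps deleted

    // Delete exactly the fetched batch.
    const doomed = new Set(batch);
    db.steps = db.steps.filter((s) => !doomed.has(s));
    used += batch.length;
    // The batch is sorted, so the cursor ends one past the last deleted step.
    batch.forEach((s) => {
      cursor = s.stepNumber + 1;
    });

    if (used >= BUDGET) {
      // Budget mostly consumed: hand the remainder to a later transaction.
      const resumeAt = cursor;
      db.pending.push(() => deleteStepsFrom(db, workflowId, resumeAt));
      return;
    }
  }
}

// Drain the simulated scheduler until no continuation is left.
function runScheduled(db: FakeDb): void {
  let next = db.pending.shift();
  while (next !== undefined) {
    next();
    next = db.pending.shift();
  }
}
```

In this model, a workflow with 1000 steps takes three simulated transactions: two that delete until the budget threshold is reached and schedule a continuation, and a final one that finds nothing left.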
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/component/workflow.test.ts (1)
574-577: ⚡ Quick win
Increase test size to exercise batching behavior.
Using `stepCount = 100` won't cross the 256-delete batch boundary, so this test can pass without covering the new multi-batch cleanup path.
Simple update
- const stepCount = 100;
+ const stepCount = 300;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/component/workflow.test.ts` around lines 574 - 577, The test uses stepCount = 100 which doesn't trigger the multi-batch delete path; increase the stepCount value to a number greater than the delete batch size (e.g., set stepCount to 300) so the loop that calls ctx.db.insert("steps", ...) within the t.run(async (ctx) => { ... }) block creates enough records to exercise the 256-delete batching behavior and validate the multi-batch cleanup logic.
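The arithmetic behind this nitpick can be checked with a short sketch. It assumes the delete batch size is 256, as quoted in the comment; `DELETE_BATCH_SIZE` and `batchesNeeded` are hypothetical names, not identifiers from the PR.

```typescript
// Hypothetical helper: how many delete batches a given step count requires.
const DELETE_BATCH_SIZE = 256; // taken from the review comment, assumed to match the code

function batchesNeeded(stepCount: number, batchSize: number = DELETE_BATCH_SIZE): number {
  return Math.ceil(stepCount / batchSize);
}

console.log(batchesNeeded(100)); // 1: a single batch, multi-batch path not exercised
console.log(batchesNeeded(300)); // 2: crosses the batch boundary
```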
📒 Files selected for processing (3)
CHANGELOG.md
src/component/workflow.test.ts
src/component/workflow.ts
fromStepNumber = found;
prefetched = visited;
}
await deleteStepsFrom(ctx, args.workflowId, fromStepNumber, prefetched);
Prevent deferred step deletion during restart.
deleteStepsFrom can schedule cleanupContinue and return before all target steps are deleted. restartHandler then increments generation and restarts immediately, which can leave stale steps from the old run and corrupt replay behavior.
Suggested direction
-export async function restartHandler(...) {
+export async function restartHandler(...) {
...
- await deleteStepsFrom(ctx, args.workflowId, fromStepNumber, prefetched);
+ await deleteStepsFrom(ctx, args.workflowId, fromStepNumber, prefetched, {
+ allowContinuation: false,
+ });
...
}
async function deleteStepsFrom(
ctx: MutationCtx,
workflowId: Id<"workflows">,
fromStepNumber: number,
prefetched?: Doc<"steps">[],
+ opts: { allowContinuation?: boolean } = {},
) {
+ const allowContinuation = opts.allowContinuation ?? true;
...
- if (await transactionBudgetMostlyConsumed(ctx)) {
+ if (allowContinuation && (await transactionBudgetMostlyConsumed(ctx))) {
await ctx.scheduler.runAfter(0, internal.workflow.cleanupContinue, {
workflowId,
fromStepNumber,
});
return;
}
...
- if (await transactionBudgetMostlyConsumed(ctx)) {
+ if (allowContinuation && (await transactionBudgetMostlyConsumed(ctx))) {
await ctx.scheduler.runAfter(0, internal.workflow.cleanupContinue, {
workflowId,
fromStepNumber: batch[batch.length - 1].stepNumber + 1,
});
return;
}
Also applies to: 546-551, 565-570
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/component/workflow.ts` at line 314, deleteStepsFrom currently schedules
cleanup via cleanupContinue and can return before deletions finish, allowing
restartHandler to increment generation and restart while old steps remain;
change deleteStepsFrom (or add a new variant like deleteStepsFromAndWait) so it
returns a Promise that resolves only after cleanupContinue and all targeted step
deletions complete, then update restartHandler (and the other call sites using
deleteStepsFrom) to await that completed promise before incrementing the
workflow generation and triggering the restart to prevent stale steps from
persisting.
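The hazard described in this comment can be reproduced in a small synchronous model. This is a sketch under assumptions, not the component's code: `Workflow`, `BUDGET`, and `restart` are hypothetical, the budget is a plain delete count, and deferred work is a callback queue. The `allowContinuation` option mirrors the suggested direction above.

```typescript
// Hypothetical in-memory model of restart racing with deferred deletion.
type Step = { stepNumber: number };
type Workflow = { generation: number; steps: Step[]; pending: Array<() => void> };

const BUDGET = 3; // hypothetical: deletes allowed before the budget is "mostly consumed"

function deleteStepsFrom(
  wf: Workflow,
  fromStepNumber: number,
  opts: { allowContinuation?: boolean } = {},
): void {
  const allowContinuation = opts.allowContinuation ?? true;
  let deleted = 0;
  for (;;) {
    const index = wf.steps.findIndex((s) => s.stepNumber >= fromStepNumber);
    if (index === -1) return; // target range is empty
    if (allowContinuation && deleted >= BUDGET) {
      // Defer the remainder; the caller regains control before it runs.
      wf.pending.push(() => deleteStepsFrom(wf, fromStepNumber, opts));
      return;
    }
    wf.steps.splice(index, 1);
    deleted += 1;
  }
}

// Returns how many steps from the old run are still present at restart time.
function restart(wf: Workflow, fromStepNumber: number, allowContinuation: boolean): number {
  deleteStepsFrom(wf, fromStepNumber, { allowContinuation });
  wf.generation += 1; // the restarted run begins here
  return wf.steps.filter((s) => s.stepNumber >= fromStepNumber).length;
}

function makeWorkflow(stepCount: number): Workflow {
  const steps: Step[] = [];
  for (let i = 0; i < stepCount; i++) steps.push({ stepNumber: i });
  return { generation: 0, steps, pending: [] };
}
```

With ten steps and a restart from step 2, the deferred variant leaves five stale steps visible at restart time in this model, while `allowContinuation: false` leaves none. The trade-off is that the non-deferred path does all deletion in one transaction, so in a real system it would presumably need its own guard against transaction limits.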
♻️ Duplicate comments (1)
src/component/workflow.ts (1)
316-316: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift
Keep `restart` on a synchronous deletion path.
`deleteStepsFrom` can enqueue `cleanupContinue` and return before the target range is actually deleted. `restartHandler` then bumps `generationNumber` and starts the workflow again, so stale steps from the previous run can survive into the restarted execution.
Suggested direction
export async function restartHandler(
ctx: MutationCtx,
args: Infer<typeof restartArgs>,
) {
...
- await deleteStepsFrom(ctx, args.workflowId, fromStepNumber, prefetched);
+ await deleteStepsFrom(ctx, args.workflowId, fromStepNumber, prefetched, {
+ allowContinuation: false,
+ });
...
}
async function deleteStepsFrom(
ctx: MutationCtx,
workflowId: Id<"workflows">,
fromStepNumber: number,
prefetched?: Doc<"steps">[],
+ opts: { allowContinuation?: boolean } = {},
) {
+ const allowContinuation = opts.allowContinuation ?? true;
...
- if (await transactionBudgetMostlyConsumed(ctx)) {
+ if (allowContinuation && (await transactionBudgetMostlyConsumed(ctx))) {
await ctx.scheduler.runAfter(0, internal.workflow.cleanupContinue, {
workflowId,
fromStepNumber,
});
return;
}
}
}
Also applies to: 538-570
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/component/workflow.ts` at line 316, The call to deleteStepsFrom can return before the deletion finishes (it may enqueue cleanupContinue), allowing restartHandler to increment generationNumber and restart the workflow while stale steps remain; modify the restart path to ensure deletions complete synchronously before restarting by either (A) adding/using a synchronous deletion API or awaiting the completion signal from deleteStepsFrom (e.g., wait for its cleanupContinue work to finish), or (B) change deleteStepsFrom to return a Promise that resolves only after the target steps are fully removed and have been persisted, then have restartHandler (which updates generationNumber and restarts the workflow) await that Promise so no stale steps survive into the restarted execution; update calls at the shown locations (the deleteStepsFrom invocation at this diff and the other occurrences around lines 538-570) to use the synchronous/awaiting behavior.
📒 Files selected for processing (1)
src/component/workflow.ts

Make `cleanup` resilient to transaction limits
Fixes #255
Workflow cleanup now deletes steps in batches rather than collecting all steps at once. After each batch, the transaction budget is checked, and if it is mostly consumed, a scheduled continuation (`cleanupContinue`) is enqueued to pick up where deletion left off.
Share `deleteStepsFrom`
`restartHandler` and `cleanup` now both route through a single `deleteStepsFrom` helper, replacing the old `deleteSteps` function that operated on a pre-fetched slice.
All step deletion is batched
`deleteStepBatch` accumulates nested workflow IDs encountered during a batch and enqueues their cleanup in a single `enqueueMutationBatch` call rather than one `enqueueMutation` per step.
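The accumulate-then-enqueue pattern described for `deleteStepBatch` might look roughly like the sketch below. The signatures are assumptions: the `enqueueMutationBatch` here is a local stand-in that only records its calls, the `nestedWorkflowId` field is a hypothetical shape for a step, and the actual deletion of steps and events is elided.

```typescript
// Hypothetical step shape: some steps reference a nested workflow.
type Step = { stepNumber: number; nestedWorkflowId?: string };

// Local stand-in that records each enqueue call; not the real enqueue API.
const enqueueCalls: string[][] = [];
function enqueueMutationBatch(workflowIds: string[]): void {
  enqueueCalls.push(workflowIds);
}

function deleteStepBatch(batch: Step[]): void {
  const nested: string[] = [];
  batch.forEach((step) => {
    // Collect nested workflow IDs instead of enqueueing one cleanup per step.
    if (step.nestedWorkflowId !== undefined) nested.push(step.nestedWorkflowId);
    // Deleting the step and its events would happen here (elided).
  });
  // One enqueue call per batch, and none when no nested workflows were seen.
  if (nested.length > 0) enqueueMutationBatch(nested);
}
```

A batch containing two nested workflows therefore produces a single enqueue call carrying both IDs, rather than two separate calls.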