Scan only bridged attribution tail #1805
Merged
Approvability
Verdict: Needs human review
This change modifies how attribution spans are computed by slicing token indices and adjusting offsets. While the intent appears to be fixing attribution for reused bridge tokens, changes to core attribution logic can have subtle downstream effects and warrant human verification.
b72f5c3 to
b422a14
mikasenghaas approved these changes on Jun 23, 2026
Overview
Reduce post-inference work for V1 bridged turns by computing prompt message spans from only the bridge-added attribution tail.
Why
Renderer bridges keep prior tokens verbatim and mark that reused portion with message_indices = -1. PendingTurn.prompt_message_spans() previously called message_token_spans() on the complete bridged attribution, forcing a synchronous Python pass over millions of reused entries even though the graph only needs spans for newly rendered messages. Reused messages already own their tokens in existing graph nodes.
What changed
Create a lightweight RenderedTokens view over message_indices[path_len:] and reuse its existing message_token_spans() behavior. The resulting tail-relative spans are offset by path_len, while reused prompt messages remain None. This preserves missing-message spans, internal scaffold attribution, and full-prompt coordinates without scanning the reused prefix.
Performance and resources
For 2,000,000 reused attribution entries plus a 110-entry tail:
The full attribution input, output shape, task count, and connection count are unchanged.
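One way to reproduce a comparison at this scale is sketched below. It is a rough harness, not the project's benchmark: scan() is a hypothetical stand-in for the per-entry Python pass, the two 55-token tail messages are invented for illustration, and only the 2,000,000 and 110 entry counts come from the figures above. Timings are printed but will vary by machine.

```python
import time

REUSED = 2_000_000  # reused attribution entries (from the description above)
TAIL = 110          # bridge-added tail entries (from the description above)


def scan(indices):
    """Hypothetical per-entry pass: {message index: (start, end)}, skipping -1."""
    spans = {}
    for pos, idx in enumerate(indices):
        if idx >= 0:
            start, _ = spans.get(idx, (pos, pos))
            spans[idx] = (start, pos + 1)
    return spans


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    t0 = time.perf_counter()
    out = fn(*args)
    return out, time.perf_counter() - t0


# Reused prefix is all -1; the tail holds two hypothetical 55-token messages.
message_indices = [-1] * REUSED + [i // 55 for i in range(TAIL)]

full, full_s = timed(scan, message_indices)
tail, tail_s = timed(scan, message_indices[REUSED:])

# Shift tail-relative spans back into full-prompt coordinates.
shifted = {m: (s + REUSED, e + REUSED) for m, (s, e) in tail.items()}
assert shifted == full

print(f"full scan: {full_s:.4f}s over {len(message_indices):,} entries")
print(f"tail scan: {tail_s:.6f}s over {TAIL:,} entries")
```

The assertion checks that both passes yield the same full-prompt spans on this toy input, so any difference is in the work done, not in the result.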
Note
Low Risk
Localized graph attribution helper change with intended behavior preserved; main effect is performance on long bridged prompts, not auth or data handling.
Overview
PendingTurn.prompt_message_spans() no longer runs message_token_spans() over the full bridged renderer attribution. It builds a RenderedTokens view on message_indices[path_len:] (the newly rendered tail only), derives spans from that slice, then returns None for each reused prefix message and shifts non-None tail spans by path_len so coordinates still match the full prompt.
This keeps the same span list shape and semantics for turn.commit() (via response_from_generate when bridged_turn is set) while avoiding a synchronous scan over millions of bridge-reused attribution entries marked unattributed in the prefix.
Reviewed by Cursor Bugbot for commit b422a14.
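The slice-then-offset step described in this overview might look roughly like the sketch below. It is not the code from the PR: message_token_spans() here is a simplified, hypothetical stand-in (half-open first-to-last token range per message), tail_prompt_message_spans() and num_messages are invented names, and the real RenderedTokens type is replaced by a plain list.

```python
from typing import Optional

Span = tuple[int, int]  # half-open [start, end) token range


def message_token_spans(
    message_indices: list[int], num_messages: int
) -> list[Optional[Span]]:
    """Hypothetical stand-in: one span per message, None if it owns no tokens."""
    spans: list[Optional[Span]] = [None] * num_messages
    for pos, idx in enumerate(message_indices):
        if idx < 0:  # -1 marks reused (unattributed) entries
            continue
        prev = spans[idx]
        spans[idx] = (pos, pos + 1) if prev is None else (prev[0], pos + 1)
    return spans


def tail_prompt_message_spans(
    message_indices: list[int], path_len: int, num_messages: int
) -> list[Optional[Span]]:
    """Scan only message_indices[path_len:], then shift spans by path_len."""
    tail_spans = message_token_spans(message_indices[path_len:], num_messages)
    return [
        None if span is None else (span[0] + path_len, span[1] + path_len)
        for span in tail_spans
    ]


# Six reused entries (-1), then tokens for messages 2 and 3 in the tail.
indices = [-1] * 6 + [2, 2, 2, 3, 3]
full = message_token_spans(indices, num_messages=4)
tail = tail_prompt_message_spans(indices, path_len=6, num_messages=4)
assert tail == full == [None, None, (6, 9), (9, 11)]
```

The equality with the full scan holds here only because every entry before path_len is -1, which is the property the bridge is described as guaranteeing for the reused prefix.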
Note
Fix PendingTurn.prompt_message_spans to scan only the bridged attribution tail
Previously, prompt_message_spans computed token spans across the entire prompt, including reused bridge tokens. It now slices message_indices from path_len onward, computes spans only for the tail, and offsets each non-None span by path_len to produce full-prompt-relative positions. The prefix and any reused bridge tokens are returned as None.
Macroscope summarized b422a14.