fix(telemetry): clear span dedup state after chat compression (#3731)#4660
Conversation
After chat compression the conversation is rebuilt and system prompts / tool schemas are re-injected, but the module-level seenHashes Set was never cleared, so post-compaction spans silently lost full content (emitting only hashes). Wire clearDetailedSpanState() into the COMPRESSED branch of GeminiChat.tryCompress() — the single convergence point for all compression paths (pre-send rescue, reactive overflow, manual /compress, and ACP).
📋 Review SummaryThis PR correctly addresses a telemetry data loss issue where OpenTelemetry spans lost full system prompt and tool schema content after chat compression in long-running sessions. The implementation is minimal, focused, and well-tested—adding a single call to 🔍 General Feedback
🎯 Specific Feedback🔵 Low
✅ Highlights
|
There was a problem hiding this comment.
Pull request overview
This PR fixes a telemetry deduplication edge case where post-compression spans could stop emitting full system prompt / tool schema bodies (falling back to hash/preview only) because the process-global seenHashes state was not being cleared when a chat is compacted.
Changes:
- Clears detailed-span dedup state on successful
GeminiChat.tryCompress()compaction to ensure rebuilt prompts/tools re-emit full content on subsequent spans. - Exports
clearDetailedSpanState()from the telemetry barrel (packages/core/src/telemetry/index.ts) for broader internal use. - Adds a regression test confirming tool schema full bodies are emitted again after clearing dedup state; updates the
seenHashescomment to match runtime behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| packages/core/src/telemetry/index.ts | Re-exports clearDetailedSpanState() from the telemetry barrel. |
| packages/core/src/telemetry/detailed-span-attributes.ts | Updates seenHashes comment to reflect that dedup state is cleared after chat compression. |
| packages/core/src/telemetry/detailed-span-attributes.test.ts | Adds a test verifying tool schema events are re-emitted after clearDetailedSpanState(). |
| packages/core/src/core/geminiChat.ts | Calls clearDetailedSpanState() after successful chat compression. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Code Coverage Summary
CLI Package - Full Text ReportCore Package - Full Text ReportFor detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run. |
wenshao
left a comment
There was a problem hiding this comment.
No issues found. LGTM! ✅ Clean, focused change that correctly wires clearDetailedSpanState() into the compression success path. All compression paths (auto, reactive, manual) converge on tryCompress(), so the single call site provides complete coverage. The updated comment and new test are solid. — qwen3.7-max via Qwen Code /review
Verification ReportEnvironment: macOS Darwin 25.4.0 (Apple Silicon), Node.js, tmux parallel execution Test Results
Code Review Observations
Verdict✅ Ready to merge — Focused, minimal fix with proper test coverage. All tests pass, types check cleanly, builds succeed. Verified by: wenshao |
tanzhenxin
left a comment
There was a problem hiding this comment.
Review
This wires the existing span-dedup clear into the single compaction convergence point, so that after a chat compression rebuilds the conversation and re-injects the system prompt and tool schemas, the next telemetry span re-emits their full bodies instead of silently dropping them to hashes. The fix is correct and well-placed — the compaction path it hooks is the common convergence point for all compaction triggers (hard rescue, reactive overflow, manual compress, ACP), so one call site covers them all, and it runs before the next request's span is built. It reuses an already-tested helper rather than adding new state.
One non-blocking note: the dedup set stays process-global and shared across concurrent sessions and subagents, so a sibling session compacting will clear it out from under the main session — but that only ever causes a harmless re-emit of full content, never a dropped span, and the PR acknowledges it. The added test covers the clear-then-re-emit behavior; an assertion that the compaction path actually invokes the clear would guard against a future refactor silently dropping the wiring.
Verdict
APPROVE — correct, minimal, well-targeted fix; the noted items are non-blocking.
Summary
clearDetailedSpanState()intoGeminiChat.tryCompress()COMPRESSED branch so post-compaction OTel spans re-emit full system prompt / tool schema content instead of just hashesclearDetailedSpanStatefrom telemetry barrel (index.ts)seenHashesthat claimed "intentionally never cleared in production"Context
detailed-span-attributes.tsuses a module-levelseenHashes: Set<string>to deduplicate system prompt / tool schema full content on spans. After chat compression, the conversation is rebuilt and these contents are re-injected, butseenHasheswas never cleared — post-compression spans silently lost full content (emitting only hash + preview). This matters for long-running daemon/qwen-serve sessions with multiple compressions.tryCompress()is the single convergence point for all compression paths (pre-send hard-tier rescue, reactive overflow, manual/compress, ACP), so one call site covers everything.Test plan
npx vitest run src/telemetry/detailed-span-attributes.test.ts— 29 tests pass (1 new)npx vitest run src/core/geminiChat.test.ts— 150 tests passnpx tsc --noEmit— cleanOut of scope
/clear,/reset,/resumepaths also don't clearseenHashes— pre-existing gap, separate follow-upseenHashes— pre-existing design, clearing is benign (re-emitting full content is better than losing it)Closes part of #3731 (P3 deeper observability —
clearDetailedSpanStateinto chat compression cleanup checklist item)🤖 Generated with Qwen Code