
fix(reflect): don't surface raw harmony tool-call scaffolding as the answer (#2222) #2249

Open: r266-tech wants to merge 1 commit into vectorize-io:main from r266-tech:fix-reflect-harmony-tool-call-leak-2222



@r266-tech

Contributor

Fixes #2222.

Problem

When the reflect LLM is a gpt-oss / harmony-format model served through litellm (e.g. vertex_ai/openai/gpt-oss-*-maas), the provider intermittently fails to parse a tool-call turn into structured tool_calls. The raw harmony scaffolding then lands in message.content, because call_with_tools reads message.content and builds tool_calls only from message.tool_calls. The reflect agent's no-tool-calls branch then accepts that content verbatim as the final answer:

<|start|>assistant<|channel|>commentary to=functions.recall<|message|>{"query": "..."}<|call|>

_clean_answer_text only strips done(...) suffixes, so the harmony tokens pass through. Because no tool was executed, citations are empty too, so the corrupt, unsynthesized output is indistinguishable from a real low-confidence answer and surfaces to end users (reported on v0.8.2).
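To make the gap concrete, here is a minimal sketch of a cleaner that strips only a trailing done(...) call. The name strip_done_suffix and its regex are assumptions for illustration and are not the actual _clean_answer_text implementation; the point is only that a suffix-oriented cleaner has nothing to match in harmony scaffolding.

```python
import re

# Hypothetical stand-in for _clean_answer_text: removes only a trailing done(...) call.
_DONE_SUFFIX = re.compile(r"\s*done\(.*\)\s*$", re.DOTALL)


def strip_done_suffix(text: str) -> str:
    """Drop a trailing done(...) call; leave everything else untouched."""
    return _DONE_SUFFIX.sub("", text)


# The leaked content from the example above.
leaked = (
    "<|start|>assistant<|channel|>commentary to=functions.recall"
    '<|message|>{"query": "..."}<|call|>'
)

# A done(...) suffix is removed from an otherwise normal answer...
cleaned = strip_done_suffix('The answer is 42. done({"answer": "x"})')

# ...but the harmony scaffolding contains no done(...) call, so it passes through as-is.
unchanged = strip_done_suffix(leaked)
```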

Fix

In the if not result.tool_calls: branch, detect leaked tool-call scaffolding before accepting result.content as the answer. When detected, fall through to the existing forced final-synthesis path (build_final_prompt), which re-synthesizes from the gathered evidence instead of echoing the scaffolding.
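The routing described above could look roughly like the sketch below. The function resolve_no_tool_call_turn and its callable parameters are hypothetical names introduced for illustration; in the PR the detector is _looks_like_unparsed_tool_call and the fallback is the existing build_final_prompt path. Treating empty content the same way as leaked scaffolding is an assumption of this sketch, not something the PR states.

```python
from typing import Callable, Optional, Sequence


def resolve_no_tool_call_turn(
    content: Optional[str],
    tool_calls: Sequence[object],
    is_leaked_scaffolding: Callable[[str], bool],
    force_final_synthesis: Callable[[], str],
) -> Optional[str]:
    """Return the final answer text, or None when tool calls should be executed."""
    if tool_calls:
        # Normal path: the caller executes the tools and continues the loop.
        return None
    if content and not is_leaked_scaffolding(content):
        # Genuine free-text answer: accept it as before.
        return content.strip()
    # Leaked scaffolding (or, by assumption, empty content): do not echo it.
    # Re-synthesize from the evidence gathered so far instead.
    return force_final_synthesis()
```

Passing the detector and the fallback as callables keeps the branch logic testable without a live model.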

Detection (_looks_like_unparsed_tool_call) is intentionally narrow to avoid rejecting valid answers:

  • Harmony scaffolding is flagged only when a frame token (<|start|>/<|channel|>/<|message|>) co-occurs with call evidence (a <|call|> commit token or the to=functions/to=tool routing namespace). An answer that merely quotes one of these tokens while explaining the format has no such pairing and is left untouched.
  • A done() payload emitted as bare JSON is flagged only when the object carries the tool's internal citation id fields (memory_ids/mental_model_ids/observation_ids), which never appear in a real free-text answer. A plain {"answer": ...} blob with no id fields is deliberately left alone, being indistinguishable from a valid JSON answer.
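The two detection rules above can be sketched as follows. This is a minimal illustration built only from the token and field names listed in the bullets; the function name looks_like_unparsed_tool_call (without the leading underscore) and the exact matching strategy are assumptions, and the real helper in the PR may differ.

```python
import json

# Token and field names taken from the PR description; the matching logic is assumed.
FRAME_TOKENS = ("<|start|>", "<|channel|>", "<|message|>")
CALL_EVIDENCE = ("<|call|>", "to=functions", "to=tool")
CITATION_ID_FIELDS = ("memory_ids", "mental_model_ids", "observation_ids")


def looks_like_unparsed_tool_call(content: str) -> bool:
    """Hypothetical stand-in for _looks_like_unparsed_tool_call."""
    if not content:
        return False
    text = content.strip()

    # Rule 1: a frame token must co-occur with call evidence.
    has_frame = any(tok in text for tok in FRAME_TOKENS)
    has_call = any(marker in text for marker in CALL_EVIDENCE)
    if has_frame and has_call:
        return True

    # Rule 2: bare JSON object carrying the tool's internal citation id fields.
    if text.startswith("{") and text.endswith("}"):
        try:
            payload = json.loads(text)
        except json.JSONDecodeError:
            return False
        return isinstance(payload, dict) and any(
            field in payload for field in CITATION_ID_FIELDS
        )
    return False
```

Under these assumptions, an answer that only mentions a single frame token while explaining the format is not flagged, and neither is a plain {"answer": ...} object with no id fields.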

Known trade-off: an answer that embeds a full harmony tool-call example verbatim (frame token + call evidence together) would be re-synthesized rather than echoed. This is accepted as non-destructive (the fallback reuses the same gathered evidence; no data loss or error) and vanishingly rare for a memory-synthesis answer.

Tests

tests/test_reflect_agent.py:

  • TestUnparsedToolCallDetection — positive cases (harmony call structures + leaked done-JSON with id fields) and negative cases (normal prose, answers explaining harmony tokens, bare {"answer": ...} JSON).
  • test_unparsed_harmony_scaffolding_is_not_returned_as_answer / test_leaked_done_payload_with_ids_is_not_returned_as_answer — assert the corrupt content never reaches result.text and the agent routes to final synthesis.

This is the defense-in-depth fix (option B) from the issue; a deeper litellm-side harmony parser (option A) is out of scope here.

…answer (vectorize-io#2222)

When a provider (e.g. gpt-oss / harmony-format models via litellm Vertex MaaS)
fails to parse a tool-call turn into structured tool_calls, the raw harmony
channel scaffolding (or a done() payload emitted as bare JSON with the tool's
internal citation id fields) lands in message.content with tool_calls empty.
The reflect agent then accepted that content verbatim as the final answer,
surfacing corrupt, unsynthesized text (with empty citations) to end users.

Detect that leaked scaffolding in the no-tool_calls branch and route it to the
existing forced final-synthesis path instead of returning it. Detection is
narrowly gated to avoid rejecting valid answers.

Successfully merging this pull request may close these issues.

reflect: gpt-oss harmony tool-call tokens leak into answer text (litellm MaaS, v0.8.2)
