reflect: gpt-oss harmony tool-call tokens leak into answer text (litellm MaaS, v0.8.2)

## Summary

`reflect` intermittently returns the model's **raw harmony tool-call scaffolding as the answer text** instead of a synthesized answer, when the reflect LLM is a gpt-oss / harmony-format model (e.g. `vertex_ai/openai/gpt-oss-120b-maas` via litellm). The `text` field comes back as e.g.:

```
<|start|>assistant<|channel|>commentary to=functions.recall<|message|>{"query": "..."}<|call|>
```

with `based_on` empty (no tool was actually executed). A second variant leaks the `done(...)` payload as a bare JSON blob:

```
{"answer":"I'm sorry, but I don't have that information."}
```

Reproduced on **v0.8.2**.

## Impact

The reflect endpoint silently returns corrupt, unsynthesized output as if it were a valid answer. Downstream consumers cannot distinguish it from a genuine low-confidence answer, since both carry empty citations and a non-empty `text`, so the garbage surfaces to end users.

## Root cause

Two layers combine:

1. **litellm provider doesn't parse harmony tool calls.** In `engine/providers/litellm_llm.py::call_with_tools` (around lines 360 / 366), the provider reads `content = message.content` and only populates `tool_calls` when litellm has set `message.tool_calls`. The `vertex_ai/openai/gpt-oss-*-maas` adapter intermittently fails to parse the model's harmony tool-call turn into structured `message.tool_calls`, so the raw `<|channel|>...to=functions...<|call|>` text lands in `message.content` and `tool_calls` is empty.

2. **The reflect agent treats leftover content as a final answer.** `engine/reflect/agent.py:744` (`if not result.tool_calls:`) interprets empty `tool_calls` plus non-empty `content` as "the LLM wants to respond with text," and lines 753-754 return `_clean_answer_text(result.content.strip())`. But `_clean_answer_text` (lines 101-109) **only strips `done(...)` patterns**; it has no handling for harmony channel/tool-call tokens, so the scaffolding passes through verbatim. With zero tools executed, `based_on.memories` is empty too, which yields low confidence and no citations.

## Reproduction

Against a bank with data, with the reflect model set to `vertex_ai/openai/gpt-oss-120b-maas` (litellm), call reflect repeatedly:

```bash
curl -s -X POST http://localhost:8888/v1/default/banks/<bank>/reflect \
  -H 'Content-Type: application/json' \
  -d '{"query":"summarize recent activity","budget":"high"}'
```
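To put a number on the leak rate, the curl call above can be wrapped in a small loop. The following is a minimal sketch, not part of the project: `is_leak`, `post_reflect`, and `count_leaks` are hypothetical helper names, and it assumes the response body is a JSON object with a top-level `text` field, as described in the Summary.

```python
import json
import urllib.request

# Markers taken from the leaked samples shown in this issue.
HARMONY_MARKERS = ("<|start|>", "<|channel|>", "<|message|>", "<|call|>", "to=functions")


def is_leak(text: str) -> bool:
    """Return True if `text` looks like leaked scaffolding rather than an answer."""
    if any(marker in text for marker in HARMONY_MARKERS):
        return True
    # Second variant: the done(...) payload surfaced as a bare JSON object.
    stripped = text.strip()
    if stripped.startswith("{") and stripped.endswith("}"):
        try:
            payload = json.loads(stripped)
        except json.JSONDecodeError:
            return False
        return isinstance(payload, dict) and "answer" in payload
    return False


def post_reflect(url: str, query: str, budget: str) -> dict:
    """POST one reflect request and return the decoded JSON body."""
    body = json.dumps({"query": query, "budget": budget}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def count_leaks(fetch, runs: int) -> int:
    """Call `fetch()` `runs` times and count responses whose `text` leaked."""
    return sum(1 for _ in range(runs) if is_leak(fetch().get("text") or ""))


# Example (needs a running server; <bank> is left unfilled as in the curl command):
# url = "http://localhost:8888/v1/default/banks/<bank>/reflect"
# leaks = count_leaks(lambda: post_reflect(url, "summarize recent activity", "high"), runs=6)
```

Passing the fetcher in as a callable keeps the counting logic testable without a live server.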

It is **probabilistic and budget-correlated**: in our measurements `high` leaked in ~4/6 runs and `low` in ~1/3, while some queries came back clean 5/5. More iterations (higher budget, larger context) mean more tool-call turns, and each turn is another chance for litellm to drop one. So `mid`/`low` are not safe either; they just leak less often.
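The budget correlation follows from simple compounding. As a rough sketch, assume each tool-call turn is dropped independently with some per-turn rate `p`; a run with `n` turns then leaks with probability `1 - (1 - p)^n`. Both the independence assumption and the `p = 0.1` used below are illustrative and were not measured for this issue.

```python
def p_at_least_one_leak(per_turn_drop_rate: float, tool_call_turns: int) -> float:
    """Probability that at least one of `tool_call_turns` turns is dropped,
    assuming each turn fails independently at `per_turn_drop_rate`."""
    return 1.0 - (1.0 - per_turn_drop_rate) ** tool_call_turns


# Hypothetical per-turn drop rate of 10%, for illustration only.
for turns in (2, 5, 10):
    print(turns, round(p_at_least_one_leak(0.1, turns), 2))
# prints:
# 2 0.19
# 5 0.41
# 10 0.65
```

Even a modest per-turn rate compounds quickly, which is consistent with lower budgets leaking less often rather than never.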

## Suggested fix (either, ideally both)

- **A (real fix):** ensure litellm parses gpt-oss harmony channels into structured `tool_calls` for the Vertex MaaS gpt-oss endpoint (correct `custom_llm_provider` / parsing), so the agent never sees raw harmony tokens.
- **B (defense in depth, in `agent.py`):** at the `if not result.tool_calls:` branch, before accepting `result.content` as the answer, detect harmony / raw-tool-call markers (`<|start|>`, `<|channel|>`, `<|message|>`, `<|call|>`, `to=functions`) or a bare `{"answer":...}` blob, and treat it as a failed tool-parse. From there, either re-parse the harmony turn into a real tool call and continue the loop, or fall through to the existing `build_final_prompt` final-synthesis path (which already handles the no-content case at lines 817+). Extending `_clean_answer_text` to also strip/reject harmony scaffolding would at minimum stop the corrupt text from being surfaced.
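Option B could look roughly like the sketch below. It is not the project's code: `parse_harmony_tool_call`, `extract_answer_blob`, and `classify_content` are hypothetical names, and the regex is built only from the single leaked sample in the Summary, so other harmony layouts may need a looser pattern.

```python
import json
import re
from typing import Optional, Tuple

HARMONY_MARKERS = ("<|start|>", "<|channel|>", "<|message|>", "<|call|>", "to=functions")

# Matches e.g. `to=functions.recall<|message|>{"query": "..."}<|call|>`.
_TOOL_CALL_RE = re.compile(
    r"to=functions\.(?P<name>[A-Za-z0-9_]+).*?<\|message\|>(?P<args>.*?)(?:<\|call\|>|$)",
    re.DOTALL,
)


def parse_harmony_tool_call(content: str) -> Optional[Tuple[str, dict]]:
    """Recover (tool_name, arguments) from a raw harmony tool-call turn, if possible."""
    match = _TOOL_CALL_RE.search(content)
    if match is None:
        return None
    try:
        args = json.loads(match.group("args").strip())
    except json.JSONDecodeError:
        return None
    if not isinstance(args, dict):
        return None
    return match.group("name"), args


def extract_answer_blob(content: str) -> Optional[str]:
    """Return the answer string if `content` is a bare {"answer": "..."} object."""
    stripped = content.strip()
    if not (stripped.startswith("{") and stripped.endswith("}")):
        return None
    try:
        payload = json.loads(stripped)
    except json.JSONDecodeError:
        return None
    if isinstance(payload, dict) and isinstance(payload.get("answer"), str):
        return payload["answer"]
    return None


def classify_content(content: str) -> Tuple[str, object]:
    """Classify content that arrived with empty tool_calls before trusting it."""
    tool_call = parse_harmony_tool_call(content)
    if tool_call is not None:
        return "tool_call", tool_call  # re-dispatch the tool and continue the loop
    answer = extract_answer_blob(content)
    if answer is not None:
        return "answer_blob", answer  # leaked done(...) payload
    if any(marker in content for marker in HARMONY_MARKERS):
        return "unparseable_harmony", content  # fall through to final synthesis
    return "text", content  # genuine free-text answer
```

The agent would branch on the returned tag: only `"text"` is accepted as a final answer as-is, while `"unparseable_harmony"` is never surfaced to the caller.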

Happy to provide more raw samples if useful.

