fix(agent): gate tool dispatch on ToolUse terminal to ignore truncated calls #1676
Merged
This was referenced Jun 27, 2026
Closes #1666
Summary
A truncated assistant turn (provider stop reason Length/MAX_TOKENS, a content
filter such as Refusal/Sensitive, or a stream that ended without a terminal
event) that nonetheless surfaced a parsed ToolCall was being executed. The
agent loop decided dispatch purely on the presence of a tool call, with no
gate on the finish reason, so a half-formed tool call cut off at the token limit
could run with incomplete/garbled arguments.
This is the BotNexus analogue of the OpenClaw fix that ignores truncated tool
calls (fix(agent-core): ignore truncated tool calls). The posture is
conservative: only a ToolUse terminal authorizes execution.
Changes
- AgentLoopRunner.RunLoopAsync (primary guard): tool dispatch is now gated
  on the terminal reason in addition to presence. A truncated turn no longer
  reaches ToolExecutor.ExecuteAsync.
- StreamAccumulator (backstop / replay-safety): added
  StripNonExecutableToolCalls, applied on both the DoneEvent final-message
  path and the stream-EOF fallback path. When the terminal is not ToolUse, any
  surfaced tool call is dropped from both AssistantAgentMessage.ToolCalls and
  the ContentBlocks mirror, while the visible text and finish reason are
  retained. This matters because MessageConverter.ToAgentMessage populates
  ContentBlocks with the raw provider content, and ToProviderAssistantMessage
  prefers ContentBlocks when re-serializing a message for a later LLM call --
  so nulling only ToolCalls would leave a re-dispatchable tool call hiding in
  ContentBlocks. A genuine ToolUse turn is returned unchanged.
The two changes are complementary defense-in-depth: the loop guard blocks live
dispatch; the accumulator strip ensures the recorded/replayed message cannot be
re-dispatched by a later code path. The new loop regression test asserts both
properties, so it fails if either fix is missing.
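
The two layers described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the record shapes, the TruncationGuards class, and the ShouldDispatchTools helper are hypothetical stand-ins, and the sketch clears ToolCalls to an empty list where the real code may null it. Only the names StopReason, AssistantAgentMessage, ToolCalls, ContentBlocks, ToolCallContent, and StripNonExecutableToolCalls come from the description.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins (assumptions); the real BotNexus types are richer.
public enum StopReason { Stop, Length, ToolUse, Refusal, Sensitive }

public abstract record ContentBlock;
public sealed record TextContent(string Text) : ContentBlock;
public sealed record ToolCallContent(string Name, string ArgumentsJson) : ContentBlock;

public sealed record AssistantAgentMessage(
    StopReason FinishReason,
    IReadOnlyList<ToolCallContent> ToolCalls,
    IReadOnlyList<ContentBlock> ContentBlocks);

public static class TruncationGuards
{
    // Loop-level guard (hypothetical helper): a tool call being present is
    // necessary but not sufficient; the terminal must also be ToolUse.
    public static bool ShouldDispatchTools(AssistantAgentMessage message) =>
        message.FinishReason == StopReason.ToolUse && message.ToolCalls.Count > 0;

    // Accumulator-level strip: for any non-ToolUse terminal, drop tool calls
    // from BOTH ToolCalls and ContentBlocks so a later re-serialization from
    // ContentBlocks cannot resurrect them. Text and finish reason are kept.
    public static AssistantAgentMessage StripNonExecutableToolCalls(AssistantAgentMessage message)
    {
        if (message.FinishReason == StopReason.ToolUse)
        {
            return message; // genuine tool turn: returned unchanged
        }

        return message with
        {
            ToolCalls = Array.Empty<ToolCallContent>(),
            ContentBlocks = message.ContentBlocks
                .Where(block => block is not ToolCallContent)
                .ToList(),
        };
    }
}
```

In this sketch, a Length turn carrying a partial tool call fails ShouldDispatchTools and comes out of StripNonExecutableToolCalls with only its text block, while a ToolUse turn passes through both untouched.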
Stop-vs-Length posture decision (why the guard is == ToolUse)
The issue notes that "a normal Stop with a complete tool call should keep its
existing promotion behaviour at the provider/parser layer; the loop-level guard
is the backstop", and asks to narrow the strip to only truncation terminals if
any provider legitimately emits a complete tool call under a non-ToolUse
terminal. I verified every provider's stop-reason mapping before choosing the
broad guard:
- AnthropicProvider.MapStopReason: tool_use -> ToolUse, max_tokens -> Length.
- CopilotMessagesProvider: tool_use -> ToolUse, max_tokens -> Length.
- CompletionsStreamEngine.MapStopReason: tool_calls/function_call -> ToolUse,
  length -> Length.
- ResponsesStreamParser: explicitly promotes a complete tool call from Stop to
  ToolUse
  (if (contentBlocks.OfType<ToolCallContent>().Any() && stopReason == Stop) stopReason = ToolUse;),
  and maps incomplete -> Length.
- OpenAICompatProvider: tool_calls/function_call -> ToolUse, and
  null + hasToolCalls -> ToolUse; length -> Length.
- ... ToolUse; length -> Length.
Conclusion: every legitimate, complete tool call is promoted to
StopReason.ToolUse at the provider/parser layer; a truncated call retains
Length (or Refusal/Sensitive for content filters, Stop only when no tool call
is present). No provider emits a complete tool call under a non-ToolUse
terminal. Therefore the broad guard FinishReason == ToolUse (block "anything
not ToolUse" that carries a tool call) is correct -- it blocks
only the truncated/filtered case and never a real tool turn -- and it mirrors
OpenClaw's conservative posture without needing to enumerate
Length / content-filter / EOF explicitly. The strip is symmetric (also "not
ToolUse"), which is safe for the same reason and additionally covers the
stream-EOF fallback the issue calls out.
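
To make the provider-layer behaviour the guard relies on concrete, here is a rough sketch of the two mapping patterns listed above. It is not the providers' actual code: StopReasonMapping, MapCompatFinishReason, and PromoteCompleteToolCall are hypothetical names, and the fallback to Stop for any other value is an assumption that the description does not confirm.

```csharp
#nullable enable
using System;

public enum StopReason { Stop, Length, ToolUse, Refusal, Sensitive }

public static class StopReasonMapping
{
    // Completions-style mapping as described in the list above:
    // tool_calls/function_call -> ToolUse, length -> Length,
    // and a missing finish reason with tool calls present -> ToolUse.
    public static StopReason MapCompatFinishReason(string? raw, bool hasToolCalls) => raw switch
    {
        "tool_calls" or "function_call" => StopReason.ToolUse,
        "length" => StopReason.Length,
        null when hasToolCalls => StopReason.ToolUse,
        _ => StopReason.Stop, // assumption: everything else is treated as a plain stop
    };

    // Responses-style promotion: a complete tool call reported under Stop is
    // promoted to ToolUse; every other terminal (e.g. Length) is left alone.
    public static StopReason PromoteCompleteToolCall(StopReason reason, bool hasToolCalls) =>
        reason == StopReason.Stop && hasToolCalls ? StopReason.ToolUse : reason;
}
```

Under these mappings a tool call that reaches the loop with Length has by construction been cut off, which is why a single == ToolUse check suffices without listing the truncation terminals.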
Tests (TDD, RED -> GREEN)
New tests written before the fix:
- AgentLoopRunnerTruncatedToolCallTests
  - TruncatedToolCallTurn_IsNotDispatchedAndIsStrippedFromPersistedMessage:
    a scripted [StopReason.Length, content=[text, partial toolCall]] turn must
    NOT invoke the recording tool (ExecuteCount == 0) and the persisted
    assistant message must retain text + Length finish reason but carry no tool
    call (neither ToolCalls nor a ToolCallContent block), and no tool-result
    message is produced.
  - CompleteToolUseTurn_IsDispatched: a normal ToolUse turn still dispatches
    (ExecuteCount == 1, one tool-result message) -- no regression of legitimate
    tool execution.
- StreamAccumulatorTruncatedToolCallTests: unit-level strip behaviour for
  Length and Sensitive terminals (tool call stripped, text + reason kept) and
  ToolUse (tool call retained).
RED -> GREEN proof (manual, non-incremental rebuild between each toggle to avoid
the stale-DLL trap):
- ... ExecuteCount is 1.
- ... ExecuteCount passes (guard blocks live dispatch) but the
  persisted-message assertion fails -- proving the strip is
  independently load-bearing for replay-safety.
Full impacted gate: scripts/repo/test-impacted.ps1 -> All impacted tests
passed (22 projects, including the always-run Architecture.Tests and
Scenarios.Tests; BotNexus.Agent.Core.Tests reports 217 passed / 0 failed).
Merge Notes
This change is file-disjoint from the other open PRs (#1672, #1673, #1674,
#1675): it touches only BotNexus.Agent.Core loop code
(Loop/AgentLoopRunner.cs, Loop/StreamAccumulator.cs) and adds two new test
files under BotNexus.Agent.Core.Tests/Loop/. No shared files, no provider or
extension changes. Safe to merge in parallel with those PRs.