fix(copilot): bound streaming SSE body to prevent unbounded read OOM by agent-farnsworth[bot] · Pull Request #1674 · sytone/botnexus

agent-farnsworth · 2026-06-27T19:28:35Z

Summary

The GitHub Copilot Anthropic-Messages SSE streaming parser
(CopilotMessagesStreamParser.ProcessStreamAsync) read the success-response
body line-by-line with no accumulated-byte tracking and no cap. A hostile or
malfunctioning Copilot endpoint that streams an unbounded SSE body -- or a
single never-terminating data: line with no newline -- would buffer without
bound and could exhaust gateway memory (OOM-DoS of the gateway).

This is the streaming complement to #1653, which added
BoundedHttpContent.ReadFromJsonWithLimitAsync and bounded only the
non-streaming Copilot path (discovery JSON + the error-response body). The
streaming success path was still unbounded; this PR closes that gap.

Changes

1. New ByteCountingStream (src/agent/BotNexus.Agent.Providers.Copilot/Streaming/ByteCountingStream.cs):
   a small, reusable, read-only Stream wrapper -- the C# equivalent of
   OpenClaw's createSseByteGuard. It tracks, as bytes are read:
   - a total cap (16 MiB) -- aggregate bytes across the whole body, and
   - a per-frame cap (64 KiB) -- bytes since the last newline (\n).
   On either overflow it throws the canonical
   ResponseContentTooLargeException (reused from BoundedHttpContent, #1653)
   and aborts the read mid-flight, so the oversized body is never fully
   materialized. The frame counter resets at every \n, so normal multi-frame
   streams never trip it.
2. Bounded the read loop in CopilotMessagesStreamParser.ProcessStreamAsync:
   the parser now receives the raw response Stream, wraps it in a
   ByteCountingStream, and constructs its StreamReader over the bounded
   stream. Because every byte the reader consumes flows through the guard, an
   unbounded body or a single endless data: line trips the cap regardless of
   how StreamReader.ReadLineAsync buffers internally (this is what catches the
   "single never-terminating line with no \n" case).
3. Updated the single call site in
   CopilotMessagesProvider.StreamCoreAsync to hand the raw response stream to
   the parser (the parser now owns the bounded reader). The using StreamReader
   that previously lived in the provider moved into the parser.
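
For readers without the diff open, the guard described above can be sketched roughly as follows. This is a minimal illustration, not the merged ByteCountingStream: the type names ByteCountingStreamSketch and SketchContentTooLargeException are hypothetical stand-ins, and details such as whether the newline byte counts toward the frame, or whether the comparison is strict, are assumptions.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the project's ResponseContentTooLargeException.
public sealed class SketchContentTooLargeException : IOException
{
    public SketchContentTooLargeException(long maxBytes, long observedBytes)
        : base($"Response body exceeded {maxBytes} bytes (observed {observedBytes}).")
    {
        MaxBytes = maxBytes;
        ObservedBytes = observedBytes;
    }

    public long MaxBytes { get; }
    public long ObservedBytes { get; }
}

// Read-only wrapper: counts every byte returned by the inner stream and throws on overflow.
public sealed class ByteCountingStreamSketch : Stream
{
    private readonly Stream _inner;
    private readonly long _maxTotalBytes;
    private readonly long _maxFrameBytes;
    private long _totalBytes;
    private long _frameBytes;

    public ByteCountingStreamSketch(Stream inner, long maxTotalBytes, long maxFrameBytes)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
        _maxTotalBytes = maxTotalBytes;
        _maxFrameBytes = maxFrameBytes;
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int read = _inner.Read(buffer, offset, count);
        Account(buffer.AsSpan(offset, read));
        return read;
    }

    public override async ValueTask<int> ReadAsync(Memory<byte> buffer, CancellationToken cancellationToken = default)
    {
        int read = await _inner.ReadAsync(buffer, cancellationToken).ConfigureAwait(false);
        Account(buffer.Span.Slice(0, read));
        return read;
    }

    public override Task<int> ReadAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken)
        => ReadAsync(buffer.AsMemory(offset, count), cancellationToken).AsTask();

    // Updates both counters for the bytes just read; assumed semantics, see lead-in.
    private void Account(ReadOnlySpan<byte> chunk)
    {
        foreach (byte b in chunk)
        {
            if (++_totalBytes > _maxTotalBytes)
                throw new SketchContentTooLargeException(_maxTotalBytes, _totalBytes);

            if (b == (byte)'\n')
            {
                _frameBytes = 0; // a newline ends the current frame
            }
            else if (++_frameBytes > _maxFrameBytes)
            {
                throw new SketchContentTooLargeException(_maxFrameBytes, _frameBytes);
            }
        }
    }

    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();

    protected override void Dispose(bool disposing)
    {
        if (disposing) _inner.Dispose();
        base.Dispose(disposing);
    }
}
```

Under these assumptions, a parser would construct its reader as new StreamReader(new ByteCountingStreamSketch(responseStream, maxTotal, maxFrame)). Counting at the Stream layer rather than inside the line loop is what lets the cap fire while ReadLineAsync is still accumulating a line that never ends.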

Cap values

- Total cap reuses the existing constant directly:
  private const long MaxResponseBytes = BoundedHttpContent.DefaultMaxResponseBytes;
  (16 MiB). DefaultMaxResponseBytes is public const in
  BotNexus.Agent.Providers.Core.Utilities and the Copilot project already
  references that namespace, so no duplicated literal was introduced -- the
  streaming and non-streaming paths agree on a legitimate body size by
  construction.
- Per-frame cap is a local const long MaxFrameBytes = 64L * 1024; (64 KiB),
  documented inline.

The overflow error shape is aligned with the non-streaming path: the same
ResponseContentTooLargeException(maxBytes, observedBytes) type is thrown, so
the provider's existing catch (Exception ex) surfaces it as a
StopReason.Error stream result.
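
To make the error-shape point concrete, here is a small sketch of how a blanket catch turns a cap overflow into an error result. All names below (SketchStopReason, SketchStreamResult, SketchTooLargeException, SketchProvider) are hypothetical stand-ins rather than BotNexus types, and the literal cap values are written out only so the example is self-contained; per the notes above, the real total cap references BoundedHttpContent.DefaultMaxResponseBytes instead of a literal.

```csharp
using System;
using System.IO;

public enum SketchStopReason { Stop, Error }

// Hypothetical result type; the real stream result carries more fields.
public sealed record SketchStreamResult(SketchStopReason StopReason, string ErrorMessage);

// Hypothetical stand-in mirroring the (maxBytes, observedBytes) constructor shape.
public sealed class SketchTooLargeException : IOException
{
    public SketchTooLargeException(long maxBytes, long observedBytes)
        : base($"Response body exceeded {maxBytes} bytes (observed {observedBytes}).")
    {
    }
}

public static class SketchProvider
{
    public const long MaxResponseBytes = 16L * 1024 * 1024; // 16 MiB = 16,777,216 bytes
    public const long MaxFrameBytes = 64L * 1024;           // 64 KiB = 65,536 bytes

    // Runs a read loop and maps any exception, including a cap overflow, to an error result.
    public static SketchStreamResult Run(Action readLoop)
    {
        try
        {
            readLoop();
            return new SketchStreamResult(SketchStopReason.Stop, string.Empty);
        }
        catch (Exception ex)
        {
            return new SketchStreamResult(SketchStopReason.Error, ex.Message);
        }
    }
}
```

Because the overflow is an ordinary exception, no new catch clause is needed on the provider side; the same handler that reports other stream failures reports this one.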

Behaviour for normal streams is unchanged

The chunk-by-chunk parsing is byte-identical for well-formed, under-cap
streams; the guard only trips on overflow. Verified by a dedicated regression
test plus the full pre-existing Copilot wire-replay suite.

Tests (TDD)

Added tests/agent/BotNexus.Agent.Providers.Copilot.Tests/Messages/CopilotMessagesStreamGuardTests.cs
(7 tests):

1. Total body over cap (e2e through the real provider + parser) -- a 200 SSE
   response whose total body exceeds 16 MiB surfaces the overflow error and is
   not buffered unbounded.
2. Single never-terminating data: line (e2e) -- one ~512 KiB data: line
   with no \n trips the per-frame cap (stays under the total cap, so it
   exercises the frame guard specifically).
3. Normal well-formed stream (e2e regression guard) -- parses correctly and
   is unaffected (non-error, populated usage, non-empty content).
4-7. Direct unit tests for ByteCountingStream: pass-through under cap,
   total-cap trip, per-frame-cap trip, and frame-counter-resets-on-newline.

All feed bytes via MemoryStream / a custom hostile Stream.
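
As an illustration of what a "custom hostile Stream" could look like, the sketch below serves a data: prefix followed by filler bytes forever and never emits a newline. The class name EndlessLineStreamSketch and its BytesServed property are hypothetical; the test double in the PR may be shaped differently (for example, it may stop at roughly 512 KiB rather than run forever).

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical test double: an SSE "line" that never terminates.
public sealed class EndlessLineStreamSketch : Stream
{
    private static readonly byte[] Prefix = Encoding.ASCII.GetBytes("data: ");
    private long _served;

    // Total bytes handed out so far; lets a test assert the read stopped early.
    public long BytesServed => _served;

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    // Always fills the whole buffer: first the prefix, then 'x' forever, never '\n'.
    public override int Read(byte[] buffer, int offset, int count)
    {
        for (int i = 0; i < count; i++)
        {
            buffer[offset + i] = _served < Prefix.Length ? Prefix[_served] : (byte)'x';
            _served++;
        }
        return count;
    }

    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```

A stream like this should only ever be read through a guard or a fixed-size read: an unguarded ReadLine over it would grow its buffer until memory runs out, which is the failure mode the guard tests are meant to pin down.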

RED -> GREEN proof (honest)

1. Wrote the tests first.
2. After implementing the guard, neutralized it (set both parser caps to
   long.MaxValue), forced
   dotnet build <Copilot.Tests>.csproj --no-incremental, then ran the filtered
   suite: the two overflow e2e tests FAILED genuinely
   (result.StopReason should be StopReason.Error but was StopReason.Stop -- the
   17 MiB body and the 512 KiB line were read unbounded). The 4 direct
   ByteCountingStream tests stayed green because they pass explicit small caps
   via the constructor, proving the e2e tests truly depend on the parser's cap
   enforcement.
3. Restored the caps, force-rebuilt with --no-incremental again (to defeat
   the stale-DLL false-green trap), and re-ran: 7/7 passed.

Validation

- scripts/repo/test-impacted.ps1 from the worktree: printed
  "All impacted tests passed." (9 projects, incl. Architecture.Tests and
  Scenarios.Tests safety nets; Copilot tests 120/120).
- Build clean: 0 warnings, 0 errors.
- All added source lines are ASCII-only (no em-dash; ASCII quotes).

Merge Notes

- File-disjoint from all currently-open PRs:
  - #1660 refactor(prompt): hoist GetGatewayData, PascalCase publics, named section order
  - #1662 fix(agent): audit claims per-turn so multi-turn fabrication is not masked
  - #1664 fix(telegram): deliver tool activity as standalone messages with shared cross-channel icon
  - #1665 docs: daily documentation grooming 2026-06-27
  - #1669 fix(cron): purge retention on real terminal statuses (ok/error/timed_out)
  - #1670 refactor(blazor-client): extract ToChatMessage factory, split history loader, unify user echo
  - #1671 refactor(subagent): decompose SpawnAsync into named helpers
  - #1672 fix(blazor-client): observe reconnect fire-and-forget and synchronize shared HashSet state
  - #1673 fix(cron): scope cron create/update target agent to the calling agent
  Those touch BotNexus.Gateway, BotNexus.Cron, BotNexus.Agent.Core, and the
  Blazor client; this PR is confined entirely to
  src/agent/BotNexus.Agent.Providers.Copilot/** and its test project. No
  expected merge conflicts.
- Builds directly on #1653 (non-streaming Copilot bound) and intentionally
  reuses its BoundedHttpContent.DefaultMaxResponseBytes constant and
  ResponseContentTooLargeException type rather than redefining them.
- Anthropic streaming parser deferred (follow-up).
  AnthropicStreamParser.ProcessStreamAsync shares the identical read-loop
  shape and has the same unbounded-read exposure, but it lives in a separate
  project (BotNexus.Agent.Providers.Anthropic). Applying the guard there would
  require either duplicating ByteCountingStream or relocating it to
  BotNexus.Agent.Providers.Core, which broadens scope and risk. Per the
  issue's guidance to "prefer a tight, single-surface PR," it is left as a
  follow-up (move the guard to Core, then apply it to the Anthropic parser
  with its own tests).

The Copilot Anthropic-Messages SSE parser read the success-response body
line-by-line with no accumulated-byte tracking and no cap. A hostile or broken
endpoint that streams an unbounded SSE body -- or a single never-terminating
data: line with no newline -- would buffer without bound and could exhaust
gateway memory (OOM-DoS).

This is the streaming complement to #1653, which bounded only the
non-streaming Copilot path via BoundedHttpContent.

- Add ByteCountingStream: a reusable read-only Stream wrapper in the Copilot
  provider that tracks total bytes against a 16 MiB cap (matching
  BoundedHttpContent.DefaultMaxResponseBytes) and bytes-since-newline against
  a 64 KiB per-frame cap, throwing the canonical
  ResponseContentTooLargeException on overflow and aborting the read
  mid-flight.
- Bound the read loop in CopilotMessagesStreamParser.ProcessStreamAsync by
  wrapping the response stream before the StreamReader, so an unbounded body
  or a single endless data: line trips the cap regardless of how the reader
  buffers.
- Well-formed, under-cap streams are byte-identical: the guard only trips on
  overflow; normal parsing is unchanged.

The Anthropic streaming parser shares the same loop shape but lives in a
separate project and reusing the guard would require relocating it to Core;
deferred to a follow-up to keep this a tight, single-surface change.

Closes #1668

sytone · 2026-06-27T23:03:36Z

CI Health Check -- PR #1674

Check	Status
impacted-tests	pass
CodeQL	pass
Analyze (csharp)	pass
Code Pattern Checks	pass
Dependency Security Audit	pass
Secret Scanning (TruffleHog)	pass

Branch: fix/copilot-sse-byte-guard | Behind main: 0 commits | Mergeable: MERGEABLE

Actions taken:

Synced branch with main (merged origin/main, behindBy 7 -> 0); CI re-triggered.

Blockers for Jon:

None

2026-06-27 23:03 UTC

Farnsworth (automated CI monitor) -- BotNexus -- Last updated: 2026-06-27 23:03 UTC

agent-farnsworth Bot mentioned this pull request Jun 27, 2026

[Security] Copilot streaming SSE parser reads unbounded body (OOM DoS) -- streaming complement to #1653 #1668

Closed

chore: merge main into fix/copilot-sse-byte-guard

9cdd681

sytone merged commit e41a7b2 into main Jun 28, 2026
12 checks passed

sytone deleted the fix/copilot-sse-byte-guard branch June 28, 2026 07:02
