fix(core): add configurable bodyTimeout to prevent streaming timeout with local models#4667
fix(core): add configurable bodyTimeout to prevent streaming timeout with local models#4667doudouOUC wants to merge 6 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
Adds a configurable bodyTimeout to control undici's inter-chunk SSE streaming idle timeout, fixing UND_ERR_BODY_TIMEOUT errors that kill long-running streams to slow local model servers (Ollama, mlx_lm.server). The no-proxy Node.js fetch path now always uses a bundled-undici Agent (instead of Node's built-in fetch) so a custom bodyTimeout (default 0 = disabled) can be applied; the proxy-failure fallback also preserves bodyTimeout instead of dropping it.
Changes:
- Add
bodyTimeout(default0= disabled) toModelGenerationConfig/ContentGeneratorConfigand expose it via CLI settings schema + VS Code companion schema. - Replace the "no dispatcher" fallback in
runtimeFetchOptionswith a cached undiciAgentkeyed bybodyTimeout, with input sanitization (negative/float/NaN/Infinity → 0 with warning) and proxy-failure fallback preserving the timeout. - Thread
bodyTimeoutfromcontentGeneratorConfigintobuildRuntimeFetchOptionscalls in the default/dashscope OpenAI providers and the Anthropic generator; expand tests with 9 new cases.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| packages/core/src/utils/runtimeFetchOptions.ts | Introduce RuntimeFetchBehaviorOptions, a cached no-proxy Agent, sanitizeBodyTimeout, and updated proxy-failure fallback. |
| packages/core/src/utils/runtimeFetchOptions.test.ts | Update existing assertions to expect a dispatcher in no-proxy/fallback paths and add bodyTimeout coverage. |
| packages/core/src/models/types.ts | Add bodyTimeout to ModelGenerationConfig picked fields. |
| packages/core/src/models/constants.ts | Add bodyTimeout to MODEL_GENERATION_CONFIG_FIELDS. |
| packages/core/src/core/contentGenerator.ts | Add bodyTimeout to ContentGeneratorConfig. |
| packages/core/src/core/openaiContentGenerator/provider/default.ts | Pass bodyTimeout when building runtime fetch options. |
| packages/core/src/core/openaiContentGenerator/provider/dashscope.ts | Pass bodyTimeout when building runtime fetch options. |
| packages/core/src/core/anthropicContentGenerator/anthropicContentGenerator.ts | Pass bodyTimeout when building runtime fetch options. |
| packages/cli/src/config/settingsSchema.ts | Expose model.generationConfig.bodyTimeout in settings schema. |
| packages/vscode-ide-companion/schemas/settings.schema.json | Mirror new bodyTimeout field in VS Code companion settings schema. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
📋 Review SummaryThis PR adds a configurable 🔍 General Feedback
🎯 Specific Feedback🟡 High Priority
🟢 Medium Priority
🔵 Low Priority
✅ Highlights
|
Code Coverage Summary
CLI Package - Full Text ReportCore Package - Full Text ReportFor detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run. |
chiga0
left a comment
There was a problem hiding this comment.
Review: fix(core): add configurable bodyTimeout to prevent streaming timeout with local models
PR: #4667 | Author: doudouOUC | +230/-66 across 10 files, 1 commit
Summary
Fixes UND_ERR_BODY_TIMEOUT by adding generationConfig.bodyTimeout (default 0 = disabled) and creating a plain undici Agent with configurable body timeout for the no-proxy Node.js fetch path. Root cause: Node.js built-in fetch uses undici's default 300s bodyTimeout between streaming chunks, which kills slow local model servers (Ollama, mlx_lm.server) during tool call generation.
Review Coverage
Reviewed all 10 files: runtimeFetchOptions.ts (core logic), runtimeFetchOptions.test.ts (9 new tests), types.ts, constants.ts, contentGenerator.ts, default/dashscope providers, anthropic generator, settingsSchema.ts.
Findings
0 blocking findings. Clean, well-scoped bug fix.
Highlights:
RuntimeFetchBehaviorOptionsis a clean, extensible interface for behavior optionsgetOrCreateNoProxyDispatchercaches dispatchers bybodyTimeout— no repeated Agent creationsanitizeBodyTimeouthandles all edge cases (undefined/null/non-integer/negative/NaN/Infinity → 0 with warning log)- All 3 code paths (proxy / no-proxy / proxy-failure fallback) now consistently use custom dispatchers pinned to bundled undici — eliminates the undici version-mismatch risk that the old code was avoiding by not pinning
- Proxy-failure fallback now preserves
bodyTimeoutinstead of silently dropping it - All 3 providers (Default OpenAI, DashScope, Anthropic) thread
bodyTimeoutconsistently NO_DISPATCHER_FALLBACKcorrectly removed — every path now uses explicit dispatchers- 9 new tests: default/custom bodyTimeout, cache isolation by value, sanitize edge cases (7 parametrized), proxy-failure fallback preservation, fetch pinning, different dispatchers for different values
requiresRestart: falsein settings schema — timeout changes take effect without restart
LGTM.
— QoderWork automated review
LaZzyMan
left a comment
There was a problem hiding this comment.
Thanks for picking this up — design is right and the test coverage is solid (cache isolation by bodyTimeout value, sanitize edge cases, proxy-failure fallback preservation all covered). Reviewed against #4604 + #4657 as the target use cases.
Discussion request before merge
Default behavior changes for all SDK users, not just local models. Today the no-proxy Node.js path uses built-in fetch with undici's 5-minute idle bodyTimeout. After this PR, the same path uses bundled undici with bodyTimeout: 0 (disabled). That affects hosted-API users (api.openai.com, dashscope, etc.), not just Ollama/mlx_lm.server — the SDK-level timeout (default 120s) becomes the only bound on a hung connection. I think that's fine because the SDK timeout is a strict superset of the per-chunk idle timeout, but it would be worth calling out explicitly in the PR description / commit message so future "stuck request against hosted API" reports don't have to re-derive this from git blame.
Also touches the Anthropic SDK no-proxy path (anthropicContentGenerator.ts), not just OpenAI — worth noting in the PR body since it currently reads as OpenAI-only.
Suggestions
Worth a follow-up commit
keepAliveTimeoutasymmetry —getOrCreateSharedDispatcher(proxy path) setskeepAliveTimeout: 60_000, butgetOrCreateNoProxyDispatcherdoesn't, so it inherits undici's default 4s. For interactive local-model sessions where the user types a prompt every 30s–2min, TCP connections get torn down between turns. Recommend addingkeepAliveTimeout: 60_000to match the proxy path.
Lower priority / can defer
headersTimeout: 0hardcoded — if a user raises SDK timeout to 1h and the server never responds with headers, the request hangs that long. Not a current bug since SDK timeout still bounds it. Defer until someone actually asks.resetDispatcherCache()doesn't close dispatchers —.clear()leaves the oldAgent/ProxyAgentsocket pools dangling. Pre-existing, not introduced here. Easy to fix in a follow-up since you've already addednoProxyDispatcherCache.clear().
Coordination with #4605
This PR is a superset of #4605 (which hardcodes bodyTimeout: 0). Recommend marking #4605 as superseded in the description so the maintainer triage is unambiguous. Whichever lands first, the other will need rebasing.
What looks good
- Cache key by
bodyTimeoutvalue is the right abstraction — avoids per-request allocation while supporting per-model overrides - Proxy-failure fallback now preserves
bodyTimeoutinstead of silently dropping it — quiet but real bug fix on top of the main change sanitizeBodyTimeoutdefensive against float/NaN/Infinity/negative — undici constructor is strict about this, and the warning log is the right level- Type propagation through
ContentGeneratorConfig→ModelGenerationConfig→MODEL_GENERATION_CONFIG_FIELDS→ settings schema is complete; nothing dropped - VSCode companion schema synced — easy to miss
Overall: approve with a small ask to clarify the default-change scope in the PR body and (ideally) one follow-up commit for keepAliveTimeout.
Addressed LaZzyMan's review feedback
|
Addressed wenshao's review (ada094c)
|
Verification Report — PR #4667Commit: Test Results
Test File Breakdown
Execution Environment
VerdictAll 5 checks pass. 136 tests cover the changed runtimeFetchOptions and provider files (including 9 new tests for bodyTimeout). The CLI typecheck error ( |
Addressed wenshao's round 2 review (5d45ffa)
|
wenshao
left a comment
There was a problem hiding this comment.
[Suggestion] The comment in packages/cli/src/utils/apiPreconnect.ts:170-173 says "SDK uses built-in fetch with its own connection pool" and "mirrors buildFetchOptionsWithDispatcher() which also skips custom dispatcher creation when no proxy is set" — both claims are now false since this PR creates a custom Agent for the no-proxy path. Consider updating the comment to reflect the new behavior, e.g.:
// No-proxy: SDK now uses a custom Agent (noProxyDispatcherCache),
// but preconnect fires before the Agent is created and uses a different
// pool. Keep skipping — the Agent's lazy creation on first SDK call
// is sufficient for the no-proxy path.
— qwen3.7-max via Qwen Code /review
wenshao round 3 — all deferred/not-taking (no code changes)
No code changes this round — all suggestions are either edge-case hardening or test coverage enhancements that don't affect the core bug fix. |
wenshao
left a comment
There was a problem hiding this comment.
No issues found. LGTM! ✅ — qwen3.7-max via Qwen Code /review
…with local models Node.js built-in fetch uses undici's default 300s bodyTimeout between SSE chunks. Local model servers (Ollama, mlx_lm.server) generating tool call JSON can pause longer than this, causing "Body Timeout Error". The proxy and Bun paths already disabled this timeout but the no-proxy Node.js path did not. Add `generationConfig.bodyTimeout` (default 0 = disabled) so all Node.js paths now use a plain undici Agent with configurable body timeout. Invalid values are sanitized at runtime. Proxy-failure fallback also preserves the user's bodyTimeout setting. Fixes #4604 Fixes #4657
Match the proxy path's keepAliveTimeout: 60_000 so TCP connections to local model servers survive between interactive turns (default 4s is too short for 30s–2min prompt intervals).
…ptions
The no-proxy path now always returns an object (never undefined), so the
type union with undefined is dead. Remove it and drop the || {} guards
in callers. Also update the stale comment block describing fetch behavior.
The type no longer includes undefined, so test mocks must return {}
instead of undefined to satisfy TypeScript.
…ertions - Log debug warning when bodyTimeout is set on Bun runtime (ignored) - Add test pinning proxy path contract (ignores user bodyTimeout) - Add mockWarn assertions to sanitize edge-case tests
- Prefix sdkType with _ in buildFetchOptionsWithDispatcher (unused after NO_DISPATCHER_FALLBACK removal but kept for overload dispatch) - Fix timeout comment: "hard deadline from request start" not "time-to-first-headers" (SDK uses AbortSignal.timeout)
5d45ffa to
31b9453
Compare
Closing — superseded by #4605The core fix (disabling undici's 300s bodyTimeout for the no-proxy Node.js path) has been merged via #4605. This PR added user-configurable If a need for user-configurable bodyTimeout arises in the future, this branch can be revisited. |
Summary
generationConfig.bodyTimeoutfield (default0= disabled) to control the idle timeout between SSE streaming chunksAgentwith configurablebodyTimeoutfor the Node.js no-proxy fetch path, fixing the 300s default that kills slow local model connectionsbodyTimeoutvalues (negative, float, NaN) at runtime with a warning logbodyTimeoutin the proxy-failure fallback path instead of silently dropping itkeepAliveTimeout: 60_000to no-proxy Agent for TCP connection reuse between interactive turnsRoot Cause
Node.js built-in fetch uses undici's default
bodyTimeout: 300000(300 seconds) between streaming chunks. Local model servers (Ollama, mlx_lm.server) generating tool call JSON on slow hardware can pause longer than 300s, triggeringUND_ERR_BODY_TIMEOUT. The proxy path already setbodyTimeout: 0and the Bun path usedtimeout: false, but the Node.js no-proxy path—used by most local model users—had no override.Scope of default behavior change
This PR changes the default behavior for ALL Node.js no-proxy SDK users, not just local model users. Previously, the no-proxy path used Node's built-in fetch with undici's 5-minute idle
bodyTimeout. After this PR, it uses the bundled undici withbodyTimeout: 0(disabled). This means the SDK-leveltimeout(default 120s) becomes the only bound on a hung connection. This is safe because:Both OpenAI and Anthropic SDK paths are affected — all three call sites (
DefaultOpenAICompatibleProvider,DashScopeOpenAICompatibleProvider,AnthropicContentGenerator) now passbodyTimeouttobuildRuntimeFetchOptions.Supersedes #4605
This PR is a superset of #4605 (which hardcodes
bodyTimeout: 0). This PR additionally provides user-configurablebodyTimeout, input validation, proxy-failure fallback preservation, andkeepAliveTimeoutfor connection reuse. Whichever lands first supersedes the other.User Configuration
{ "model": { "generationConfig": { "bodyTimeout": 0 } } }Or per-model in
modelProviders:{ "modelProviders": { "openai": [{ "id": "qwen3.6", "baseUrl": "http://localhost:11434/v1", "generationConfig": { "bodyTimeout": 0 } }] } }Test Plan
runtimeFetchOptions.test.ts— 55 tests pass (9 new: default/custom bodyTimeout, cache isolation, sanitize edge cases, proxy-failure fallback preservation)openaiContentGenerator/— 444 tests pass, no regressionsUND_ERR_BODY_TIMEOUTwith mock slow SSE server; verified fix withbodyTimeout: 0Fixes #4604
Fixes #4657
🤖 Generated with Qwen Code