[BUG CLIENT]: Connection leak: streaming requests that return an error status are never closed #532

@ronnix

Description


Python -VV

Python 3.13.13 (main, May  3 2026, 21:46:01) [Clang 21.0.0 (clang-2100.0.123.102)]

Pip Freeze

mistral-common==1.11.0
mistralai==2.3.2

Reproduction Steps

Fire ~20+ concurrent chat.stream_async calls on a single shared Mistral
client against an endpoint that returns errors (e.g. 429 rate limits, or any
4xx/5xx). After enough leaked connections accumulate, subsequent calls fail with
httpx.PoolTimeout. Creating a fresh client per call works around it (the leaked
connections are reclaimed when the whole client is garbage-collected).

Expected Behavior

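The httpx connection pool is not exhausted: a streaming response with an error
status is closed, so its connection returns to the pool.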

Additional Context

Summary

When a streaming request (chat.stream_async, stream_async, etc.) receives an
error HTTP status, the SDK raises an exception (or retries) without closing
the underlying streaming httpx.Response. The pooled connection stays checked
out forever. Under concurrency with a shared client, hitting enough 429s exhausts
the httpx connection pool and subsequent calls fail with httpx.PoolTimeout.
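
To make the failure mode concrete, here is a toy model of it using only asyncio. It is not httpx's pool implementation: TinyPool, ToyPoolTimeout, failing_stream_call and run are hypothetical names made up for this sketch, and the calls run one after another rather than concurrently to keep the outcome deterministic.

```python
import asyncio


class ToyPoolTimeout(Exception):
    """Toy analogue of httpx.PoolTimeout (hypothetical, for illustration)."""


class TinyPool:
    """Toy bounded pool: a semaphore with one slot per 'connection'."""

    def __init__(self, max_connections, acquire_timeout):
        self._slots = asyncio.Semaphore(max_connections)
        self._timeout = acquire_timeout

    async def acquire(self):
        # Wait for a free slot; give up after the timeout.
        try:
            await asyncio.wait_for(self._slots.acquire(), self._timeout)
        except asyncio.TimeoutError:
            raise ToyPoolTimeout("no free connection") from None

    def release(self):
        self._slots.release()


async def failing_stream_call(pool, close_on_error):
    """Check out a slot, then fail as if the server had answered 429."""
    await pool.acquire()
    if close_on_error:
        # Closing the response is what hands the slot back.
        pool.release()
    raise RuntimeError("HTTP 429")


async def run(close_on_error, calls=5, size=2):
    """Return the exception type name produced by each call."""
    pool = TinyPool(size, acquire_timeout=0.05)
    outcomes = []
    for _ in range(calls):
        try:
            await failing_stream_call(pool, close_on_error)
        except Exception as exc:
            outcomes.append(type(exc).__name__)
    return outcomes


if __name__ == "__main__":
    print(asyncio.run(run(close_on_error=False)))
    print(asyncio.run(run(close_on_error=True)))
```

In this model, once every slot has been leaked each further call times out while acquiring, which mirrors the PoolTimeout seen with a shared client.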

Affected versions

Confirmed identical from 2.3.2 through 2.4.5 — the Speakeasy-generated HTTP
layer is unchanged across these releases.

Root cause

For a streaming request, client.send(req, stream=True) returns a Response
that holds a connection from the pool until the body is fully consumed or
aclose() is called. Two code paths leak it on an error status:

  1. src/mistralai/client/basesdk.py, do_request_async: when the response
    status matches error_status_codes, it invokes the error hook and raises
    errors.SDKError("Unexpected error occurred", http_res). With stream=True,
    http_res is a live streaming response that is never aclose()d.

  2. src/mistralai/client/utils/retries.py, retry_async /
    retry_with_backoff_async: for a retryable status (429/500/502/503/504),
    the inner do_request raises TemporaryError(res); retry_with_backoff_async
    then sleeps and retries, dropping res (the open streaming response) without
    closing it. Each retried attempt leaks one connection.

The synchronous variants (retry / retry_with_backoff, do_request) have the
same problem.

Non-streaming requests are unaffected: with stream=False, httpx reads the
full body and releases the connection back to the pool.

Suggested Solutions

Close the streaming response before raising or retrying on an error status:

  • In do_request_async / do_request: when stream=True, await http_res.aclose()
    (resp. http_res.close()) before raising SDKError.
  • In retry_async / retry: close res before raising TemporaryError, or in
    the retry loop (retry_with_backoff_async / retry_with_backoff) before the
    next attempt.
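
A minimal sketch of the async variant of this idea, written against a stand-in object rather than the SDK's actual code. FakeStreamingResponse, SketchSDKError and send_with_retries are hypothetical names for illustration. The only assumption about the real response is that it exposes status_code and an awaitable aclose(), as httpx.Response does.

```python
import asyncio

# Retryable statuses as listed in the root-cause section.
RETRYABLE = {429, 500, 502, 503, 504}


class FakeStreamingResponse:
    """Hypothetical stand-in for a streaming httpx.Response."""

    def __init__(self, status_code):
        self.status_code = status_code
        self.is_closed = False

    async def aclose(self):
        self.is_closed = True


class SketchSDKError(Exception):
    """Hypothetical error type carrying the (already closed) response."""

    def __init__(self, message, response):
        super().__init__(message)
        self.response = response


async def send_with_retries(send, max_attempts=3, delay=0.0):
    """Await send() up to max_attempts times, closing every discarded response."""
    for attempt in range(1, max_attempts + 1):
        res = await send()
        if res.status_code < 400:
            # Success: the caller consumes the stream and closes it.
            return res
        # Error status: release the connection before sleeping or raising.
        await res.aclose()
        if res.status_code in RETRYABLE and attempt < max_attempts:
            await asyncio.sleep(delay)
            continue
        raise SketchSDKError("Unexpected error occurred", res)


if __name__ == "__main__":
    queue = [FakeStreamingResponse(429), FakeStreamingResponse(200)]

    async def send():
        return queue.pop(0)

    result = asyncio.run(send_with_retries(send))
    print(result.status_code, result.is_closed)
```

One trade-off: after aclose() the error body can no longer be streamed. If the exception needs the body, reading it in full before raising should release the connection as well, since the connection is held only until the body is fully consumed or aclose() is called.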
