Python -VV
Python 3.13.13 (main, May 3 2026, 21:46:01) [Clang 21.0.0 (clang-2100.0.123.102)]
Pip Freeze
mistral-common==1.11.0
mistralai==2.3.2
Reproduction Steps
Fire ~20+ concurrent chat.stream_async calls on a single shared Mistral
client against an endpoint that returns errors (e.g. 429 rate limits, or any
4xx/5xx). After enough leaked connections accumulate, subsequent calls fail with
httpx.PoolTimeout. Creating a fresh client per call works around it (the leaked
connections are reclaimed when the whole client is garbage-collected).
Expected Behavior
Streaming calls that hit an error status release their connection, so the
shared client's httpx pool is not exhausted.
Additional Context
Summary
When a streaming request (chat.stream_async, stream_async, etc.) receives an
error HTTP status, the SDK raises an exception (or retries) without closing
the underlying streaming httpx.Response. The pooled connection stays checked
out until the client itself is garbage-collected. Under concurrency with a
shared client, hitting enough 429s exhausts the httpx connection pool and
subsequent calls fail with httpx.PoolTimeout.
Affected versions
Confirmed identical from 2.3.2 through 2.4.5 — the Speakeasy-generated HTTP
layer is unchanged across these releases.
Root cause
For a streaming request, client.send(req, stream=True) returns a Response
that holds a connection from the pool until the body is fully consumed or
aclose() is called. Two code paths leak it on an error status:
- src/mistralai/client/basesdk.py, do_request_async: when the response
  status matches error_status_codes, it invokes the error hook and raises
  errors.SDKError("Unexpected error occurred", http_res). With stream=True,
  http_res is a live streaming response that is never aclose()d.
- src/mistralai/client/utils/retries.py, retry_async /
  retry_with_backoff_async: for a retryable status (429/500/502/503/504),
  the inner do_request raises TemporaryError(res); retry_with_backoff_async
  then sleeps and retries, dropping res (the open streaming response) without
  closing it. Each retried attempt leaks one connection.
The synchronous variants (retry / retry_with_backoff, do_request) have the
same problem.
Non-streaming requests are unaffected: with stream=False, httpx reads the
full body and releases the connection back to the pool.
Suggested Solutions
Close the streaming response before raising or retrying on an error status:
- In do_request_async / do_request: when stream=True, await http_res.aclose()
  (resp. http_res.close()) before raising SDKError.
- In retry_async / retry: close res before raising TemporaryError, or in
  the retry loop (retry_with_backoff_async / retry_with_backoff) before the
  next attempt.
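A rough sketch of the async side of both suggestions follows. It is not the SDK's code: SDKError and TemporaryError are local stand-ins for the SDK classes, FakeStreamingResponse, check_status, send_with_retries and demo are hypothetical names, and the status check is simplified to "status >= 400" instead of the SDK's error_status_codes matching. The only response method assumed is aclose(), which httpx.Response does provide.

```python
import asyncio

RETRYABLE = {429, 500, 502, 503, 504}  # statuses the report lists as retryable


class TemporaryError(Exception):
    """Local stand-in for the retry layer's TemporaryError."""


class SDKError(Exception):
    """Local stand-in for errors.SDKError."""


class FakeStreamingResponse:
    """Hypothetical stub exposing only what this sketch needs."""

    def __init__(self, status_code: int) -> None:
        self.status_code = status_code
        self.is_closed = False

    async def aclose(self) -> None:
        self.is_closed = True


async def check_status(http_res, stream: bool):
    """Return a successful response; close a streaming one before raising."""
    if http_res.status_code < 400:
        return http_res
    if stream:
        await http_res.aclose()  # hand the connection back before raising
    if http_res.status_code in RETRYABLE:
        raise TemporaryError(http_res)
    raise SDKError("Unexpected error occurred", http_res)


async def send_with_retries(send, stream: bool, max_attempts: int = 3):
    """Retry on TemporaryError; every failed attempt has already been closed."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await check_status(await send(), stream)
        except TemporaryError:
            if attempt == max_attempts:
                raise
            await asyncio.sleep(0)  # real backoff omitted in this sketch


async def demo():
    """Two retryable errors followed by a success."""
    responses = [FakeStreamingResponse(s) for s in (429, 503, 200)]
    pending = iter(responses)

    async def send():
        return next(pending)

    result = await send_with_retries(send, stream=True)
    return result.status_code, [r.is_closed for r in responses]
```

Running asyncio.run(demo()) returns (200, [True, True, False]): both error responses are closed before the next attempt, and only the successful stream stays open for the caller to consume. If the error body is needed for the exception message, it would have to be read before closing.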