feat: chat model factory defaults + omit strict_mode=False from payload #808
Merged
cosminacho merged 6 commits into main · Apr 24, 2026
Conversation
…del factory

Restores the historical max_tokens=1000 default from UiPathRequestMixin and adds temperature=0.0 and max_retries=3 defaults so callers no longer inherit underlying-library defaults (which can be wildly inconsistent across vendors, e.g. Gemini running out of reasoning tokens on small prompts). Also fixes a dead-code fallback in the legacy path that mapped an unset max_tokens to 0 instead of a usable default.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per review feedback: max_tokens=1000 was too prescriptive, so leave it unset by default and let the underlying client pick the right limit per model. The temperature=0.0 and max_retries=3 defaults are retained.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ionut-mihalache-uipath approved these changes · Apr 24, 2026
…ough

The default goes back to 1000 (the historical UiPathRequestMixin value); the type annotation already allows None, so callers can opt out of the limit explicitly. Also drop the legacy-path None -> 0 remap so an explicit None is forwarded to the legacy leaf clients (which accept None).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The temperature default is now 0.0 so _UNSET is unreachable, and the legacy leaf creators already skip the temperature kwarg when it's None (see _legacy/chat_model_factory.py:54). Forwarding directly preserves caller intent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three handlers (OpenAI, Anthropic, Fireworks) used 'is not None' to gate the strict kwarg, which means strict_mode=False (the AgentConfig default) was being forwarded as 'strict=False' to the API. Switch to 'is True' so only an explicit opt-in is sent, matching BedrockConverse and Gemini, which already behave this way. Updated the two tests that codified the old behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Followup to forwarding None through to the legacy dispatch: update get_chat_model and the _create_*_llm helpers to accept temperature: float | None and max_tokens: int | None. The implementations already handled None correctly via 'if temperature is not None' guards in the leaf creators and direct passthrough for max_tokens (whose underlying chat models all accept None).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ionut-mihalache-uipath approved these changes · Apr 24, 2026
## Summary

Two related fixes to keep the request payload sent to the LLM gateway predictable and free of accidental defaults.

### 1. Explicit defaults in `get_chat_model`

Stop falling through to whatever the vendor library happens to default to (which varies wildly: Gemini ran out of reasoning tokens on a 3-word prompt because `max_tokens` wasn't forwarded). Restore the historical knobs from `UiPathRequestMixin`:

| Parameter | Before | After | Opt-out |
| --- | --- | --- | --- |
| `max_tokens` | `_UNSET` (not forwarded) | `1000` (historical default from `634d9fe` feat: add chat models) | `None` to forward an explicit unset |
| `temperature` | `_UNSET` (not forwarded) | `0.0` | `None` to omit |
| `max_retries` | `_UNSET` (not forwarded) | `3` | |

Also cleaned up a dead-code fallback in the legacy dispatch path that mapped `None`/`_UNSET` back to `0` for `max_tokens` and `0.0` for `temperature`; these silently overrode an explicit "unset" caller intent. Now `None` is forwarded straight through to the legacy leaf clients (which already handle it correctly via `if temperature is not None: ...` guards).

### 2. Don't send `strict_mode=False` to the API

Three handlers (OpenAI, Anthropic, Fireworks) used `if strict_mode is not None` to gate the `strict` kwarg. Since `AgentConfig.strict_mode` defaults to `False`, every request was being sent with `strict=False`, leaking an opt-in flag as if it were always set. Switched to `if strict_mode is True` so only an explicit opt-in reaches the wire. Matches the existing behavior of BedrockConverse and Gemini.

### Bookkeeping

- `pyproject.toml`: `0.10.4` → `0.10.5`
- `uv.lock` refreshed

### Test plan

- `uv run pytest tests/chat/test_chat_model_factory.py`: 25 pass, 4 skipped (vertex/bedrock extras not installed locally)
- `uv run pytest tests/chat/handlers/test_tool_binding_kwargs.py tests/chat/test_bedrock_payload_handler.py`: 70 pass
- `just lint`: clean

Generated with Claude Code