feat(google): support mixing function tools with Gemini's server-side provider tools (Google Search, etc.) #6016
… provider tools (Google Search, etc.) on the Gemini 3 Developer API
| if part.text and not part.thought: | ||
| retryable = False | ||
| self._event_ch.send_nowait(chat_chunk) | ||
| self._event_ch.send_nowait( | ||
| llm.ChatChunk( | ||
| id=request_id, | ||
| delta=llm.ChoiceDelta(role="assistant", content=part.text), | ||
| ) | ||
| ) |
🟡 Function call parts with co-occurring text now leak text as content
The old _parse_part method checked if part.function_call: first and returned immediately, ignoring any co-occurring part.text. The new streaming loop checks if part.text and not part.thought: independently of whether part.function_call is also set. If a Gemini response part has both function_call and text (the now-deleted test test_function_call_with_text_returns_none_content explicitly tested this with text="get_weather"), the text is emitted as assistant content to the user before the function call is emitted post-loop. This surfaces unexpected/duplicate text (e.g., the function name itself) as visible assistant output.
Old vs new behavior for a part with both function_call and text
Old: _parse_part entered the if part.function_call: branch, returned a function-call-only chunk, text was silently dropped.
New: The streaming loop emits a text content chunk (lines 527-534), then after the loop the function call is also emitted (lines 555-570). The consumer sees both.
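The difference between the two code paths can be sketched in isolation. This is a minimal illustration, not the plugin's code: `FakePart`, `parse_old`, and `parse_new` are hypothetical stand-ins, and the function call is reduced to its name for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakePart:
    """Hypothetical stand-in for a Gemini response part (not the real SDK type)."""

    text: Optional[str] = None
    function_call: Optional[str] = None  # only the function name, for brevity
    thought: bool = False


def parse_old(part: FakePart) -> list:
    """Old behavior as described: function_call wins, co-occurring text is dropped."""
    if part.function_call:
        return [("function_call", part.function_call)]
    if part.text and not part.thought:
        return [("content", part.text)]
    return []


def parse_new(part: FakePart) -> list:
    """New behavior as described: text is checked independently of function_call."""
    events = []
    if part.text and not part.thought:
        events.append(("content", part.text))
    if part.function_call:
        # In the real loop this is emitted after streaming finishes.
        events.append(("function_call", part.function_call))
    return events


both = FakePart(text="get_weather", function_call="get_weather")
print(parse_old(both))  # [('function_call', 'get_weather')]
print(parse_new(both))  # [('content', 'get_weather'), ('function_call', 'get_weather')]
```

The two paths only diverge when a single part carries both fields, so whether this matters depends on whether the API can ever produce such a part.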
part.text and part.function_call are mutually exclusive:
Source: google/ai/generativelanguage/v1beta/content.proto (googleapis)
message Part {
oneof data {
string text = 2;
Blob inline_data = 3;
FunctionCall function_call = 4;
FunctionResponse function_response = 5;
FileData file_data = 6;
ExecutableCode executable_code = 9;
CodeExecutionResult code_execution_result = 10;
}
bool thought = 11;
bytes thought_signature = 13;
}
| if function_calling_config or retrieval_config or include_server_side_tool_invocations: | ||
| extra["tool_config"] = types.ToolConfig( | ||
| function_calling_config=function_calling_config, | ||
| retrieval_config=retrieval_config, | ||
| include_server_side_tool_invocations=include_server_side_tool_invocations | ||
| or None, | ||
| ) | ||
| extra["tool_config"] = gemini_tool_choice | ||
| elif retrieval_config: | ||
| extra["tool_config"] = types.ToolConfig( | ||
| retrieval_config=retrieval_config, | ||
| ) | ||
|
|
||
| if tools_config := create_tools_config( | ||
| tool_ctx, | ||
| _only_single_type=drop_provider_tools or tool_choice == "none", | ||
| ): | ||
| extra["tools"] = tools_config |
🚩 tool_config sent without tools in edge case (tool_choice='none' + only provider tools)
When tool_choice="none" and only provider tools exist (no function tools), function_calling_config is set to FunctionCallingConfig(mode=NONE) which is truthy, so a tool_config is included in the request. But create_tools_config returns an empty list (due to _only_single_type=True and no function tools), so no tools are set. This sends tool_config with NONE mode but no corresponding tools. The Gemini API may tolerate this (it's essentially a no-op config), and the test test_tool_choice_none_sends_no_provider_tools validates config.tools is None but doesn't check tool_config. Worth confirming the API doesn't reject this combination.
no-op
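from google import genai
from google.genai.types import (
    FunctionCallingConfig,
    FunctionCallingConfigMode,
    GenerateContentConfig,
    ToolConfig,
)

client = genai.Client()  # reads the API key from the environment
client.models.generate_content(
    model="gemini-3.5-flash",
    contents="Say hello in one word.",
    config=GenerateContentConfig(
        tool_config=ToolConfig(
            function_calling_config=FunctionCallingConfig(mode=FunctionCallingConfigMode.NONE)
        ),
    ),
)
# -> "Hello" (HTTP 200)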
Summary
include_server_side_tool_invocations
Fixes:
Related:
Evidence
google-mixed-tools-demo.mp4