Initial commit: Rusty AI SDK - unified multi-provider AI framework#1
Merged
Complete Cargo workspace with unified AI SDK architecture:

Core crates:
- rusty_ai: traits (LanguageModel, EmbeddingModel, Provider, Tool, Middleware), typed errors, streaming (futures::Stream), structured output, routing
- rusty_middleware: retry with backoff, logging, caching, middleware chain
- rusty_ui_stream: SSE + NDJSON encoders, versioned UI protocol
- rusty_testing: mock models/providers, stream assertions

Cloud providers:
- rusty_openai_compatible: generic OpenAI-compatible API adapter
- rusty_chatgpt: OpenAI ChatGPT (GPT-4o, o3-mini)
- rusty_claude: Anthropic Messages API (Sonnet, Opus, Haiku)
- rusty_gemini: Google Gemini API with multimodal support
- rusty_ollama: local Ollama server with NDJSON streaming

Local/platform runtimes (bridge-based, first-class):
- rusty_gemini_nano: Android Prompt API with session support
- rusty_foundationmodels: Apple Foundation Models
- rusty_phi_silica: Windows NPU Phi Silica
- rusty_browser: Chrome/Edge built-in AI for WASM targets

All crates compile cleanly against workspace dependencies.

https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq
Expanded mock bridge implementations and routing demonstrations. https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq
…d output fixes

Core (rusty_ai):
- Add ThinkingConfig enum (Adaptive, Budget, Enabled) and ReasoningEffort to GenerateOptions
- Add ThinkingDelta and SyntheticStreamingNotice stream events
- Add ExtendedThinking, VideoInput, AudioInput, AudioOutput Capability variants
- SyntheticStreamer now emits SyntheticStreamingNotice before text chunks

rusty_claude:
- Fix ImageSource to support both base64 and URL sources (no more [image: url] fallback)
- Add structured output via output_config.format (json_schema, GA 2026 API)
- Add extended thinking via thinking field (adaptive mode)
- Add ThinkingDelta/SignatureDelta stream parser handling
- Update model IDs: claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5-20251001

rusty_gemini:
- Update models to gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite
- Add ThinkingConfig (thinking_budget/thinking_level) to GenerationConfig
- Add id field to FunctionCall/FunctionResponse (Gemini 3+ requirement)
- Add Thought part variant for thinking token streaming
- Add responseSchema/responseMimeType for structured output

rusty_ollama:
- Add think: Option<bool> to chat request (reasoning models: deepseek-r1, qwen3)
- Add thinking field to response (streaming + non-streaming)
- Pass full JSON Schema as format field for structured output (Ollama 2025+)
- Emit ThinkingDelta events from NDJSON stream parser

https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq
…uctured output

rusty_claude:
- Update models: claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5-20251001
- Add ExtendedThinking + StructuredOutput capabilities
- Fix stream_parser: handle ThinkingDelta and SignatureDelta events
- Fix convert: pass thinking config and output_config to API

rusty_gemini:
- Update models: gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite
- Add ExtendedThinking, VideoInput, AudioInput capabilities
- Add thinking_config (budget/level) to generation config
- Add id field to FunctionCall/FunctionResponse (Gemini 3+ compat)
- Add Thought GeminiPart variant; emit ThinkingDelta from stream parser

rusty_ollama:
- Pass full JSON Schema as format field for structured output
- Add think flag propagation; emit ThinkingDelta from NDJSON stream

rusty_chatgpt:
- Add gpt-5.4, gpt-5.4-mini, gpt-5.4-nano models
- Add gpt54() and gpt54_mini() convenience methods

rusty_phi_silica:
- Add stream_tokens() to bridge trait (maps to GenerateResponseWithUpdatesAsync)
- Replace SyntheticStreamer with real chunk-based streaming in model

rusty_browser:
- Add BackingModel enum (GeminiNano vs PhiSilica)
- Add response_constraint to BrowserAiOptions (Chrome Prompt API)
- Add supports_response_constraint capability flag
- Update docs: window.ai deprecated, use LanguageModel global directly

rusty_ui_stream:
- Handle ThinkingDelta and SyntheticStreamingNotice events in encoder

https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq
rusty_browser: update BrowserAiBridge doc comments noting window.ai deprecation in Chrome 138+, direct LanguageModel global usage, and Edge/Phi Silica backing distinction

rusty_phi_silica: fix stream() to drive bridge.stream_tokens() directly instead of calling generate() and wrapping in SyntheticStreamer

https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq
- All providers now use pub const for well-known model IDs
- Added Gemini 3 series: gemini-3.1-pro-preview, gemini-3-flash, gemini-3.1-flash-live-preview, gemini-embedding-2-preview
- Provider trait gains fetch_models() for dynamic API discovery
- GeminiProvider::list_remote_models() queries /v1beta/models
- ChatGptProvider::list_remote_models() queries /v1/models
- OllamaProvider already had list_models() via /api/tags

https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq
Core:
- Add SpeechToTextModel + TextToSpeechModel traits with TranscriptionResult, AudioResult, TtsOptions types
- Add ModelRegistry for caching dynamically fetched models
- Provider trait gains speech_to_text_model(), text_to_speech_model(), fetch_models() methods
- RouteCondition type alias fixes clippy::type_complexity
- StreamEvent::SyntheticStreamingNotice for local runtime awareness

Providers:
- ChatGPT: add WHISPER, TTS, TTS_HD, GPT_4O_REALTIME, GPT_4O_AUDIO, GPT_4O_MINI_REALTIME consts + AudioInput/AudioOutput capabilities for voice models
- All providers: rename consts to _LATEST suffix pattern with docs pointing users to fetch_models() for dynamic discovery

CI:
- Add .github/workflows/ci.yml with check, test, clippy (-Dwarnings), fmt, doc, and MSRV (1.75) jobs

Quality:
- Fix all clippy warnings across entire workspace
- Run cargo fmt --all
- GeminiRequestParts struct replaces complex return tuple

https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq
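For readers who want a picture of what "ModelRegistry for caching dynamically fetched models" could look like, here is a minimal sketch. It is not the SDK's code: the `Provider` trait below is a simplified synchronous stand-in, `models_for` and `CountingProvider` are hypothetical names, and the real `fetch_models()` is presumably async and returns richer model metadata than plain strings.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Simplified, synchronous stand-in for the SDK's `Provider` trait (assumption).
pub trait Provider {
    fn id(&self) -> &str;
    fn fetch_models(&self) -> Result<Vec<String>, String>;
}

/// Caches each provider's model list after the first successful fetch.
#[derive(Default)]
pub struct ModelRegistry {
    cache: HashMap<String, Vec<String>>,
}

impl ModelRegistry {
    /// Hypothetical accessor: returns the cached list, fetching it on first use.
    pub fn models_for(&mut self, provider: &dyn Provider) -> Result<&[String], String> {
        if !self.cache.contains_key(provider.id()) {
            // A failed fetch is not cached, so the next call tries again.
            let models = provider.fetch_models()?;
            self.cache.insert(provider.id().to_string(), models);
        }
        Ok(self.cache[provider.id()].as_slice())
    }
}

/// Test double that counts how often the "remote" listing is queried.
pub struct CountingProvider {
    pub calls: Cell<u32>,
}

impl Provider for CountingProvider {
    fn id(&self) -> &str {
        "mock"
    }

    fn fetch_models(&self) -> Result<Vec<String>, String> {
        self.calls.set(self.calls.get() + 1);
        Ok(vec!["mock-small".to_string(), "mock-large".to_string()])
    }
}
```

Calling `models_for` twice with the same `CountingProvider` should leave its `calls` counter at 1, which is the behaviour a registry of this shape is meant to guarantee.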
- MSRV 1.80 insufficient for transitive deps; bump to 1.85
- Fix unresolved rustdoc links in middleware.rs, provider.rs, and rusty_ui_stream lib.rs

https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq
Router::local_first was always returning true from its route condition, making the cloud fallback unreachable. The closure now checks whether the local model's CapabilitySet satisfies the request's needs (tool calling, structured output) and falls through to cloud when it doesn't.

CacheMiddleware was keying on the prompt alone, so requests with the same prompt but different temperature, tools, output schema, or other generation options incorrectly returned the same cached result. The key now hashes all generation-affecting fields (numeric options via bit patterns, serializable types via JSON, enum variants via Debug).

Tests added for both: four router routing scenarios and four cache hit/miss/TTL scenarios using MockLanguageModel.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
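The two fixes in this commit can be illustrated with a small std-only sketch. Everything here is an assumption made for illustration: `Features`, `Route`, `GenOptions`, and the free functions `local_first` and `cache_key` are hypothetical stand-ins, and the actual middleware reportedly also hashes JSON-serialized and Debug-formatted fields, which this sketch omits.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical feature flags; the SDK's `CapabilitySet` is richer.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct Features {
    pub tool_calling: bool,
    pub structured_output: bool,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Route {
    Local,
    Cloud,
}

/// Route locally only when the local model covers everything the request needs.
pub fn local_first(local: Features, needs: Features) -> Route {
    let satisfied = (!needs.tool_calling || local.tool_calling)
        && (!needs.structured_output || local.structured_output);
    if satisfied {
        Route::Local
    } else {
        Route::Cloud
    }
}

/// Hypothetical subset of the generation options that affect the output.
#[derive(Debug, Clone, Default)]
pub struct GenOptions {
    pub temperature: Option<f32>,
    pub max_tokens: Option<u32>,
    pub tool_names: Vec<String>,
}

/// Hash the prompt together with every option that can change the result.
pub fn cache_key(prompt: &str, opts: &GenOptions) -> u64 {
    let mut hasher = DefaultHasher::new();
    prompt.hash(&mut hasher);
    // f32 does not implement Hash, so its raw bit pattern is hashed instead.
    opts.temperature.map(f32::to_bits).hash(&mut hasher);
    opts.max_tokens.hash(&mut hasher);
    opts.tool_names.hash(&mut hasher);
    hasher.finish()
}
```

With this shape, a request that needs tool calling against a local model without it resolves to `Route::Cloud`, and two requests that share a prompt but differ in temperature produce different keys.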
Stable rustfmt versions differ between local (1.93) and CI runners, causing spurious fmt failures. Nightly rustfmt is the conventional choice for CI formatting checks. Also clear RUSTFLAGS for the fmt job. https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq
Root cause: no rustfmt.toml meant different rustfmt versions (local 1.93 vs CI stable) produced different output. Adding rustfmt.toml with edition="2021" ensures deterministic formatting regardless of toolchain version. https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq
Four distinct silent-failure patterns fixed across all streaming providers:
1. SSE/NDJSON parse failures now terminate the stream with StreamError
instead of logging a warning and continuing as if no data was lost.
Affected: Claude (parse_sse_event), Gemini, Ollama (build_ndjson_stream),
OpenAI-compatible.
2. Malformed tool-call JSON (accumulated from streaming deltas) now
emits a StreamError/Error event rather than silently substituting
an empty arguments object {}. Affected: Claude (ContentBlockStop),
OpenAI-compatible (flush_pending_tools and inline flush).
3. Transport error source chain was being dropped (source: None) in
the Claude and Gemini byte-stream error paths. The original
reqwest::Error is now preserved via source: Some(Box::new(e)).
4. The OpenAI-compatible stream parser had an unreachable .unwrap() on
a HashMap re-query to extract a call_id that was already bound as
`_id` in the pattern match. Fixed to use the bound variable directly,
removing the underscore suppressor.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
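Patterns 1 and 3 above can be sketched without any external crates. This is a simplified illustration, not the providers' code: it uses a synchronous `Iterator` in place of `futures::Stream`, a caller-supplied `parse` closure in place of real SSE/NDJSON decoding, and the names `FailFast` and `fail_fast` are hypothetical. The `StreamError` struct here only mirrors the idea of a message plus a preserved `source`.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type that keeps the underlying cause attached.
#[derive(Debug)]
pub struct StreamError {
    pub message: String,
    pub source: Option<Box<dyn Error + Send + Sync>>,
}

impl fmt::Display for StreamError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.message)
    }
}

impl Error for StreamError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.source.as_ref().map(|e| &**e as &(dyn Error + 'static))
    }
}

/// Adapter that yields parsed events and stops after the first parse failure.
pub struct FailFast<I, F> {
    lines: I,
    parse: F,
    done: bool,
}

pub fn fail_fast<I, F, T, E>(lines: I, parse: F) -> FailFast<I, F>
where
    I: Iterator<Item = String>,
    F: FnMut(&str) -> Result<T, E>,
{
    FailFast { lines, parse, done: false }
}

impl<I, F, T, E> Iterator for FailFast<I, F>
where
    I: Iterator<Item = String>,
    F: FnMut(&str) -> Result<T, E>,
    E: Error + Send + Sync + 'static,
{
    type Item = Result<T, StreamError>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.done {
            return None;
        }
        let line = self.lines.next()?;
        match (self.parse)(&line) {
            Ok(event) => Some(Ok(event)),
            Err(e) => {
                // Emit one error carrying the original cause, then end the stream
                // instead of skipping the line and continuing.
                self.done = true;
                Some(Err(StreamError {
                    message: format!("failed to parse stream line: {line}"),
                    source: Some(Box::new(e)),
                }))
            }
        }
    }
}
```

The design point is that a consumer sees either a complete sequence of events or an explicit terminal error with its cause chain, never a silently shortened stream.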
Two issues fixed for every provider's non-2xx error path:
1. .unwrap_or_default() on response.text() silently produced a blank error message when the body couldn't be read (e.g. connection reset mid-response). All providers now log a warning and include the read-failure reason in the returned ProviderError message.
2. GeminiProvider::list_remote_models and ChatGptProvider::list_remote_models were discarding the HTTP status code (status: None) after checking it, preventing the retry middleware from distinguishing 429 from 500. Both now capture and forward status: Some(status_code).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
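A rough sketch of the error path described here, under stated assumptions: `ProviderError` is reduced to two fields, `provider_error` and `retry_hint` are hypothetical helpers, and the `RetryHint` classification is only one plausible way a retry layer might use the forwarded status. The actual retry policy is not described in this PR.

```rust
use std::io;

/// Reduced stand-in for the SDK's provider error (assumption).
#[derive(Debug, PartialEq)]
pub struct ProviderError {
    pub status: Option<u16>,
    pub message: String,
}

/// Build an error for a non-2xx response without hiding body-read failures.
pub fn provider_error(status: u16, body: Result<String, io::Error>) -> ProviderError {
    let message = match body {
        Ok(text) => text,
        // Instead of defaulting to "", say why the body is missing.
        Err(e) => format!("<failed to read error body: {e}>"),
    };
    ProviderError {
        status: Some(status), // forwarded rather than dropped as None
        message,
    }
}

/// Hypothetical classification a retry layer could derive from the status.
#[derive(Debug, PartialEq)]
pub enum RetryHint {
    RateLimited,
    ServerError,
    Unknown,
}

pub fn retry_hint(err: &ProviderError) -> RetryHint {
    match err.status {
        Some(429) => RetryHint::RateLimited,
        Some(500..=599) => RetryHint::ServerError,
        _ => RetryHint::Unknown,
    }
}
```

With `status: None`, both a 429 and a 500 would fall into `Unknown`, which is the ambiguity the second fix removes.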
…sion

RetryMiddleware: replace .expect() on last_error with .unwrap_or_else returning a descriptive Transport error, so no panic occurs if the loop invariant is somehow violated.

LoggingMiddleware: success and error paths now both respect the configured tracing level (previously error path always used ERROR, ignoring with_level() settings). Error path now uses ?e (Debug format) to preserve the full error source chain; Display was silently dropping the underlying reqwest/IO cause.

CacheMiddleware: cache_key() now returns Option<u64>. If the prompt cannot be serialized (returning None), process() bypasses the cache entirely rather than hashing an empty DefaultHasher state, which previously caused every un-serializable prompt to collide on a single constant hash bucket.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
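The RetryMiddleware and CacheMiddleware changes follow two small patterns that can be shown in isolation. This is a sketch under assumptions: `retry` and `cached` are hypothetical free functions, errors are plain `String`s rather than the SDK's Transport error, there is no backoff, and the cache is a bare `HashMap` without TTL.

```rust
use std::collections::HashMap;

/// Retry `op` up to `max_attempts` times, never panicking on an empty loop.
pub fn retry<T, F>(max_attempts: u32, mut op: F) -> Result<T, String>
where
    F: FnMut(u32) -> Result<T, String>,
{
    let mut last_error = None;
    for attempt in 0..max_attempts {
        match op(attempt) {
            Ok(value) => return Ok(value),
            Err(e) => last_error = Some(e),
        }
    }
    // If no attempt ran (e.g. max_attempts == 0), return a descriptive error
    // instead of calling .expect() on a None.
    Err(last_error
        .unwrap_or_else(|| "retry loop ended without attempting the request".to_string()))
}

/// Use the cache only when a key could be computed; otherwise bypass it.
pub fn cached<F>(cache: &mut HashMap<u64, String>, key: Option<u64>, compute: F) -> String
where
    F: FnOnce() -> String,
{
    match key {
        Some(k) => cache.entry(k).or_insert_with(compute).clone(),
        // No key means no lookup and no store, so unrelated requests
        // can never collide on a shared fallback bucket.
        None => compute(),
    }
}
```

Passing `None` twice runs `compute` twice, whereas passing the same `Some(k)` twice runs it once.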
generate_text returned AiError::Serialization when the model responded with no text (tool-calls-only response). This is a provider response characteristic, not a serialization error. Now returns ProviderError with a clear message including the provider_id. Doc comment updated to describe the failure mode.

ThinkingConfig::Adaptive doc comment referenced 'claude-opus-4-6+' which is not a valid Anthropic model identifier. Replaced with a correct description: 'claude-3-7-sonnet and later'.

OpenAiCompatibleModel::new() now delegates to try_new() which returns AiResult<Self>, mapping the reqwest build failure to AiError::Transport with the source chain preserved. new() wraps it with a descriptive .expect() that names the actual failure condition (TLS unavailable). Callers that need error recovery can use try_new() directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
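The new()/try_new() split is a common Rust constructor pattern, sketched below with std only. All names are illustrative assumptions: `HttpClient`, `BuildError`, and the `tls_available` flag stand in for the reqwest client build, `Model` stands in for OpenAiCompatibleModel, and `AiError` is reduced to a single Transport variant.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical stand-in for an HTTP client whose construction can fail.
#[derive(Debug)]
pub struct HttpClient;

#[derive(Debug)]
pub struct BuildError;

impl fmt::Display for BuildError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "TLS backend unavailable")
    }
}

impl Error for BuildError {}

impl HttpClient {
    pub fn build(tls_available: bool) -> Result<Self, BuildError> {
        if tls_available { Ok(HttpClient) } else { Err(BuildError) }
    }
}

/// Reduced error type: a message plus the preserved underlying cause.
#[derive(Debug)]
pub enum AiError {
    Transport { message: String, source: Box<dyn Error + Send + Sync> },
}

impl fmt::Display for AiError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AiError::Transport { message, .. } => write!(f, "{message}"),
        }
    }
}

impl Error for AiError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            AiError::Transport { source, .. } => Some(&**source as &(dyn Error + 'static)),
        }
    }
}

#[derive(Debug)]
pub struct Model {
    #[allow(dead_code)]
    client: HttpClient,
}

impl Model {
    /// Fallible constructor for callers that want to recover from failure.
    pub fn try_new(tls_available: bool) -> Result<Self, AiError> {
        let client = HttpClient::build(tls_available).map_err(|e| AiError::Transport {
            message: "failed to build HTTP client".to_string(),
            source: Box::new(e),
        })?;
        Ok(Model { client })
    }

    /// Convenience constructor that panics with a message naming the cause.
    pub fn new(tls_available: bool) -> Self {
        Self::try_new(tls_available)
            .expect("failed to build HTTP client (is a TLS backend available?)")
    }
}
```

Keeping all construction logic in `try_new` means the panicking and non-panicking paths cannot drift apart.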
Summary
This is the initial commit of the Rusty AI SDK, a comprehensive Rust framework for building AI applications with support for multiple providers (OpenAI, Anthropic Claude, Google Gemini, Ollama, and local runtimes) and unified abstractions for language models, embeddings, streaming, and tool calling.
Key Changes
Core Framework (rusty_ai)
- LanguageModel, EmbeddingModel, Provider traits for pluggable implementations
- Prompt, Message, ContentPart, StreamEvent, GenerateResult for consistent API across providers
- AiStream with StreamEvent enum for real-time response handling
- ToolDefinition, ToolCallRequest, ToolChoice for structured function invocation
- ThinkingConfig enum supporting Anthropic adaptive thinking, Gemini budget-based, and Ollama reasoning modes
- Capability and CapabilitySet for runtime feature detection
Provider Implementations
- rusty_openai_compatible: Generic HTTP API adapter with SSE streaming and tool-call accumulation
- rusty_chatgpt: Pre-configured OpenAI client with well-known model constants
- rusty_claude: Anthropic Messages API with streaming, system message separation, and thinking support
- rusty_gemini: Google Gemini with SSE streaming and structured output
- rusty_ollama: Local Ollama server integration with chat and embedding support
Utilities
- rusty_middleware: Composable request/response interceptors (logging, caching, retry)
- rusty_ui_stream: Frontend-friendly event format with SSE and NDJSON encoders
- rusty_testing: Mock models and providers with call recording for unit tests
Examples
Notable Implementation Details
- async_trait and tokio
- futures::stream for efficient event processing and transformation
- AiError type with provider-specific context

https://claude.ai/code/session_01NMtKKc9beRbzKEznEEzQZq