feat(ai-proxy): add effective_model and effective_request_for_cache helpers #13371
Closed
janiussyafiq wants to merge 2 commits into
Move the three-step instance-override application (options flat overwrite, override.llm_options capability hook, override.request_body deep merge) out of the inline block in ai-providers/base.lua build_request and into a new pure helper in apisix/plugins/ai-proxy/base.lua. build_request calls the helper at the same point the inline code lived (post-converter), so the body sent upstream is unchanged. extra_opts no longer carries the four override-derived fields; it passes the picked ai_instance through and the helper reads from it directly. Zero behavior change.

Motivation: ai-cache (planned follow-up plugin) needs to compute its cache key from the post-override effective body without going through build_request, which performs the upstream HTTP call, signing, and keepalive.
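The three steps named in this commit message can be sketched in plain Lua as follows. This is an illustrative sketch only, not the code from the PR: the function signature, the deep_merge helper, and the apply_llm_options hook name are all assumptions made for the example. The only details taken from the description are the order of the steps and the fact that override.request_body is keyed by protocol (for example openai-chat).

```lua
-- Hypothetical sketch; names and signatures are assumptions, not the APISIX code.

-- Recursively merge src into dst: nested tables merge, other values overwrite.
local function deep_merge(dst, src)
    for k, v in pairs(src) do
        if type(v) == "table" and type(dst[k]) == "table" then
            deep_merge(dst[k], v)
        else
            dst[k] = v
        end
    end
    return dst
end

-- body:     parsed request body (table); mutated in place and returned
-- instance: the picked ai_instance (field layout assumed from the description)
-- provider: stand-in for the provider module; apply_llm_options is a made-up hook name
-- protocol: target protocol name, e.g. "openai-chat"
local function apply_instance_overrides(body, instance, provider, protocol)
    -- Step 1: options flat overwrite.
    for k, v in pairs(instance.options or {}) do
        body[k] = v
    end

    local override = instance.override or {}

    -- Step 2: capability hook for override.llm_options (assumed shape).
    if override.llm_options and provider and provider.apply_llm_options then
        provider.apply_llm_options(body, override.llm_options, protocol)
    end

    -- Step 3: deep merge of override.request_body for the target protocol.
    local patch = override.request_body and override.request_body[protocol]
    if patch then
        deep_merge(body, patch)
    end
    return body
end

-- Usage: the instance forces a model and lowers the temperature.
local body = apply_instance_overrides(
    { model = "client-model", messages = {}, temperature = 1.0 },
    {
        options = { model = "forced-model" },
        override = { request_body = { ["openai-chat"] = { temperature = 0.2 } } },
    },
    nil,            -- no provider hook in this example
    "openai-chat"
)
-- body.model is now "forced-model" and body.temperature is 0.2
```

Because the sketch does no I/O, a caller such as a cache plugin could run it without touching the upstream, which is the property the commit message is after.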
…elpers

Two pure helpers on top of apply_instance_overrides (introduced in the preceding refactor), both in apisix/plugins/ai-proxy/base.lua:

- effective_model(ctx) returns ai_instance.options.model when the operator forces a model on the instance, falling back to ctx.var.request_llm_model (the client-supplied model that detect_request_type mirrors).
- effective_request_for_cache(ctx) returns the request body as it would be sent upstream: reads the parsed body, resolves the target protocol from ctx.ai_client_protocol against the provider's capabilities (so peer plugins running in access phase before before_proxy can still get the post-override view), and applies apply_instance_overrides.

A small internal resolve_target_protocol helper mirrors the routing logic in before_proxy so callers don't have to wait for ctx.ai_target_protocol to be populated.

These helpers exist for ai-cache (planned follow-up) to compute a cache key over the effective body without invoking build_request (which would make the upstream HTTP call). The signatures are pure and ctx-driven.

Test: t/plugin/ai-proxy-request-body-override.t TEST 17 drives a real request through ai-proxy with options + override.request_body, then uses serverless-post-function (priority -2000, runs after ai-proxy access at 1040) to invoke both helpers and log their output. Asserts both the upstream-received body AND the helper outputs reflect the same post-override view.
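A rough sketch of how the two helpers and the protocol resolver could fit together, using plain tables in place of the real APISIX ctx. Everything below is an assumption for illustration: the ctx fields ai_instance, ai_provider and parsed_body, the provider's capabilities and converter_target fields, and the simplified stand-in for apply_instance_overrides are hypothetical. The "passthrough" sentinel mentioned for the real resolver is left out because its exact handling is not described here.

```lua
-- Hypothetical sketch; field names and helper bodies are assumptions.

-- Simplified stand-in for apply_instance_overrides (shallow merge only;
-- the PR describes a deep merge for override.request_body).
local function apply_instance_overrides(body, instance, protocol)
    for k, v in pairs(instance.options or {}) do
        body[k] = v
    end
    local rb = instance.override and instance.override.request_body
    for k, v in pairs(rb and rb[protocol] or {}) do
        body[k] = v
    end
    return body
end

-- Operator-forced model wins; otherwise use the client-supplied model.
local function effective_model(ctx)
    local instance = ctx.ai_instance
    local forced = instance and instance.options and instance.options.model
    return forced or ctx.var.request_llm_model
end

-- Prefer an already-resolved protocol, then a native capability match,
-- then the converter's target (assumed field name).
local function resolve_target_protocol(ctx, provider)
    if ctx.ai_target_protocol then
        return ctx.ai_target_protocol
    end
    local client = ctx.ai_client_protocol
    if provider.capabilities and provider.capabilities[client] then
        return client
    end
    return provider.converter_target
end

local function shallow_copy(t)
    local out = {}
    for k, v in pairs(t) do
        out[k] = v
    end
    return out
end

-- Returns the post-override body without any HTTP call.
-- The copy keeps this example side-effect-free; that is a choice of the sketch.
local function effective_request_for_cache(ctx)
    local protocol = resolve_target_protocol(ctx, ctx.ai_provider)
    return apply_instance_overrides(shallow_copy(ctx.parsed_body),
                                    ctx.ai_instance, protocol)
end

-- Usage with a mock ctx, as a later access-phase plugin might see it.
local ctx = {
    var = { request_llm_model = "client-model" },
    ai_client_protocol = "openai-chat",
    ai_provider = { capabilities = { ["openai-chat"] = true } },
    ai_instance = {
        options = { model = "forced-model" },
        override = { request_body = { ["openai-chat"] = { temperature = 0.2 } } },
    },
    parsed_body = { model = "client-model", temperature = 1.0 },
}

local model = effective_model(ctx)
local body = effective_request_for_cache(ctx)
-- model is "forced-model"; body.model is "forced-model"; body.temperature is 0.2
```

A cache plugin could then derive its key from model and a stable serialization of body, so that operator-configured overrides are reflected in the key.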
Description
Two pure helpers on top of apply_instance_overrides (introduced in #13370), both in apisix/plugins/ai-proxy/base.lua:

- effective_model(ctx) -> string returns ai_instance.options.model when the operator forces a model on the picked instance, falling back to ctx.var.request_llm_model (the client-supplied model that detect_request_type mirrored to that var).
- effective_request_for_cache(ctx) -> table returns the request body as it would be sent upstream: reads the parsed body via core.request.get_json_request_body_table, resolves the target protocol from ctx.ai_client_protocol against the provider's capabilities, and applies apply_instance_overrides. Pure: no HTTP, no signing, no upstream call.

A small internal resolve_target_protocol(ctx, ai_provider) mirrors the protocol-routing logic in before_proxy so callers running in access phase (before before_proxy populates ctx.ai_target_protocol) can still compute the post-override view of the body. The helper prefers ctx.ai_target_protocol when it's already set, falling back to the capability lookup (passthrough), the "passthrough" sentinel, or the converter's target, the same order before_proxy uses.

Motivation: same as #13370. A planned ai-cache plugin needs to compute its cache key over the post-override effective body from its own access phase, before before_proxy makes the upstream call. Without these helpers it would have to either re-implement override application + protocol routing itself, or accept a cache key that's blind to operator-configured overrides.

Stacked on #13370
This PR is built on top of #13370 (the apply_instance_overrides refactor). The diff visible here will shrink to just the helpers + their test once that lands. Please review them together; this PR has no value without the helper it builds on.

Which issue(s) this PR fixes:
N/A — new internal API surface.
Behavior change
None for the existing ai-proxy / ai-proxy-multi request flow. The helpers are additive: the existing before_proxy → build_request path is unchanged, and the helpers are not called from any phase yet. They become useful when ai-cache (next PR series) starts calling them.

Tests
Added one block to t/plugin/ai-proxy-request-body-override.t (TEST 17). The block:

- Configures a route with ai-proxy + options.model + override.request_body.openai-chat.temperature, plus serverless-post-function (priority -2000, default access phase) to act as a "later peer plugin".
- The serverless function calls effective_model(ctx) and effective_request_for_cache(ctx) and writes their output to the error log.
- Asserts that both the upstream-received body (via --- response_body) AND the helper output (via --- error_log eval) reflect the same post-override view: same model, same temperature.

This proves the helpers produce exactly what build_request would send upstream, since both are observed in the same vertical test against the same route.

Verification:

- prove -I../test-nginx/lib -I./ t/plugin/ai-proxy-request-body-override.t: 53/53 pass (50 pre-existing + 3 new assertions in TEST 17).
- make lint: luacheck and lj-releng both clean.

Checklist