Skip to content

feat(ai-proxy): add effective_model and effective_request_for_cache helpers#13371

Closed
janiussyafiq wants to merge 2 commits into
apache:masterfrom
janiussyafiq:feat/ai-proxy-effective-helpers
Closed

feat(ai-proxy): add effective_model and effective_request_for_cache helpers#13371
janiussyafiq wants to merge 2 commits into
apache:masterfrom
janiussyafiq:feat/ai-proxy-effective-helpers

Conversation

@janiussyafiq
Copy link
Copy Markdown
Contributor

Description

Two pure helpers on top of apply_instance_overrides (introduced in #13370), both in apisix/plugins/ai-proxy/base.lua:

  • effective_model(ctx) -> string returns ai_instance.options.model when the operator forces a model on the picked instance, falling back to ctx.var.request_llm_model (the client-supplied model that detect_request_type mirrored to that var).
  • effective_request_for_cache(ctx) -> table returns the request body as it would be sent upstream: reads the parsed body via core.request.get_json_request_body_table, resolves the target protocol from ctx.ai_client_protocol against the provider's capabilities, and applies apply_instance_overrides. Pure — no HTTP, no signing, no upstream call.

A small internal resolve_target_protocol(ctx, ai_provider) mirrors the protocol-routing logic in before_proxy so callers running in access phase (before before_proxy populates ctx.ai_target_protocol) can still compute the post-override view of the body. The helper prefers ctx.ai_target_protocol when it's already set, falling back to the capability lookup (passthrough), the "passthrough" sentinel, or the converter's target — same order before_proxy uses.

Motivation: same as #13370. A planned ai-cache plugin needs to compute its cache key over the post-override effective body from its own access phase, before before_proxy makes the upstream call. Without these helpers it would have to either re-implement override application + protocol routing itself, or accept a cache key that's blind to operator-configured overrides.

Stacked on #13370

This PR is built on top of #13370 (the apply_instance_overrides refactor). The diff visible here will shrink to just the helpers + their test once that lands. Please review them together; this PR has no value without the helper it builds on.

Which issue(s) this PR fixes:

N/A — new internal API surface.

Behavior change

None for the existing ai-proxy / ai-proxy-multi request flow. The helpers are additive: the existing before_proxybuild_request path is unchanged, and the helpers are not called from any phase yet. They become useful when ai-cache (next PR series) starts calling them.

Tests

Added one block to t/plugin/ai-proxy-request-body-override.t (TEST 17). The block:

  1. Configures a route with ai-proxy + options.model + override.request_body.openai-chat.temperature, plus serverless-post-function (priority -2000, default access phase) to act as a "later peer plugin".
  2. The serverless function calls effective_model(ctx) and effective_request_for_cache(ctx) and writes their output to the error log.
  3. The test sends a real request through ai-proxy to the existing echo upstream (which returns the upstream-received body as the response content).
  4. Asserts BOTH the upstream-received body (via --- response_body) AND the helper output (via --- error_log eval) reflect the same post-override view — same model, same temperature.

This proves the helpers produce exactly what build_request would send upstream, since both are observed in the same vertical test against the same route.

Verification:

  • prove -I../test-nginx/lib -I./ t/plugin/ai-proxy-request-body-override.t — 53/53 pass (50 pre-existing + 3 new assertions in TEST 17).
  • Broader sanity: 17 ai-proxy test files (679 tests) — all pass.
  • make lint — luacheck and lj-releng both clean.

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change — N/A, internal helper surface; not user-facing.
  • I have verified that this change is backward compatible

Move the three-step instance-override application (options flat overwrite,
override.llm_options capability hook, override.request_body deep merge) out
of the inline block in ai-providers/base.lua build_request and into a new
pure helper in apisix/plugins/ai-proxy/base.lua. build_request calls the
helper at the same point the inline code lived (post-converter), so the
body sent upstream is unchanged.

extra_opts no longer carries the four override-derived fields; it passes
the picked ai_instance through and the helper reads from it directly.

Zero behavior change. Motivation: ai-cache (planned follow-up plugin)
needs to compute its cache key from the post-override effective body
without going through build_request, which performs the upstream HTTP
call, signing, and keepalive.
…elpers

Two pure helpers on top of apply_instance_overrides (introduced in the
preceding refactor), both in apisix/plugins/ai-proxy/base.lua:

- effective_model(ctx) returns ai_instance.options.model when the operator
  forces a model on the instance, falling back to ctx.var.request_llm_model
  (the client-supplied model that detect_request_type mirrors).

- effective_request_for_cache(ctx) returns the request body as it would be
  sent upstream: reads the parsed body, resolves the target protocol from
  ctx.ai_client_protocol against the provider's capabilities (so peer
  plugins running in access phase before before_proxy can still get the
  post-override view), and applies apply_instance_overrides.

A small internal resolve_target_protocol helper mirrors the routing logic
in before_proxy so callers don't have to wait for ctx.ai_target_protocol
to be populated.

These helpers exist for ai-cache (planned follow-up) to compute a cache
key over the effective body without invoking build_request (which would
make the upstream HTTP call). The signatures are pure and ctx-driven.

Test: t/plugin/ai-proxy-request-body-override.t TEST 17 drives a real
request through ai-proxy with options + override.request_body, then uses
serverless-post-function (priority -2000, runs after ai-proxy access at
1040) to invoke both helpers and log their output. Asserts both the
upstream-received body AND the helper outputs reflect the same
post-override view.
@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. enhancement New feature or request labels May 13, 2026
@janiussyafiq
Copy link
Copy Markdown
Contributor Author

Folding this work into #13370 — both changes belong together as 'PR#1 of the ai-cache series' (the override-extraction refactor plus the two ctx-aware helpers that build on it). The commit from this PR has been added on top of #13370. Closing as duplicate.

@janiussyafiq janiussyafiq deleted the feat/ai-proxy-effective-helpers branch May 13, 2026 21:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant