fix(kserve): add annotation to use HTTP logger endpoint for Raw deployments#689
Conversation
…yments When TrustyAI is deployed with KServe Raw InferenceServices, the operator injects an HTTPS logger URL. However, KServe's inference logger cannot verify the TLS certificate because *.svc.cluster.local certs are signed by OpenShift's internal Service CA, causing x509 verification failures. This commit adds an opt-in annotation `trustyai.opendatahub.io/kserve-logger-http` on the TrustyAIService CR. When set to "true", the operator injects an HTTP logger URL instead of HTTPS for KServe Raw deployments, avoiding the TLS issue for cluster-internal traffic. HTTPS remains the default behavior. ## Summary - Add `trustyai.opendatahub.io/kserve-logger-http` annotation support - When annotation="true", Raw deployments use HTTP logger URLs - HTTPS remains default (backward compatible) - No impact on Serverless or ModelMesh deployments ## Test plan - [ ] `go build ./...` passes - [ ] `make test` passes in CI - [ ] Deploy with annotation and verify HTTP logger URL - [ ] Deploy without annotation and verify HTTPS logger URL (backward compat) Fixes: RHOAIENG-38132 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📝 WalkthroughWalkthroughThis change introduces annotation-based control for KServe logger URL schemes. A new TrustyAIService CR annotation ( Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@controllers/tas/inference_services_test.go`:
- Around line 23-31: Replace the fake client usage in the BeforeEach when
constructing testReconciler with an envtest-backed controller-runtime client:
start an envtest.Environment, register the scheme entries (same scheme.Scheme
used now), create a manager and use mgr.GetClient() (or client.New with
mgr.GetConfig()) as the Client for TrustyAIServiceReconciler, and assign
testCtx; ensure you stop/tear down the envtest environment in AfterEach and load
any required CRDs so the envtest API server behavior matches real controller
tests instead of using fake.NewClientBuilder().
In `@controllers/tas/inference_services.go`:
- Around line 255-256: When removing operator-managed logger config
(remove==true) patchKServe currently only compares the existing logger URL
against one computed scheme and can skip deletion if the annotation changed
(HTTP vs HTTPS); update patchKServe to compute both operator-managed variants
(HTTP and HTTPS) and treat the existing logger URL as removable if it equals
either variant (or matches the operator-managed host pattern) so removal does
not depend on which scheme was last stored; adjust logic where patchKServe is
called (the call with !useHTTPLogger) and inside patchKServe to check both
schemes when remove is true.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e1dd1156-eeae-479b-a0a1-597f28beb247
📒 Files selected for processing (3)
controllers/tas/constants.gocontrollers/tas/inference_services.gocontrollers/tas/inference_services_test.go
| BeforeEach(func() { | ||
| testReconciler = &TrustyAIServiceReconciler{ | ||
| Client: fake.NewClientBuilder().WithScheme(scheme.Scheme).Build(), | ||
| Scheme: scheme.Scheme, | ||
| EventRecorder: record.NewFakeRecorder(10), | ||
| Namespace: operatorNamespace, | ||
| } | ||
| testCtx = context.Background() | ||
| }) |
There was a problem hiding this comment.
Use envtest-backed controller tests instead of fake client.
Line 25 builds the reconciler with fake.NewClientBuilder(), which does not meet the controller test harness requirement and can miss API-server behavior differences.
As per coding guidelines, "Use Ginkgo v2 with Gomega assertions and controller-runtime envtest for all controller tests."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@controllers/tas/inference_services_test.go` around lines 23 - 31, Replace the
fake client usage in the BeforeEach when constructing testReconciler with an
envtest-backed controller-runtime client: start an envtest.Environment, register
the scheme entries (same scheme.Scheme used now), create a manager and use
mgr.GetClient() (or client.New with mgr.GetConfig()) as the Client for
TrustyAIServiceReconciler, and assign testCtx; ensure you stop/tear down the
envtest environment in AfterEach and load any required CRDs so the envtest API
server behavior matches real controller tests instead of using
fake.NewClientBuilder().
| // Use HTTPS by default, unless the HTTP logger annotation is set | ||
| err := r.patchKServe(ctx, instance, infService, namespace, crName, remove, !useHTTPLogger) |
There was a problem hiding this comment.
Line 256 introduces scheme-coupled cleanup risk for remove=true.
patchKServe removal currently only deletes when the existing logger URL equals the computed scheme URL. After this change, if the annotation value changes before cleanup, removal can be skipped and the old logger remains configured.
💡 Proposed fix (make removal accept either operator-managed scheme)
func (r *TrustyAIServiceReconciler) patchKServe(ctx context.Context, instance *trustyaiopendatahubiov1alpha1.TrustyAIService, infService kservev1beta1.InferenceService, namespace string, crName string, remove bool, useHTTPS bool) error {
@@
if remove {
- if infService.Spec.Predictor.Logger == nil || *infService.Spec.Predictor.Logger.URL != url {
+ if infService.Spec.Predictor.Logger == nil || infService.Spec.Predictor.Logger.URL == nil {
return nil // Removing, but InferenceLogger is not set or is not set to the expected URL. Do nothing.
}
+ currentURL := *infService.Spec.Predictor.Logger.URL
+ httpsURL := utils.GenerateHTTPSKServeLoggerURL(crName, namespace)
+ httpURL := utils.GenerateKServeLoggerURL(crName, namespace)
+ if currentURL != httpsURL && currentURL != httpURL {
+ return nil // Do not remove non-operator-managed logger URLs.
+ }
// Remove the InferenceLogger
infService.Spec.Predictor.Logger = nil🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@controllers/tas/inference_services.go` around lines 255 - 256, When removing
operator-managed logger config (remove==true) patchKServe currently only
compares the existing logger URL against one computed scheme and can skip
deletion if the annotation changed (HTTP vs HTTPS); update patchKServe to
compute both operator-managed variants (HTTP and HTTPS) and treat the existing
logger URL as removable if it equals either variant (or matches the
operator-managed host pattern) so removal does not depend on which scheme was
last stored; adjust logic where patchKServe is called (the call with
!useHTTPLogger) and inside patchKServe to check both schemes when remove is
true.
|
@ruivieira: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: m-misiura The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
When TrustyAI is deployed with KServe Raw InferenceServices, the operator injects an HTTPS logger URL. However, KServe's inference logger cannot verify the TLS certificate because *.svc.cluster.local certs are signed by OpenShift's internal Service CA, causing x509 verification failures.
This commit adds an opt-in annotation
trustyai.opendatahub.io/kserve-logger-httpon the TrustyAIService CR. When set to "true", the operator injects an HTTP logger URL instead of HTTPS for KServe Raw deployments, avoiding the TLS issue for cluster-internal traffic. HTTPS remains the default behavior.Summary
trustyai.opendatahub.io/kserve-logger-httpannotation supportTest plan
go build ./...passesmake testpasses in CIFixes: RHOAIENG-38132
Summary by CodeRabbit
New Features
Tests