CNTRLPLANE-2449: test(sig-builds][Feature:Builds]): ignore watchapi.Bookmark as they are synthetic and used only for signaling sync completion#30885
Conversation
WalkthroughIgnores synthetic Bookmark events emitted by the WatchList protocol when waiting for build-related watch events in tests, ensuring loops continue until real Build events arrive. Adds the watchapi import to detect Bookmark event types. (45 words) Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.11.3)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
e4b57fa to
1872080
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/extended/builds/webhook.go`:
- Around line 338-347: When reading from watch.ResultChan() in the select
branches (the case handling at both the block around event.Type checks and the
later similar block), guard against a closed channel and non-Build payloads by:
replace the blind receive with a two-value receive (event, ok :=
<-watch.ResultChan()); if !ok or event == nil then continue/wait instead of
proceeding; treat watchapi.Error events specially by continuing (skip) rather
than asserting; and replace the direct type assertion
event.Object.(*buildv1.Build) with a safe assertion (obj, ok :=
event.Object.(*buildv1.Build)); if ok is false then continue so non-Build
objects don’t cause a panic. Ensure you apply these checks in both places where
you currently check for watchapi.Bookmark so the test won’t fail on closed
channels, Error events, or unexpected object types.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 588ff0ea-660a-4f8b-bc7f-4a89ba0b0702
📒 Files selected for processing (1)
test/extended/builds/webhook.go
| case event := <-watch.ResultChan(): | ||
| // Ignore bookmark events from WatchList protocol. | ||
| // When watching with an empty ResourceVersion, the apiserver uses WatchList and sends | ||
| // bookmark events to signal sync completion. These are synthetic events (not actual | ||
| // Build objects) used for caching coordination, so we skip them and continue waiting | ||
| // for real Build events. | ||
| if event.Type == watchapi.Bookmark { | ||
| continue | ||
| } | ||
| actual := event.Object.(*buildv1.Build) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify unsafe direct type assertions after watch.ResultChan() reads.
# Expected result: occurrences in this file where event.Object is asserted as *buildv1.Build
# without guarding channel closure/object type first.
rg -nP --type=go -C3 'case\s+\w+\s*:=\s*<-\s*\w+\.ResultChan\(\)|\.\s*Object\.\(\*buildv1\.Build\)' test/extended/builds/webhook.goRepository: openshift/origin
Length of output: 1269
🏁 Script executed:
# Search for all watch.ResultChan() usages in the file
rg -n 'ResultChan\(\)' test/extended/builds/webhook.goRepository: openshift/origin
Length of output: 144
🏁 Script executed:
# Search for all event.Object type assertions or accesses
rg -n 'event\.Object' test/extended/builds/webhook.goRepository: openshift/origin
Length of output: 153
🏁 Script executed:
# Get the full context around both flagged sections to understand the watch handling
sed -n '330,410p' test/extended/builds/webhook.goRepository: openshift/origin
Length of output: 3201
Guard watch channel closure and type assertions to prevent panics and flaky test failures.
The code at lines 338–347 and 395–403 reads from watch.ResultChan() without checking channel closure and directly asserts event.Object.(*buildv1.Build) without verifying the type. This can panic if the channel closes or if the event is not a Build object (e.g., an Error event from the Kubernetes API). Both locations skip only Bookmark events but should also handle Error events and gracefully skip non-Build payloads.
Proposed fix
@@
- case event := <-watch.ResultChan():
+ case event, ok := <-watch.ResultChan():
+ if !ok {
+ t.Fatalf("build watch channel closed")
+ }
// Ignore bookmark events from WatchList protocol.
@@
if event.Type == watchapi.Bookmark {
continue
}
- actual := event.Object.(*buildv1.Build)
+ if event.Type == watchapi.Error {
+ t.Fatalf("watch returned error event: %#v", event.Object)
+ }
+ actual, ok := event.Object.(*buildv1.Build)
+ if !ok {
+ continue
+ }
@@
- case event := <-watch.ResultChan():
+ case event, ok := <-watch.ResultChan():
+ if !ok {
+ t.Fatalf("build watch channel closed")
+ }
// Ignore bookmark events from WatchList protocol.
@@
if event.Type == watchapi.Bookmark {
continue
}
- build := event.Object.(*buildv1.Build)
+ if event.Type == watchapi.Error {
+ t.Fatalf("watch returned error event: %#v", event.Object)
+ }
+ build, ok := event.Object.(*buildv1.Build)
+ if !ok {
+ continue
+ }
t.Fatalf("Unexpected build created: %#v", build)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/extended/builds/webhook.go` around lines 338 - 347, When reading from
watch.ResultChan() in the select branches (the case handling at both the block
around event.Type checks and the later similar block), guard against a closed
channel and non-Build payloads by: replace the blind receive with a two-value
receive (event, ok := <-watch.ResultChan()); if !ok or event == nil then
continue/wait instead of proceeding; treat watchapi.Error events specially by
continuing (skip) rather than asserting; and replace the direct type assertion
event.Object.(*buildv1.Build) with a safe assertion (obj, ok :=
event.Object.(*buildv1.Build)); if ok is false then continue so non-Build
objects don’t cause a panic. Ensure you apply these checks in both places where
you currently check for watchapi.Bookmark so the test won’t fail on closed
channels, Error events, or unexpected object types.
…thetic and used only for signaling sync completion
…api.Bookmark as they are synthetic and used only for signaling sync completion
…sed only for signaling sync completion
36e820e to
7554a41
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/extended/builds/controllers.go`:
- Around line 130-149: The timeout is created inside the loop in waitForWatch
(and similarly used in waitForWatchType), causing the deadline to be reset on
each select iteration; move to an absolute timeout by creating a single
deadline/timer before the loop (e.g., compute deadline :=
time.Now().Add(BuildControllersWatchTimeout) or create one
time.NewTimer(BuildControllersWatchTimeout)) and use that single timer (or
compute remaining time with time.Until(deadline)) inside the loop/select so
bookmark events cannot extend the overall timeout; ensure to Stop() any timer
when exiting to avoid leaks and adjust waitForWatchType's i-- logic to rely on
the absolute deadline instead of per-iteration time.After.
In `@test/extended/builds/run_policy.go`:
- Around line 85-95: The watch loops reading from buildWatch.ResultChan() must
check the receive-ok and handle non-Build event types before doing a straight
type assertion; update each loop (the receive from buildWatch.ResultChan() in
run_policy.go) to first capture the receive like evt, ok :=
<-buildWatch.ResultChan() and if !ok handle the closed channel
(break/return/fail as appropriate), then skip watchapi.Bookmark as before,
explicitly handle watchapi.Error by casting evt.Object to *metav1.Status to
surface the error, and perform a safe type assertion for *buildv1.Build (build,
isBuild := evt.Object.(*buildv1.Build)) and only access build fields when
isBuild && build != nil to avoid nil pointer panics.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8bc84b6e-15d2-4aa9-a092-b73885903573
📒 Files selected for processing (3)
test/extended/builds/controllers.gotest/extended/builds/run_policy.gotest/extended/builds/webhook.go
🚧 Files skipped from review as they are similar to previous changes (1)
- test/extended/builds/webhook.go
| func waitForWatch(t testingT, name string, w watchapi.Interface) *watchapi.Event { | ||
| select { | ||
| case e, ok := <-w.ResultChan(): | ||
| if !ok { | ||
| t.Fatalf("Channel closed waiting for watch: %s", name) | ||
| for { | ||
| select { | ||
| case e, ok := <-w.ResultChan(): | ||
| if !ok { | ||
| t.Fatalf("Channel closed waiting for watch: %s", name) | ||
| } | ||
| // Ignore bookmark events from WatchList protocol. | ||
| // When watching with an empty ResourceVersion, the apiserver uses WatchList and sends | ||
| // bookmark events to signal sync completion. These are synthetic events (not actual | ||
| // object changes) used for caching coordination, so we skip them and continue waiting | ||
| // for real events. | ||
| if e.Type == watchapi.Bookmark { | ||
| continue | ||
| } | ||
| return &e | ||
| case <-time.After(BuildControllersWatchTimeout): | ||
| t.Fatalf("Timed out waiting for watch: %s", name) | ||
| return nil | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's find and examine the file in question
find . -name "controllers.go" -path "*/test/extended/builds/*" | head -5Repository: openshift/origin
Length of output: 97
🏁 Script executed:
# Check if the file exists and get an overview
if [ -f "test/extended/builds/controllers.go" ]; then
wc -l "test/extended/builds/controllers.go"
else
echo "File not found at expected path"
fiRepository: openshift/origin
Length of output: 99
🏁 Script executed:
# Read the specific lines around line 130-149 and 389-404
cat -n test/extended/builds/controllers.go | sed -n '125,155p'Repository: openshift/origin
Length of output: 1333
🏁 Script executed:
# Read the second function mentioned
cat -n test/extended/builds/controllers.go | sed -n '384,410p'Repository: openshift/origin
Length of output: 1316
🏁 Script executed:
# Verify Go time.After() behavior by checking documentation/semantics
# Let's also check if there are any other timeout patterns in the file for comparison
grep -n "time.After\|time.NewTimer" test/extended/builds/controllers.goRepository: openshift/origin
Length of output: 316
🏁 Script executed:
# Check if BuildControllersWatchTimeout is defined and what value it has
grep -n "BuildControllersWatchTimeout" test/extended/builds/controllers.go | head -10Repository: openshift/origin
Length of output: 324
Move timeout outside loops to ensure absolute timeout, not per-iteration reset.
Lines 146 and 401 create time.After(BuildControllersWatchTimeout) inside select statements within loops. With frequent bookmark events, each iteration resets the timeout to its full duration; the effective timeout becomes per-iteration rather than absolute. This can cause indefinite hangs when bookmark traffic is sustained. Line 394's i-- logic in waitForWatchType compounds the issue by preventing progress while timeout keeps resetting.
Suggested fix
func waitForWatch(t testingT, name string, w watchapi.Interface) *watchapi.Event {
+ timeout := time.After(BuildControllersWatchTimeout)
for {
select {
case e, ok := <-w.ResultChan():
if !ok {
t.Fatalf("Channel closed waiting for watch: %s", name)
}
if e.Type == watchapi.Bookmark {
continue
}
return &e
- case <-time.After(BuildControllersWatchTimeout):
+ case <-timeout:
t.Fatalf("Timed out waiting for watch: %s", name)
return nil
}
}
}
@@
func waitForWatchType(t testingT, name string, w watchapi.Interface, expect watchapi.EventType) *watchapi.Event {
- tries := 10
- for i := 0; i < tries; i++ {
+ tries := 10
+ timeout := time.After(BuildControllersWatchTimeout)
+ seen := 0
+ for seen < tries {
select {
- case e := <-w.ResultChan():
+ case e, ok := <-w.ResultChan():
+ if !ok {
+ t.Fatalf("Channel closed waiting for watch: %s", name)
+ }
if e.Type == watchapi.Bookmark {
- i-- // Don't consume a try for bookmark events
continue
}
+ seen++
if e.Type != expect {
continue
}
return &e
- case <-time.After(BuildControllersWatchTimeout):
+ case <-timeout:
t.Fatalf("Timed out waiting for watch: %s", name)
return nil
}
}
t.Fatalf("Waited for a %v event with %d tries but never received one", expect, tries)
return nil
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/extended/builds/controllers.go` around lines 130 - 149, The timeout is
created inside the loop in waitForWatch (and similarly used in
waitForWatchType), causing the deadline to be reset on each select iteration;
move to an absolute timeout by creating a single deadline/timer before the loop
(e.g., compute deadline := time.Now().Add(BuildControllersWatchTimeout) or
create one time.NewTimer(BuildControllersWatchTimeout)) and use that single
timer (or compute remaining time with time.Until(deadline)) inside the
loop/select so bookmark events cannot extend the overall timeout; ensure to
Stop() any timer when exiting to avoid leaks and adjust waitForWatchType's i--
logic to rely on the absolute deadline instead of per-iteration time.After.
| event := <-buildWatch.ResultChan() | ||
| // Ignore bookmark events from WatchList protocol. | ||
| // When watching with an empty ResourceVersion, the apiserver uses WatchList and sends | ||
| // bookmark events to signal sync completion. These are synthetic events (not actual | ||
| // Build objects) used for caching coordination, so we skip them and continue waiting | ||
| // for real Build events. | ||
| if event.Type == watchapi.Bookmark { | ||
| continue | ||
| } | ||
| build := event.Object.(*buildv1.Build) | ||
| o.Expect(IsBuildComplete(build)).Should(o.BeFalse()) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, verify the file exists and check its general structure
wc -l test/extended/builds/run_policy.goRepository: openshift/origin
Length of output: 98
🏁 Script executed:
# Check the imports at the top of the file to verify watchapi import
head -20 test/extended/builds/run_policy.goRepository: openshift/origin
Length of output: 468
🏁 Script executed:
# Now examine the specific line ranges mentioned in the review
# Check lines 85-95 first
sed -n '80,100p' test/extended/builds/run_policy.go | cat -nRepository: openshift/origin
Length of output: 1091
🏁 Script executed:
# Check all 8 locations to confirm the pattern is repeated
# Lines 111-121
sed -n '106,126p' test/extended/builds/run_policy.go | cat -nRepository: openshift/origin
Length of output: 892
🏁 Script executed:
# Lines 182-192
sed -n '177,197p' test/extended/builds/run_policy.go | cat -nRepository: openshift/origin
Length of output: 1277
🏁 Script executed:
# Lines 262-272
sed -n '257,277p' test/extended/builds/run_policy.go | cat -nRepository: openshift/origin
Length of output: 1260
🏁 Script executed:
# Lines 322-332
sed -n '317,337p' test/extended/builds/run_policy.go | cat -nRepository: openshift/origin
Length of output: 1281
🏁 Script executed:
# Lines 410-420
sed -n '405,425p' test/extended/builds/run_policy.go | cat -nRepository: openshift/origin
Length of output: 1271
🏁 Script executed:
# Lines 478-488
sed -n '473,493p' test/extended/builds/run_policy.go | cat -nRepository: openshift/origin
Length of output: 1141
🏁 Script executed:
# Lines 509-519
sed -n '504,524p' test/extended/builds/run_policy.go | cat -nRepository: openshift/origin
Length of output: 1213
Handle closed channels and error events before casting to *buildv1.Build in watch loops.
At lines 86, 112, 183, 263, 323, 411, 479, and 510, the code receives from ResultChan() without checking the ok flag. When the channel closes, this returns a zero-value Event with a nil Object. The subsequent type assertion event.Object.(*buildv1.Build) silently succeeds (returning nil), but accessing fields like build.Status.Phase causes a nil pointer panic. Additionally, the code filters watchapi.Bookmark events but ignores watchapi.Error events; an error event's Object contains *metav1.Status, not *buildv1.Build, causing the type assertion to panic.
Suggested pattern (apply to each loop)
- event := <-buildWatch.ResultChan()
+ event, ok := <-buildWatch.ResultChan()
+ o.Expect(ok).To(o.BeTrue(), "build watch channel closed unexpectedly")
if event.Type == watchapi.Bookmark {
continue
}
- build := event.Object.(*buildv1.Build)
+ o.Expect(event.Type).NotTo(o.Equal(watchapi.Error), "received watch error event")
+ build, ok := event.Object.(*buildv1.Build)
+ o.Expect(ok).To(o.BeTrue(), "unexpected watch object type: %T", event.Object)Also applies to: 112, 183, 263, 323, 411, 479, 510
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/extended/builds/run_policy.go` around lines 85 - 95, The watch loops
reading from buildWatch.ResultChan() must check the receive-ok and handle
non-Build event types before doing a straight type assertion; update each loop
(the receive from buildWatch.ResultChan() in run_policy.go) to first capture the
receive like evt, ok := <-buildWatch.ResultChan() and if !ok handle the closed
channel (break/return/fail as appropriate), then skip watchapi.Bookmark as
before, explicitly handle watchapi.Error by casting evt.Object to *metav1.Status
to surface the error, and perform a safe type assertion for *buildv1.Build
(build, isBuild := evt.Object.(*buildv1.Build)) and only access build fields
when isBuild && build != nil to avoid nil pointer panics.
|
Validating the changes in openshift/openshift-apiserver#616 (assuming the tests changes actually get picked in the CI) |
|
Scheduling required tests: Scheduling tests matching the |
| select { | ||
| case <-timer.C: | ||
| // nothing should happen - test passed | ||
| break Loop |
There was a problem hiding this comment.
What is the reason of this change?
There was a problem hiding this comment.
Previously there was only a select statement so catching the bookmark event type would trigger the second event := <-watch.ResultChan() case and left the select without waiting again for checking whether there's another (this time valid) *buildv1.Build object.
|
Both CI jobs are going green now in openshift/openshift-apiserver#616 with the changes in here. |
|
/approve |
|
/lgtm |
1 similar comment
|
/lgtm |
|
/retest-required |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ardaguclu, ingvagabund, Moebasim, stbenjam The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
I think, it would be desirable to see https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/30885/pull-ci-openshift-origin-main-e2e-aws-ovn-serial-1of2/2033488781524340736 green |
|
/jira refresh |
|
@ingvagabund: This pull request references CNTRLPLANE-2449 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/retest-required |
|
/verified by CI |
|
@ingvagabund: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@ingvagabund: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Ignore bookmark events from WatchList protocol. Bookmarks are synthetic events sent by the apiserver to signal sync completion (not actual Build objects), so they should not be treated as unexpected builds.