fix(e2e): strict wireserver validation — fail fast on unexpected curl exits #8580
Open
r2k1 wants to merge 1 commit into
… exits

The previous validation retried for 1 minute, passing if curl eventually timed out (exit 28). This had two problems:

1. Silently accepted other "success-looking" exit codes (e.g. 0 = reachable) if they happened on the last poll iteration in earlier variants.
2. Retried through what is fundamentally a binary security check: any successful curl from a pod means the FORWARD DROP/REJECT rules are missing or wrong, which is a regression to surface immediately, not a transient condition to wait out.

Changes:

- Whitelist exit codes 28 (FORWARD DROP timeout) and 7 (FORWARD REJECT refused) as the only valid "wireserver blocked" signals.
- Anything else fails loudly with full diagnostics: FORWARD chain, KUBE-FORWARD chain, iptables-save filter, and conntrack entries for the wireserver IP.
- Retry the exec call only on transient kube-apiserver exec failures, never on the curl result itself; a single observation of an unexpected exit code is enough to fail the security check.

This is strictly more defensive than the original (which only accepted exit 28) because it also accepts REJECT-based blocks, while failing on every other class of regression instead of swallowing them.

Extracted from #8480.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
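The retry rule in the commit message (retry the exec, never the curl result) can be sketched as below. This is a minimal illustration, not the code in `e2e/validation.go`: `execResult`, `checkWireServerBlocked`, and the `execFn` callback are hypothetical names standing in for whatever pod exec helper the test suite actually uses.

```go
package main

import (
	"errors"
	"fmt"
)

// execResult is a hypothetical stand-in for what a pod exec helper returns.
type execResult struct {
	ExitCode string // curl's exit code as reported by the exec call
}

// checkWireServerBlocked retries execFn only when the exec itself fails
// (err != nil). The first curl exit code observed decides the outcome.
func checkWireServerBlocked(execFn func() (execResult, error), maxAttempts int) error {
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		res, err := execFn()
		if err != nil {
			lastErr = err // transient exec failure: try again
			continue
		}
		switch res.ExitCode {
		case "28", "7": // 28 = timed out (DROP), 7 = failed to connect (REJECT)
			return nil
		default:
			// Any other exit code fails immediately, with no further polling.
			return fmt.Errorf("unexpected curl exit code %q: wireserver may be reachable", res.ExitCode)
		}
	}
	return fmt.Errorf("exec failed after %d attempts: %w", maxAttempts, lastErr)
}

func main() {
	calls := 0
	flaky := func() (execResult, error) {
		calls++
		if calls == 1 {
			return execResult{}, errors.New("exec: connection reset")
		}
		return execResult{ExitCode: "28"}, nil
	}
	fmt.Println(checkWireServerBlocked(flaky, 3)) // <nil>
}
```

In this sketch a curl exit of `0` on the first successful exec returns an error right away, so a reachable wireserver cannot be masked by later iterations.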
Contributor
Pull request overview
This PR tightens the e2e security validator that ensures unprivileged pods cannot reach the Azure WireServer IP (168.63.129.16), changing it from “retry until timeout exit code appears” to “fail fast on any unexpected curl result,” while still retrying transient Kubernetes exec failures.
Changes:
- Accept curl exit codes `28` (timeout/DROP) and `7` (connect failed/REJECT) as the only valid signals that WireServer is blocked.
- Stop retrying based on curl outcomes; instead, fail immediately on any other curl exit code.
- Add richer failure diagnostics (FORWARD + KUBE-FORWARD chain, iptables-save filter excerpt, conntrack entries) when an unexpected exit code occurs.
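For orientation, the diagnostics named in the last bullet might be collected with commands along these lines. The exact commands the validator runs are not shown in this conversation, so the list below is an assumption; `diagnosticCommands` is a hypothetical helper, and the flags are standard `iptables`, `iptables-save`, and `conntrack` options.

```go
package main

import "fmt"

// wireServerIP is the Azure WireServer address named in the PR overview.
const wireServerIP = "168.63.129.16"

// diagnosticCommands returns shell commands that would dump the state listed
// in the PR. The command choice is an assumption, not the PR's actual code.
func diagnosticCommands(ip string) []string {
	return []string{
		"iptables -S FORWARD",      // rules in the FORWARD chain
		"iptables -S KUBE-FORWARD", // rules in the KUBE-FORWARD chain
		"iptables-save -t filter",  // full dump of the filter table
		"conntrack -L -d " + ip,    // conntrack entries destined for the IP
	}
}

func main() {
	for _, cmd := range diagnosticCommands(wireServerIP) {
		fmt.Println(cmd)
	}
}
```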
    },
    }

    allowedExitCodes := map[string]bool{"28": true, "7": true}
Contributor
This would be easier to read with a list of allowable exit codes rather than a map. The constant time lookup benefit we get below doesn't seem worth it given the length of time these tests take to run.
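The list-based alternative suggested here could look like the following sketch. It assumes Go 1.21 or later for the standard-library `slices` package; `isAllowedExitCode` is a hypothetical helper name, not something from the PR.

```go
package main

import (
	"fmt"
	"slices"
)

// allowedExitCodes lists the curl exit codes that mean "wireserver blocked":
// 28 (operation timed out) and 7 (failed to connect).
var allowedExitCodes = []string{"28", "7"}

// isAllowedExitCode does a linear scan, which is fine for a two-element list.
func isAllowedExitCode(code string) bool {
	return slices.Contains(allowedExitCodes, code)
}

func main() {
	for _, code := range []string{"28", "7", "0"} {
		fmt.Printf("exit %s allowed: %t\n", code, isAllowedExitCode(code))
	}
}
```

With only two entries, the linear scan costs nothing measurable next to a pod exec, and the declaration reads as a plain list of codes.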
What this PR does / why we need it:
Tightens the e2e wireserver-block validator so it fails fast on any unexpected curl exit code from an unprivileged pod, instead of retrying for a minute and only failing if no acceptable exit code ever showed up.
Why
`validateWireServerBlocked` is a security check: pods must not be able to reach the wireserver IP (168.63.129.16). The previous implementation retried for 1 minute, passing the check if curl eventually returned exit 28 (timeout). Two problems with that:
What
This is strictly more defensive than the original (which only accepted exit 28) because it also accepts REJECT-based blocks, while failing on every other class of regression instead of swallowing them.
Scope
`e2e/validation.go` only. Test-only change, no product code touched. Extracted from #8480.
Which issue(s) this PR fixes:
N/A — test-only hardening, no linked issue.