never let PodTemplateSpec rewrite Host Profiler seccomp init container#3135
never let PodTemplateSpec rewrite Host Profiler seccomp init container#3135theomagellan wants to merge 2 commits into
Conversation
This comment has been minimized.
This comment has been minimized.
ecd1dfa to
aed9541
Compare
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: aed954126e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if apicommon.AgentContainerName(initContainer.Name) == apicommon.HostProfilerSeccompSetupContainerName { | ||
| continue |
There was a problem hiding this comment.
Apply pull policy to the profiler seccomp init
When spec.override.nodeAgent.image.pullPolicy is set, this continue skips both the image rewrite and the pull-policy assignment below, so host-profiler-seccomp-setup keeps its default pull policy while the host-profiler container is updated. In clusters that reuse a mutable profiler tag and set Always/Never via the override, the init container can copy a stale or disallowed seccomp JSON before the main profiler image is pulled with the requested policy. The skip should avoid only changing the image, while still honoring override.Image.PullPolicy for this init container.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
The Host Profiler is only supported through annotations with the image-override-config. spec.override.nodeAgent.image.pullPolicy doesn't apply to the annotation.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3135 +/- ##
==========================================
- Coverage 43.59% 43.59% -0.01%
==========================================
Files 354 354
Lines 30224 30220 -4
==========================================
- Hits 13177 13174 -3
+ Misses 16176 16175 -1
Partials 871 871
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Harness.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: aed954126e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if apicommon.AgentContainerName(initContainer.Name) == apicommon.HostProfilerSeccompSetupContainerName { | ||
| continue |
There was a problem hiding this comment.
Keep the profiler seccomp image aligned with nodeAgent overrides
When Host Profiler is enabled and spec.override.nodeAgent.image points at a full replacement image, the regular container loop still rewrites the host-profiler container via OverrideAgentImage (line 77), but this new continue leaves host-profiler-seccomp-setup on the pre-override image and ManageNodeAgent has already hashed that pre-override image for LocalhostProfile. The pod then runs the profiler from the overridden image while copying/registering a seccomp profile from a different image/version, which can fail if the profile path or syscalls differ. This is not the earlier pull-policy comment; the fresh evidence is that the same patch's test expects the host-profiler container image to be overridden while this init container is not.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
spec.override.nodeAgent.image doesn't point at a full replacement image since the Host Profiler is not yet bundled in the agent image; the Host Profiler IS ONLY enabled by using its override annotation which has a higher precedence than spec.override.nodeAgent.image.
If the annotation wasn't set, we would be using the profiler's default image, which is the agent's image so it should also be correct for when the profiler is bundled in the agent's image.
This would need to be checked again when time comes, Codex is hard to follow here so I did my best to cover our bases.
There was a problem hiding this comment.
@theomagellan I wonder if this implies that host-profiler image override should be skipped similar to host profiler init container. This will resolve image based on the annotation but if host-profiler image override is present resolved image name will be wrong.
| // host-profiler-seccomp-setup copies a seccomp profile JSON baked into the profiler image, not the agent image | ||
| // until the profiler is bundled in the agent. | ||
| if apicommon.AgentContainerName(initContainer.Name) == apicommon.HostProfilerSeccompSetupContainerName { | ||
| continue |
There was a problem hiding this comment.
fyi, global settings may modify image registry and suffix
What does this PR do?
Fixes
host-profiler-seccomp-setupinit container using the wrong image.override.PodTemplateSpecrewrites all init container images tospec.override.nodeAgent.image, buthost-profiler-seccomp-setupmust use the profiler image to copy its seccomp profile, which the agent image doesn't have.The fix skips
host-profiler-seccomp-setupin the init container rewrite loop.Reverts the workaround introduced in [Host Profiler] Remove seccomp configmap and use profile baked into image #3061 (
resolveHostProfilerImagereadingspec.override.nodeAgent.image), which was trying to keep the hash and init container image consistent with each other around the wrong image.Motivation
host-profiler-seccomp-setupwas failing on nightly because it was running from the agent image and trying to copy a file that doesn't exist there.Additional Notes
Anything else we should know when reviewing?
Minimum Agent Versions
Are there minimum versions of the Datadog Agent and/or Cluster Agent required?
Describe your test plan
On test cluster using main:
spec.override.nodeAgent.imageRepeated these steps with fix included to confirm.
Checklist
bug,enhancement,refactoring,documentation,tooling, and/ordependenciesqa/skip-qalabel