feat: add upgrade-lock label to skip node drain during rolling updates #2062
samuelstolicny merged 9 commits into master
Conversation
Operators can label nodes with claudie.io/upgrade-lock to have Claudie skip the drain phase until they remove the label. This prevents data loss on StatefulSets whose replication may take longer than the drain timeout.
Walkthrough: Adds a coordination-only `claudie.io/upgrade-lock` node label; nodes carrying it are skipped during the DELETE_NODES drain phase until the label is removed.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Deleter as Deleter (kuber worker)
    participant Kubectl as internal/kubectl
    participant KubeAPI as Kubernetes API
    participant Operator as Operator / Human

    Deleter->>Kubectl: KubectlNodeHasLabel(nodeName, "claudie.io/upgrade-lock")
    Kubectl->>KubeAPI: kubectl get node <nodeName> -o json
    KubeAPI-->>Kubectl: node JSON (includes metadata.labels)
    Kubectl-->>Deleter: bool (label present?) / error
    alt label present
        Deleter->>Deleter: log "node skipped (upgrade lock)"
        Deleter-->>Deleter: mark locked, continue without cordon/drain/delete
    else label absent
        Deleter->>Deleter: proceed with cordon/drain/delete
    end
    Note over Operator,Deleter: Operator removes label after replication/verification
    Operator->>KubeAPI: kubectl label node <node> claudie.io/upgrade-lock-
    Deleter->>Kubectl: subsequent run detects no label and deletes node
```
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (warning)
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
services/kuber/internal/worker/service/internal/nodes/delete.go (1)
141-215: ⚠️ Potential issue | 🔴 Critical: locked master nodes are still removed from etcd.
The `continue` at Line 154 skips cordon/drain/delete for a locked master, but any unlocked master processed later in the loop calls `d.deleteFromEtcd(...)` at Line 207, which iterates over all of `d.masterNodes` (see `deleteFromEtcd` at Lines 342–362) and issues `etcdctl member remove` for every entry it finds, including the ones that were supposed to be protected by the upgrade-lock.

Effects:
- The locked master is evicted from etcd while still running, breaking quorum semantics and potentially corrupting the control plane — the exact data-loss class this feature is meant to prevent.
- On retry, after the operator removes the label, the node is drained normally, but in the meantime its etcd membership has already been torn down.
Suggested fix: build a filtered view of masters to act on (excluding locked ones) and pass it to `deleteFromEtcd`, or short-circuit `deleteFromEtcd` per-node. For example, collect locked names up front and skip them both in the drain loop and inside `deleteFromEtcd`.

🔧 Sketch

```diff
-	var locked bool
+	var locked bool
+	lockedNames := map[string]struct{}{}
@@
-	} else if hasLock {
+	} else if hasLock {
 		logger.Info().Msgf("node %s has upgrade-lock label, skipping drain", master.k8sName)
 		locked = true
+		lockedNames[master.k8sName] = struct{}{}
 		continue
 	}
@@
-	if err := d.deleteFromEtcd(logger, kubectl); err != nil {
+	if err := d.deleteFromEtcd(logger, kubectl, lockedNames); err != nil {
```

And in `deleteFromEtcd`, skip entries whose `k8sName` is in `lockedNames` before issuing `member remove`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@services/kuber/internal/worker/service/internal/nodes/delete.go` around lines 141 - 215, The loop currently skips cordon/drain for locked masters (d.masterNodes) but still calls d.deleteFromEtcd(...) which iterates all masters; collect locked master k8sNames (e.g., when kubectl.KubectlNodeHasLabel returns true) into a set (lockedNames), skip any locked master in the main loop (continue as now) and pass lockedNames into deleteFromEtcd (or alter deleteFromEtcd to accept the set) so deleteFromEtcd ignores entries whose master.k8sName is in lockedNames; likewise ensure deleteNodesByName is only called for non-locked masters or is given the filtered list to avoid removing locked members from etcd or deleting nodes protected by upgrade-lock.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/kubectl/kubectl.go`:
- Around line 152-169: KubectlNodeHasLabel currently treats any runWithOutput
error as "label absent", which allows deletes to proceed on transient failures;
change KubectlNodeHasLabel (and its use of runWithOutput) to differentiate
definitive "node not found / label absent" from transient/unknown errors: if the
kubectl command returns empty output or an explicit "not found" indication,
return (false, nil); otherwise return a retryable sentinel error (reuse
ErrUpgradeLocked or define a new ErrCouldNotDetermineLabel) so callers in
delete.go (which already handle ErrUpgradeLocked) will defer deletion and retry
instead of proceeding. Ensure references include KubectlNodeHasLabel,
runWithOutput, and ErrUpgradeLocked (or the new sentinel) so the retry path is
triggered on transient failures.
- Around line 152-169: KubectlNodeHasLabel currently treats empty label values
as absent and attempts to query a dotted jsonpath (only escaping dots) which
fails for keys with slashes; change KubectlNodeHasLabel to call getKubeconfig as
before but run kubectl with "-o jsonpath='{.metadata.labels}'" via
runWithOutput, unmarshal the returned JSON into a map[string]string and then
check presence with a map lookup (_, ok := labels[labelKey]) to return
true/false; keep existing error handling and cleanup from getKubeconfig and
return any runWithOutput/unmarshal errors.
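Taken together, the two comments above suggest a label check that (a) decides presence with a map lookup on `metadata.labels` rather than a jsonpath on the dotted key, and (b) only answers "no label" when kubectl actually succeeded. Below is a minimal standalone sketch of that shape, shelling out to `kubectl get node <name> -o json` as the sequence diagram describes; `nodeHasLabel` and `ErrCouldNotDetermineLabel` are illustrative names, not Claudie's actual `internal/kubectl` API.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os/exec"
	"strings"
)

// ErrCouldNotDetermineLabel signals that the label state is unknown (API
// unreachable, auth failure, timeout, ...). Callers should treat the node as
// locked and retry later instead of draining it.
var ErrCouldNotDetermineLabel = errors.New("could not determine upgrade-lock label state")

// nodeHasLabel reports whether nodeName carries labelKey. Presence is decided
// by a map lookup on metadata.labels, so labels with empty values and keys
// containing "/" or "." are handled correctly.
func nodeHasLabel(nodeName, labelKey string) (bool, error) {
	out, err := exec.Command("kubectl", "get", "node", nodeName, "-o", "json").CombinedOutput()
	if err != nil {
		if strings.Contains(string(out), "NotFound") {
			// The node no longer exists; there is nothing left to protect.
			return false, nil
		}
		// Anything else is a transient/unknown failure: surface it as a
		// retryable error rather than silently reporting "no label".
		return false, fmt.Errorf("%w: %v: %s", ErrCouldNotDetermineLabel, err, out)
	}

	var node struct {
		Metadata struct {
			Labels map[string]string `json:"labels"`
		} `json:"metadata"`
	}
	if err := json.Unmarshal(out, &node); err != nil {
		return false, fmt.Errorf("%w: %v", ErrCouldNotDetermineLabel, err)
	}
	_, ok := node.Metadata.Labels[labelKey]
	return ok, nil
}

func main() {
	locked, err := nodeHasLabel("worker-1", "claudie.io/upgrade-lock")
	switch {
	case errors.Is(err, ErrCouldNotDetermineLabel):
		fmt.Println("label state unknown; deferring drain and retrying later")
	case err != nil:
		fmt.Println("unexpected error:", err)
	case locked:
		fmt.Println("node is upgrade-locked; skipping drain")
	default:
		fmt.Println("no upgrade-lock; safe to drain")
	}
}
```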
In `@services/kuber/internal/worker/service/internal/nodes/delete.go`:
- Around line 149-155: The error branch when calling KubectlNodeHasLabel
(checking master.k8sName) must not fall through to drain; treat any error as an
assumed locked node: set locked = true, log the error at Warn/Info that we are
treating the node as locked, and continue so the node is retried instead of
drained; apply the same change to the analogous worker loop check around lines
226–232 where KubectlNodeHasLabel is used. Ensure you only modify the error
branch (the else-if hasLock branch stays the same) and reference
KubectlNodeHasLabel, master.k8sName, logger, and the locked variable when making
the change.
- Around line 306-309: The code in deleteNodes currently returns
errors.Join(errDel, ErrUpgradeLocked) when locked, which causes the pipeline to
treat the result as a fatal error; instead, stop propagating ErrUpgradeLocked as
part of the returned error so the stage can be retried: when locked is true,
return only errDel (or nil if errDel is nil) and record the upgrade-locked
condition via a diagnostic/logging path (do not include ErrUpgradeLocked in the
returned error). Update the branch handling the locked variable in deleteNodes
so ErrUpgradeLocked is not joined into the returned error but still
logged/tracked separately.
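The two comments above describe a single control-flow change in the delete loop: an inconclusive label check should defer the node rather than drain it, and the upgrade-locked condition should be recorded rather than joined into the error returned to the pipeline. The following self-contained sketch shows that flow under stated assumptions; `hasLabelFn`, `drainFn` and the plain `log` package are stand-ins, not Claudie's actual `delete.go` types or logger.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// ErrUpgradeLocked mirrors the sentinel used by the PR; in this variant it is
// only logged, never joined into the error returned to the pipeline.
var ErrUpgradeLocked = errors.New("node has upgrade-lock label")

// hasLabelFn stands in for kubectl.KubectlNodeHasLabel.
type hasLabelFn func(node string) (bool, error)

// drainFn stands in for the cordon/drain/delete sequence.
type drainFn func(node string) error

// deleteNodes sketches the reviewer-suggested control flow: nodes whose lock
// state is unknown or locked are skipped (and retried on the next run), while
// only genuine drain failures are returned as errors.
func deleteNodes(nodes []string, hasLabel hasLabelFn, drain drainFn) ([]string, error) {
	var skipped []string
	var errDel error
	for _, n := range nodes {
		locked, lErr := hasLabel(n)
		if lErr != nil {
			// Unknown state: do NOT fall through to drain. Treat as locked.
			log.Printf("could not check upgrade-lock on %s, assuming locked: %v", n, lErr)
			skipped = append(skipped, n)
			continue
		}
		if locked {
			log.Printf("node %s has upgrade-lock label, skipping drain", n)
			skipped = append(skipped, n)
			continue
		}
		if dErr := drain(n); dErr != nil {
			errDel = errors.Join(errDel, fmt.Errorf("deleting %s: %w", n, dErr))
		}
	}
	if len(skipped) > 0 {
		// Record the locked condition for diagnostics, but keep it out of the
		// returned error so the stage is retried rather than failed.
		log.Printf("%d node(s) deferred: %v", len(skipped), ErrUpgradeLocked)
	}
	return skipped, errDel
}

func main() {
	skipped, err := deleteNodes(
		[]string{"worker-1", "worker-2"},
		func(n string) (bool, error) { return n == "worker-1", nil },
		func(n string) error { return nil },
	)
	fmt.Println("deferred nodes:", skipped, "error:", err)
}
```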
---
Outside diff comments:
In `@services/kuber/internal/worker/service/internal/nodes/delete.go`:
- Around line 141-215: The loop currently skips cordon/drain for locked masters
(d.masterNodes) but still calls d.deleteFromEtcd(...) which iterates all
masters; collect locked master k8sNames (e.g., when kubectl.KubectlNodeHasLabel
returns true) into a set (lockedNames), skip any locked master in the main loop
(continue as now) and pass lockedNames into deleteFromEtcd (or alter
deleteFromEtcd to accept the set) so deleteFromEtcd ignores entries whose
master.k8sName is in lockedNames; likewise ensure deleteNodesByName is only
called for non-locked masters or is given the filtered list to avoid removing
locked members from etcd or deleting nodes protected by upgrade-lock.
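The comment and the diff sketch in the review both point at passing a set of locked node names into the etcd-removal step. A small sketch of what that filter could look like is below; `etcdMember`, `lockedSet` and `removeMemberFn` are hypothetical stand-ins and do not reflect Claudie's actual `deleteFromEtcd` signature.

```go
package main

import "fmt"

// lockedSet holds the k8s node names protected by the upgrade-lock label,
// collected while iterating the masters in the drain loop.
type lockedSet map[string]struct{}

type etcdMember struct {
	k8sName  string
	memberID string
}

// removeMemberFn stands in for running "etcdctl member remove <id>".
type removeMemberFn func(memberID string) error

// deleteFromEtcd sketches the suggested filter: every master whose name is in
// locked is left untouched, so its etcd membership (and quorum) survives until
// the operator releases the lock.
func deleteFromEtcd(masters []etcdMember, locked lockedSet, remove removeMemberFn) error {
	for _, m := range masters {
		if _, ok := locked[m.k8sName]; ok {
			fmt.Printf("master %s is upgrade-locked, keeping etcd member %s\n", m.k8sName, m.memberID)
			continue
		}
		if err := remove(m.memberID); err != nil {
			return fmt.Errorf("removing etcd member %s (%s): %w", m.memberID, m.k8sName, err)
		}
	}
	return nil
}

func main() {
	masters := []etcdMember{{"master-1", "a1b2"}, {"master-2", "c3d4"}}
	locked := lockedSet{"master-1": {}}
	_ = deleteFromEtcd(masters, locked, func(id string) error {
		fmt.Println("etcdctl member remove", id)
		return nil
	})
}
```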
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9e07f3c4-f959-4f3d-b228-3270f232bb3e
📒 Files selected for processing (3)
- docs/update/update.md
- internal/kubectl/kubectl.go
- services/kuber/internal/worker/service/internal/nodes/delete.go
If there's only a subset of the nodes that have the label, what happens to the rest?
All nodes with the `claudie.io/upgrade-lock` label are skipped. The workflow then repeats until the labels are removed.
Summary
Adds an operator-controlled "pause button" for Claudie rolling updates. Any node labeled with `claudie.io/upgrade-lock` is skipped during the DELETE_NODES drain phase, and Claudie keeps retrying until the operator removes the label.

Closes #2021.
Solution
When the operator applies the label `claudie.io/upgrade-lock` to a node, Claudie's DeleteNodes loop skips cordon/drain on that node, returns `ErrUpgradeLocked`, and relies on the existing task-retry mechanism to reschedule DELETE_NODES until the label is removed.
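To illustrate the retry side of that interplay, here is a hedged sketch of how a task handler could react to the sentinel: an upgrade-locked result reschedules the stage instead of failing it. The `requeue`/`fail` helpers and the handler itself are hypothetical and do not depict Claudie's actual scheduler; only `ErrUpgradeLocked` and the ~25s cadence come from this PR.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrUpgradeLocked is the sentinel returned by the DELETE_NODES stage when at
// least one node still carries claudie.io/upgrade-lock.
var ErrUpgradeLocked = errors.New("upgrade locked")

// requeue and fail stand in for whatever the real task scheduler does.
func requeue(task string, after time.Duration) { fmt.Printf("requeue %s in %s\n", task, after) }
func fail(task string, err error)              { fmt.Printf("fail %s: %v\n", task, err) }

// handleDeleteNodesResult shows the intended flow: an upgrade-locked result is
// not a failure, it simply reschedules DELETE_NODES until the operator removes
// the label; any other error fails the task as usual.
func handleDeleteNodesResult(err error) {
	switch {
	case err == nil:
		fmt.Println("DELETE_NODES finished")
	case errors.Is(err, ErrUpgradeLocked):
		requeue("DELETE_NODES", 25*time.Second) // matches the ~25s retries observed in testing
	default:
		fail("DELETE_NODES", err)
	}
}

func main() {
	handleDeleteNodesResult(ErrUpgradeLocked)
	handleDeleteNodesResult(nil)
}
```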
Why a label and not a taint

The original design used a taint, but testing on a live cluster showed that Claudie's PATCH_NODES stage uses `"replace"` on `/spec/taints`, which wipes the entire taints array on every reconcile. The operator-applied taint never survived long enough for DELETE_NODES to see it.

Labels use a per-key JSON patch (`replace /metadata/labels/<key>`), so operator-applied labels are preserved across reconciles. Using a label also matches the semantics better: the upgrade-lock is a coordination signal with Claudie, not a scheduling rule. Operators who want additional scheduling protection can apply their own taints independently.

Operator workflow
Testing
End-to-end validated on a live Hetzner cluster (1 control-plane + 2 workers, template tag bumped v0.10.0 -> v0.11.0):
- Observed `node <name> has upgrade-lock label, skipping drain` in the logs
- `ErrUpgradeLocked` rescheduled the task; observed retries every ~25s

Test plan
- `go build ./...`

Summary by CodeRabbit
Documentation
New Features
Chores