feat: support graceful scale-down for AlluxioRuntime using AdvancedStatefulSet (#5805)
Conversation
Code Review
This pull request introduces graceful worker scale-down for Alluxio by decommissioning workers before they are terminated by the StatefulSet controller, ensuring cached blocks can be migrated. This functionality is gated by a new AdvancedStatefulSet feature. The changes include new decommissioning operations in AlluxioFileUtils, integration into the SyncReplicas reconciliation loop, and comprehensive unit tests. Review feedback suggests improving context propagation by replacing context.TODO(), optimizing efficiency by passing existing runtime objects to avoid redundant API lookups, and refining error handling during the draining phase to prevent log noise.
```go
for ord := desiredReplicas; ord < currentReplicas; ord++ {
	podName := fmt.Sprintf("%s-%d", workerStsName, ord)
	pod := &corev1.Pod{}
	if err := e.Client.Get(context.TODO(),
```
Using context.TODO() is generally discouraged in production code because it makes it harder to propagate cancellation and timeouts. Since SyncReplicas is part of a reconciliation loop that provides a context (wrapped in cruntime.ReconcileRequestContext), it is better to use that context here. If it is not directly accessible, consider refactoring to pass the context down rather than defaulting to context.TODO().
```go
func (e *AlluxioEngine) getWorkerRPCPort() int {
	runtime, err := e.getRuntime()
	if err != nil {
		return defaultWorkerRPCPort
	}
	if port, ok := runtime.Spec.Worker.Ports["rpc"]; ok && port > 0 {
		return port
	}
	return defaultWorkerRPCPort
}
```
The getWorkerRPCPort function calls e.getRuntime() which performs a redundant API lookup. Since the runtime object is already fetched in SyncReplicas (line 89), it should be passed as an argument to drainScalingDownWorkers and then to getWorkerRPCPort to improve efficiency.
```go
func (e *AlluxioEngine) getWorkerRPCPort(runtime *data.AlluxioRuntime) int {
	if port, ok := runtime.Spec.Worker.Ports["rpc"]; ok && port > 0 {
		return port
	}
	return defaultWorkerRPCPort
}
```

```go
return fmt.Errorf("workers not yet drained; scale-in to %d replicas will resume on next reconcile",
	runtime.Replicas())
```
Returning a generic error here will cause the controller to log this as a failure every time it reconciles while waiting for workers to drain. This can lead to noisy logs. It is better to use a specific error type that the caller can recognize to log at a lower level, or if the framework supports it, return a requeue result without an error.
Codecov Report

❌ Patch coverage is

```
@@            Coverage Diff             @@
##           master    #5805      +/-   ##
==========================================
- Coverage   58.17%   58.11%   -0.06%
==========================================
  Files         478      480       +2
  Lines       32485    32588     +103
==========================================
+ Hits        18899    18940      +41
- Misses      12042    12103      +61
- Partials     1544     1545       +1
```

☔ View full report in Codecov by Sentry.
…atefulSet (fluid-cloudnative#4193) Signed-off-by: Monika Jakhar <jakharmonika364@gmail.com>
Force-pushed 8c7de5c to b77d0d4
Ⅰ. Describe what this PR does
This PR implements a graceful decommissioning workflow for Alluxio workers during scale-in. It adds a new AdvancedStatefulSet feature gate that allows Fluid to leverage OpenKruise capabilities for finer pod lifecycle management. When enabled, workers are decommissioned and their cached data is migrated before the pods are terminated, ensuring cluster stability and data availability during scaling operations.
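A toy sketch of the gating described above: the new behavior only runs when the AdvancedStatefulSet gate is flipped on, and the gate defaults to disabled while the feature is Alpha. Fluid's real gate plumbing (e.g. via k8s.io/component-base/featuregate) will differ; the map and function names here are assumptions for illustration.

```go
package main

import "fmt"

// gates is a stand-in for a real feature-gate registry.
// AdvancedStatefulSet is Alpha, so it defaults to disabled.
var gates = map[string]bool{
	"AdvancedStatefulSet": false,
}

// enabled reports whether a named gate is switched on;
// unknown gates read as false.
func enabled(name string) bool { return gates[name] }

func main() {
	if enabled("AdvancedStatefulSet") {
		fmt.Println("graceful scale-down: decommission workers before termination")
	} else {
		// Default path until the operator opts in.
		fmt.Println("default StatefulSet scale-down") // prints: default StatefulSet scale-down
	}
}
```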
Ⅱ. Does this pull request fix one issue?
fixes #4193
Ⅲ. List the added test cases (unit test/integration test) if any, please explain if no tests are needed.
Ⅳ. Describe how to verify it
Ⅴ. Special notes for reviews
The feature is currently in Alpha and disabled by default. It provides the necessary infrastructure to support selective pod deletion in later phases.