DFBUGS-5946: Fix MaintenanceMode blocking unrelated storage backends during failover #748
raaizik wants to merge 1 commit
The drcluster_mmode controller now checks MaintenanceMode necessity per storage backend (provisioner + targetID) rather than globally across all failover DRPCs. This prevents a stuck failover DRPC from blocking MaintenanceMode cleanup for unrelated storage backends, which was causing sync operations to freeze for applications not involved in the failover.

Previously, mmodeStillNeededByAnyFailoverDRPC() would keep all MaintenanceModes active if any failover DRPC was incomplete. When one failover encountered issues (e.g., a missing rbd-mirror daemon), this prevented unrelated applications from syncing.

The refined implementation:
- Renamed mmodeStillNeededByAnyFailoverDRPC() to mmodeStillNeededByFailoverDRPC(provisioner, targetID)
- Added drpcUsesStorageBackend() helper to check if a DRPC uses a specific storage backend by examining VRG ProtectedPVCs
- Updated pruneMModesActivations() to pass storage backend identifiers when checking if MaintenanceMode is still needed

This maintains the safety guarantees from PR RamenDR#2401 (preventing premature MaintenanceMode removal during active failovers) while being granular enough to avoid blocking unrelated storage backends.

Signed-off-by: raaizik <132667934+raaizik@users.noreply.github.com>
(cherry picked from commit 25d63cd)
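The per-backend check described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `storageBackend` and `drpc` types, their fields, and the function signatures are simplified assumptions, and the real helper inspects VRG ProtectedPVCs rather than a precomputed list. Only the function names are taken from the description.

```go
package main

import "fmt"

// storageBackend identifies a backend by provisioner plus targetID.
// Hypothetical type, introduced only for this sketch.
type storageBackend struct {
	Provisioner string
	TargetID    string
}

// drpc is a simplified, hypothetical stand-in for a failover DRPC.
// Backends models the storage backends of the PVCs its VRG protects.
type drpc struct {
	Name             string
	FailoverComplete bool
	Backends         []storageBackend
}

// drpcUsesStorageBackend reports whether the DRPC protects any PVC on
// the given backend (assumed signature).
func drpcUsesStorageBackend(d drpc, provisioner, targetID string) bool {
	for _, b := range d.Backends {
		if b.Provisioner == provisioner && b.TargetID == targetID {
			return true
		}
	}
	return false
}

// mmodeStillNeededByFailoverDRPC reports whether any incomplete failover
// DRPC still uses the given backend. Incomplete failovers on other
// backends no longer keep this backend's MaintenanceMode active.
func mmodeStillNeededByFailoverDRPC(drpcs []drpc, provisioner, targetID string) bool {
	for _, d := range drpcs {
		if d.FailoverComplete {
			continue
		}
		if drpcUsesStorageBackend(d, provisioner, targetID) {
			return true
		}
	}
	return false
}

func main() {
	// Provisioner and target names below are made up for the example.
	drpcs := []drpc{
		{Name: "stuck-app", FailoverComplete: false,
			Backends: []storageBackend{{"provisioner-a", "target-1"}}},
		{Name: "healthy-app", FailoverComplete: true,
			Backends: []storageBackend{{"provisioner-b", "target-2"}}},
	}

	// The stuck failover keeps MaintenanceMode for its own backend.
	fmt.Println(mmodeStillNeededByFailoverDRPC(drpcs, "provisioner-a", "target-1")) // true
	// The unrelated backend can be pruned.
	fmt.Println(mmodeStillNeededByFailoverDRPC(drpcs, "provisioner-b", "target-2")) // false
}
```

With a global check, the first DRPC alone would have kept MaintenanceMode active for both backends; keying the check on (provisioner, targetID) limits the effect to the backend that the stuck failover actually uses.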
@raaizik: This pull request references [Jira Issue DFBUGS-6300](https://redhat.atlassian.net/browse/DFBUGS-6300), which is invalid:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: raaizik
The full list of commands accepted by this bot can be found here.
/jira refresh
@raaizik: This pull request references [Jira Issue DFBUGS-6463](https://redhat.atlassian.net/browse/DFBUGS-6463), which is invalid:
@raaizik: This pull request references [Jira Issue DFBUGS-6466](https://redhat.atlassian.net/browse/DFBUGS-6466), which is invalid:
@raaizik: This pull request references [Jira Issue DFBUGS-6467](https://redhat.atlassian.net/browse/DFBUGS-6467), which is invalid:
@raaizik please debug the
The "Deploy operator" failure is a deployment timeout - not related to the code changes. The error is:
@raaizik: This pull request references [Jira Issue DFBUGS-5946](https://redhat.atlassian.net/browse/DFBUGS-5946), which is invalid: