DFBUGS-5946: Fix MaintenanceMode blocking unrelated storage backends during failover#748

Open
raaizik wants to merge 1 commit into red-hat-storage:release-4.21 from raaizik:dfbugs-6300

Conversation

raaizik commented Apr 20, 2026

The drcluster_mmode controller now checks MaintenanceMode necessity per storage backend (provisioner + targetID) rather than globally across all failover DRPCs. This prevents a stuck failover DRPC from blocking MaintenanceMode cleanup for unrelated storage backends, which was causing sync operations to freeze for applications not involved in the failover.

Previously, mmodeStillNeededByAnyFailoverDRPC() kept all MaintenanceModes active as long as any failover DRPC was incomplete. When a single failover got stuck (e.g., because of a missing rbd-mirror daemon), unrelated applications could no longer sync.

The refined implementation:

  • Renamed mmodeStillNeededByAnyFailoverDRPC() to mmodeStillNeededByFailoverDRPC(provisioner, targetID)
  • Added drpcUsesStorageBackend() helper to check if a DRPC uses a specific storage backend by examining VRG ProtectedPVCs
  • Updated pruneMModesActivations() to pass storage backend identifiers when checking if MaintenanceMode is still needed

This maintains the safety guarantees from PR RamenDR#2401 (preventing premature MaintenanceMode removal during active failovers) while being more granular to avoid blocking unrelated storage backends.

(cherry picked from commit 25d63cd)
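
To make the per-backend check concrete, here is a minimal Go sketch of the logic described above. It is not the code from this PR: the types (`StorageBackend`, `ProtectedPVC`, `DRPC`), their fields, the function signatures, and the example provisioner and target values are simplified assumptions made up for illustration, and the real controller works against Ramen's DRPC and VRG API objects. Only the three function names are borrowed from the description.

```go
package main

import "fmt"

// StorageBackend identifies a backend by provisioner and target ID.
// Hypothetical type, introduced only for this sketch.
type StorageBackend struct {
	Provisioner string
	TargetID    string
}

// ProtectedPVC is a simplified stand-in for a VRG protected PVC entry.
type ProtectedPVC struct {
	Name    string
	Backend StorageBackend
}

// DRPC is a simplified stand-in for a DRPlacementControl.
type DRPC struct {
	Name              string
	FailoverRequested bool
	FailoverComplete  bool
	ProtectedPVCs     []ProtectedPVC
}

// drpcUsesStorageBackend reports whether any protected PVC of the DRPC
// lives on the given backend.
func drpcUsesStorageBackend(drpc DRPC, provisioner, targetID string) bool {
	for _, pvc := range drpc.ProtectedPVCs {
		if pvc.Backend.Provisioner == provisioner && pvc.Backend.TargetID == targetID {
			return true
		}
	}
	return false
}

// mmodeStillNeededByFailoverDRPC reports whether an incomplete failover
// still depends on the given backend. Failovers on other backends are ignored.
func mmodeStillNeededByFailoverDRPC(drpcs []DRPC, provisioner, targetID string) bool {
	for _, drpc := range drpcs {
		if !drpc.FailoverRequested || drpc.FailoverComplete {
			continue
		}
		if drpcUsesStorageBackend(drpc, provisioner, targetID) {
			return true
		}
	}
	return false
}

// pruneMModesActivations returns the activations that must be kept;
// everything else can be cleaned up.
func pruneMModesActivations(active []StorageBackend, drpcs []DRPC) []StorageBackend {
	var kept []StorageBackend
	for _, b := range active {
		if mmodeStillNeededByFailoverDRPC(drpcs, b.Provisioner, b.TargetID) {
			kept = append(kept, b)
		}
	}
	return kept
}

func main() {
	backendA := StorageBackend{Provisioner: "rbd.csi.ceph.com", TargetID: "target-a"}
	backendB := StorageBackend{Provisioner: "rbd.csi.ceph.com", TargetID: "target-b"}

	drpcs := []DRPC{
		// Stuck failover: requested but never completed, uses backend A.
		{Name: "stuck-app", FailoverRequested: true,
			ProtectedPVCs: []ProtectedPVC{{Name: "pvc-1", Backend: backendA}}},
		// Unrelated application on backend B, not failing over.
		{Name: "healthy-app",
			ProtectedPVCs: []ProtectedPVC{{Name: "pvc-2", Backend: backendB}}},
	}

	kept := pruneMModesActivations([]StorageBackend{backendA, backendB}, drpcs)
	fmt.Println("MaintenanceModes kept:", kept)
}
```

In this sketch, a stuck failover on `target-a` keeps only that backend's MaintenanceMode active. The activation for `target-b` is pruned, so applications on it can resume syncing, while the guard against premature removal still holds for the backend that is actually failing over.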

Signed-off-by: raaizik <132667934+raaizik@users.noreply.github.com>

openshift-ci-robot added the jira/severity-important, jira/valid-reference, and jira/invalid-bug labels Apr 20, 2026

openshift-ci-robot commented Apr 20, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-6300](https://redhat.atlassian.net/browse/DFBUGS-6300), which is invalid:

  • expected the vulnerability to target either version "odf-4.21.3." or "openshift-odf-4.21.3.", but it targets "odf-4.21.2" instead
  • expected the bug to be in one of the following states: NEW, ASSIGNED, POST, but it is Verified instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci Bot commented Apr 20, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: raaizik

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci-robot commented Apr 20, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-6300](https://redhat.atlassian.net/browse/DFBUGS-6300), which is invalid:

  • expected the vulnerability to target either version "odf-4.21.3." or "openshift-odf-4.21.3.", but it targets "odf-4.21.2" instead
  • expected the bug to be in one of the following states: NEW, ASSIGNED, POST, but it is Verified instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

raaizik commented Apr 20, 2026

/jira refresh

openshift-ci-robot commented Apr 20, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-6300](https://redhat.atlassian.net/browse/DFBUGS-6300), which is invalid:

  • expected the vulnerability to target either version "odf-4.21.3." or "openshift-odf-4.21.3.", but it targets "odf-4.21.2" instead
  • expected the bug to be in one of the following states: NEW, ASSIGNED, POST, but it is Verified instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

raaizik commented Apr 20, 2026

/jira refresh

openshift-ci-robot commented Apr 20, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-6300](https://redhat.atlassian.net/browse/DFBUGS-6300), which is invalid:

  • expected the vulnerability to target either version "odf-4.21.3." or "openshift-odf-4.21.3.", but it targets "odf-4.21.2" instead
  • expected the bug to be in one of the following states: NEW, ASSIGNED, POST, but it is Verified instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

raaizik changed the title from "DFBUGS-6300: Fix MaintenanceMode blocking unrelated storage backends during failover" to "DFBUGS-6463: Fix MaintenanceMode blocking unrelated storage backends during failover" Apr 20, 2026
openshift-ci-robot added the jira/severity-moderate label and removed the jira/severity-important label Apr 20, 2026

raaizik commented Apr 20, 2026

/jira refresh

openshift-ci-robot commented Apr 20, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-6463](https://redhat.atlassian.net/browse/DFBUGS-6463), which is invalid:

  • expected the bug to target the "odf-4.21.3" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

openshift-ci-robot commented Apr 20, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-6463](https://redhat.atlassian.net/browse/DFBUGS-6463), which is invalid:

  • expected the bug to target the "odf-4.21.3" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

raaizik commented Apr 20, 2026

/jira refresh

openshift-ci-robot commented Apr 20, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-6463](https://redhat.atlassian.net/browse/DFBUGS-6463), which is invalid:

  • expected the bug to target the "odf-4.21.3" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

raaizik commented Apr 20, 2026

/jira refresh

openshift-ci-robot commented Apr 20, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-6463](https://redhat.atlassian.net/browse/DFBUGS-6463), which is invalid:

  • expected the bug to target the "odf-4.21.3" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

raaizik commented Apr 20, 2026

/jira refresh

openshift-ci-robot commented Apr 20, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-6463](https://redhat.atlassian.net/browse/DFBUGS-6463), which is invalid:

  • expected the bug to target the "odf-4.21.3" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

raaizik commented Apr 20, 2026

/jira refresh

openshift-ci-robot commented Apr 20, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-6463](https://redhat.atlassian.net/browse/DFBUGS-6463), which is invalid:

  • expected the bug to target the "odf-4.21.3" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

raaizik changed the title from "DFBUGS-6463: Fix MaintenanceMode blocking unrelated storage backends during failover" to "DFBUGS-6466: Fix MaintenanceMode blocking unrelated storage backends during failover" Apr 20, 2026

openshift-ci-robot commented Apr 20, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-6466](https://redhat.atlassian.net/browse/DFBUGS-6466), which is invalid:

  • expected the bug to target the "odf-4.21.3" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

openshift-ci-robot commented Apr 20, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-6466](https://redhat.atlassian.net/browse/DFBUGS-6466), which is invalid:

  • expected the bug to target the "odf-4.21.3" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

raaizik commented Apr 20, 2026

/jira refresh

openshift-ci-robot commented Apr 20, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-6466](https://redhat.atlassian.net/browse/DFBUGS-6466), which is invalid:

  • expected the bug to target the "odf-4.21.3" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

raaizik changed the title from "DFBUGS-6466: Fix MaintenanceMode blocking unrelated storage backends during failover" to "DFBUGS-6467: Fix MaintenanceMode blocking unrelated storage backends during failover" Apr 20, 2026

openshift-ci-robot commented Apr 20, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-6467](https://redhat.atlassian.net/browse/DFBUGS-6467), which is invalid:

  • expected the bug to target the "odf-4.21.3" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

raaizik commented Apr 20, 2026

/jira refresh

openshift-ci-robot commented Apr 20, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-6467](https://redhat.atlassian.net/browse/DFBUGS-6467), which is invalid:

  • expected the bug to target the "odf-4.21.3" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

raaizik commented Apr 20, 2026

/jira refresh

openshift-ci-robot commented Apr 20, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-6467](https://redhat.atlassian.net/browse/DFBUGS-6467), which is invalid:

  • expected the bug to target the "odf-4.21.3" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

raaizik commented Apr 20, 2026

/jira refresh

openshift-ci-robot commented Apr 20, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-6467](https://redhat.atlassian.net/browse/DFBUGS-6467), which is invalid:

  • expected the bug to target the "odf-4.21.3" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

raaizik commented Apr 20, 2026

/jira refresh

openshift-ci-robot commented Apr 20, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-6467](https://redhat.atlassian.net/browse/DFBUGS-6467), which is invalid:

  • expected the bug to target the "odf-4.21.3" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

@raghavendra-talur

@raaizik please debug the check artifacts and operator deployment failure

raaizik commented May 5, 2026

> @raaizik please debug the check artifacts and operator deployment failure

The "Deploy operator" failure is a deployment timeout and is not related to the code changes. The error is:

error: timed out waiting for the condition on deployments/ramen-hub-operator

The hub operator deployment didn't become ready in time, which is a common transient CI issue (image pull delays, resource constraints, etc.).
My changes only touch the MaintenanceMode controller in the dr-cluster operator, not the hub operator. All local tests passed (make, lint, unit tests).

raaizik changed the title from "DFBUGS-6467: Fix MaintenanceMode blocking unrelated storage backends during failover" to "DFBUGS-5946: Fix MaintenanceMode blocking unrelated storage backends during failover" May 11, 2026

openshift-ci-robot commented May 11, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-5946](https://redhat.atlassian.net/browse/DFBUGS-5946), which is invalid:

  • expected the bug to target the "odf-4.21.3" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

raaizik commented May 11, 2026

/jira refresh

openshift-ci-robot commented May 11, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-5946](https://redhat.atlassian.net/browse/DFBUGS-5946), which is invalid:

  • expected the bug to target the "odf-4.21.3" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

raaizik commented May 11, 2026

/jira refresh

openshift-ci-robot commented May 11, 2026

@raaizik: This pull request references [Jira Issue DFBUGS-5946](https://redhat.atlassian.net/browse/DFBUGS-5946), which is invalid:

  • expected the bug to target the "odf-4.21.3" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

3 participants