
ci(docker): retry dnf install on transient mirror failures #6003

Merged
Fedr merged 1 commit into master from ci/dnf-retry-rockylinux on Apr 28, 2026

@Fedr Fedr commented Apr 28, 2026

Summary

Wrap every dnf -y install (and the dnf config-manager --add-repo for the gh-cli repo) in both docker/rockylinux8-vcpkgDockerfile and docker/rockylinux9-vcpkgDockerfile with the existing scripts/retry.sh helper (3 attempts, 30s backoff). Move COPY scripts/retry.sh /usr/local/bin/retry.sh to the top of each build/production stage so it is available before the first dnf install runs.
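
For readers who have not seen #5985, the wrapping described above can be sketched roughly as follows. This is a hypothetical illustration of a "3 attempts, 30s backoff" wrapper, not the actual contents of scripts/retry.sh; the function name `retry` and the `RETRY_ATTEMPTS` / `RETRY_DELAY` variables are assumptions made for this example.

```shell
# Hypothetical sketch of a retry wrapper; NOT the actual scripts/retry.sh from #5985.
# RETRY_ATTEMPTS and RETRY_DELAY are illustrative names; their defaults mirror
# the "3 attempts, 30s backoff" described in the summary.
retry() {
    _retry_max="${RETRY_ATTEMPTS:-3}"
    _retry_delay="${RETRY_DELAY:-30}"
    _retry_n=1

    # Accept an optional leading "--" so the call reads: retry -- <command> <args...>
    if [ "${1:-}" = "--" ]; then
        shift
    fi

    while :; do
        # Run the wrapped command; stop immediately on success.
        "$@" && return 0
        _retry_status=$?

        # Out of attempts: report and propagate the last exit status.
        if [ "$_retry_n" -ge "$_retry_max" ]; then
            echo "retry: giving up after $_retry_n attempts: $*" >&2
            return "$_retry_status"
        fi

        echo "retry: attempt $_retry_n/$_retry_max failed (exit $_retry_status), sleeping ${_retry_delay}s" >&2
        sleep "$_retry_delay"
        _retry_n=$((_retry_n + 1))
    done
}
```

With the helper copied to /usr/local/bin/retry.sh as described above, a Dockerfile step then takes the form `RUN retry.sh -- dnf -y install ...`. A fixed delay is enough for this failure mode, since the goal is only to wait out a short mirror re-sync window.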

Motivation

Run 25045542370 job 73359583376 on PR #5998 failed in the production stage at rockylinux8-vcpkgDockerfile:69 with:

Error: Failed to download metadata for repo 'baseos': Yum repo downloading error:
  repodata/...-primary.xml.gz    - Cannot download, all mirrors were already tried without success
  repodata/...-filelists.xml.gz  - Cannot download, all mirrors were already tried without success
  repodata/...-updateinfo.xml.gz - Cannot download, all mirrors were already tried without success

Every Rocky Linux 8.10 aarch64 baseos mirror returned 404 for those specific repodata files (Bahnhof, FAU, Intermax, hostico, ctrliq GCP, ftp.sh.cvut.cz, vhosting-it, melbourne, uv.es, ...). This is the classic Rocky Linux "stale repomd" sync gap: repomd.xml advertises checksums for files that the mirrors have not yet picked up, and the gap clears after a few minutes once the mirrors re-sync.

#5985 already wraps ./vcpkg install with retry but does not cover dnf install, so this failure mode runs once and exits 1 with no retry. The same Dockerfiles also have a dnf-command(config-manager) + dnf -y install gh block in the production stage that hits cli.github.com; it is wrapped here as well, since it is exposed to the same kind of one-shot CDN flake.

Implementation

  • docker/rockylinux8-vcpkgDockerfile

    • Move COPY scripts/retry.sh /usr/local/bin/retry.sh to the top of the build stage (before the first dnf install).
    • Add COPY scripts/retry.sh /usr/local/bin/retry.sh to the production stage (it was previously not present in that stage at all).
    • Prefix each dnf -y install ... with retry.sh --; dnf clean all touches only the local cache, so it stays unwrapped.
  • docker/rockylinux9-vcpkgDockerfile

    • Same restructuring.

The retry helper itself (scripts/retry.sh, added in #5985) is unchanged.
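
One way to check that no install step was missed is a small grep-based scan of each Dockerfile. The sketch below is a hypothetical aid and not part of this PR; the function name `find_unwrapped_dnf` is an assumption, and it only recognizes a wrapper that sits on the same line as the dnf call.

```shell
# Hypothetical lint sketch, not part of the PR: print every line of the given
# Dockerfile that runs "dnf -y install" or "dnf config-manager" without a
# retry.sh prefix on the same line. Prints nothing when all calls are wrapped.
find_unwrapped_dnf() {
    # First grep selects the network-facing dnf calls (with line numbers);
    # second grep drops the ones already wrapped. "|| true" keeps the exit
    # status at 0 when there is nothing to report.
    grep -nE 'dnf (-y install|config-manager)' "$1" | grep -v 'retry\.sh' || true
}
```

Running `find_unwrapped_dnf docker/rockylinux8-vcpkgDockerfile` and `find_unwrapped_dnf docker/rockylinux9-vcpkgDockerfile` from the repository root should print nothing if every such call is wrapped; dnf clean all is intentionally not matched.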

Wrap every `dnf -y install` (and `dnf config-manager --add-repo`) in
both `docker/rockylinux8-vcpkgDockerfile` and
`docker/rockylinux9-vcpkgDockerfile` with the existing `retry.sh`
helper (3 attempts, 30s backoff). Move the `COPY scripts/retry.sh` to
the top of each build/production stage so it is available before the
first `dnf install` runs.

Run https://github.com/MeshInspector/MeshLib/actions/runs/25045542370/job/73359583376
on PR #5998 failed in the production stage at line 69 with:

    Error: Failed to download metadata for repo 'baseos':
    repodata/...-primary.xml.gz - Cannot download, all mirrors were
    already tried without success
    repodata/...-filelists.xml.gz - Cannot download, ...
    repodata/...-updateinfo.xml.gz - Cannot download, ...

Every Rocky Linux 8.10 (aarch64) baseos mirror returned 404 for those
specific repodata files — the classic "stale repomd" sync gap that
clears after a few minutes once mirrors re-sync.

#5985 already wraps `./vcpkg install` with retry but does not cover
`dnf install`, so this failure mode hits without retry. Same Dockerfile
also has the gh-cli install RUN block that hits cli.github.com — wrapped
for the same reason.
@Fedr Fedr requested a review from oitel April 28, 2026 13:42
@Fedr Fedr merged commit 6fa88df into master Apr 28, 2026
36 checks passed
@Fedr Fedr deleted the ci/dnf-retry-rockylinux branch April 28, 2026 14:25