Migrate daily.yml to self-hosted ARC runners for Daily and Weekly runs#3681
Migrate daily.yml to self-hosted ARC runners for Daily and Weekly runs#3681roshkhatri wants to merge 1 commit into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughUpdated .github/workflows/daily.yml to conditionally select Valkey runners for scheduled/manual runs and add explicit "Install build dependencies" steps with standardized ChangesConditional Runner Selection and Build Dependency Standardization
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
.github/workflows/daily.yml (2)
172-210:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAdd the missing package/bootstrap step on these ARC jobs.
These jobs now take the self-hosted path for
schedule/workflow_dispatch, but unlike the other migrated Ubuntu jobs they never run an initialapt-get update+ toolchain/dev-package install before./configure/make. On the stock ARC runner image, that makes the migrated path fail even if the GitHub-hosted path still passes.Suggested fix
+ - name: Install build dependencies + run: | + sudo apt-get update + sudo apt-get install -y build-essential autoconf automake pkg-config libssl-dev tcl tcl-dev python3Also applies to: 1143-1185, 1302-1344
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/daily.yml around lines 172 - 210, The ARC job paths are missing a bootstrap package step so the self-hosted runner fails before building; add a new step (e.g., name: bootstrap packages) before "Install libbacktrace" / before the run that does "cd libbacktrace && ./configure && make && sudo make install" to run apt-get update and install build/dev toolchain packages (at minimum: apt-get update && apt-get install -y build-essential autoconf automake libtool pkg-config gcc g++ make python3) so ./configure/make and the later "make all-with-unit-tests" step succeed; apply the same insertion for the other ARC job blocks referenced (the ranges around the other job copies).
1454-1493:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDo not route the LTTng job to ARC runners—it requires host kernel module support.
This migration moves scheduled/manual LTTng runs onto ARC, but the job still installs
lttng-modules-dkms. That package requires host kernel headers, the/lib/modulesdirectory, and module-loading capabilities (modprobe,depmod) to build and load kernel modules. Standard Kubernetes runner pods do not provide these. DKMS post-install scripts will fail because containers cannot load modules into the host kernel, and even with privileged pod configuration, the job would fail without kernel headers matching the EKS host kernel version. This is an architectural incompatibility, not a configuration issue.Revert
runs-ontoubuntu-latestfor this job:Fix
- runs-on: ${{ (github.event_name == 'schedule' || github.event_name == 'workflow_dispatch') && 'valkey-x64' || 'ubuntu-latest' }} + runs-on: ubuntu-latest🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/daily.yml around lines 1454 - 1493, The job mistakenly targets ARC runners via the conditional runs-on expression; since the Build libbacktrace step installs lttng-modules-dkms (and uses modprobe/depmod/kernel headers), change the job's runs-on to a non-ARC host (set runs-on: 'ubuntu-latest') so the DKMS/module install can run on a real VM, keep the Install libbacktrace and Build libbacktrace steps as-is (they rely on lttng-modules-dkms), and remove the complex ternary expression currently assigned to runs-on to ensure this job does not run on ARC.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In @.github/workflows/daily.yml:
- Around line 172-210: The ARC job paths are missing a bootstrap package step so
the self-hosted runner fails before building; add a new step (e.g., name:
bootstrap packages) before "Install libbacktrace" / before the run that does "cd
libbacktrace && ./configure && make && sudo make install" to run apt-get update
and install build/dev toolchain packages (at minimum: apt-get update && apt-get
install -y build-essential autoconf automake libtool pkg-config gcc g++ make
python3) so ./configure/make and the later "make all-with-unit-tests" step
succeed; apply the same insertion for the other ARC job blocks referenced (the
ranges around the other job copies).
- Around line 1454-1493: The job mistakenly targets ARC runners via the
conditional runs-on expression; since the Build libbacktrace step installs
lttng-modules-dkms (and uses modprobe/depmod/kernel headers), change the job's
runs-on to a non-ARC host (set runs-on: 'ubuntu-latest') so the DKMS/module
install can run on a real VM, keep the Install libbacktrace and Build
libbacktrace steps as-is (they rely on lttng-modules-dkms), and remove the
complex ternary expression currently assigned to runs-on to ensure this job does
not run on ARC.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: e0e026da-0844-4a17-9131-76fa0ff1c6ac
📒 Files selected for processing (1)
.github/workflows/daily.yml
67554d1 to
2b2f96f
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/daily.yml:
- Around line 1182-1187: The workflow runs apt-get install in the "Install
gtest" step before refreshing the package index, which can fail on fresh
runners; update the package index prior to any apt installs by adding or moving
a sudo apt-get update into the "Install gtest" step (or make a shared
preliminary step that runs sudo apt-get update) so that both the "Install gtest"
and "Install build dependencies" steps run after the update.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: ec9ffaf1-4751-4943-ac10-f4f9945be7ec
📒 Files selected for processing (1)
.github/workflows/daily.yml
2b2f96f to
3651c12
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
.github/workflows/daily.yml (1)
106-109: 🏗️ Heavy liftConsider extracting repeated Ubuntu bootstrap into a shared action
The same
apt-get update+ dependency install block is duplicated across many jobs. A composite action/reusable workflow would reduce drift and future maintenance overhead when package lists change.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/daily.yml around lines 106 - 109, The "Install build dependencies" step duplicates an apt-get bootstrap across jobs; extract that shell block into a reusable unit (either a composite action under .github/actions/bootstrap or a reusable workflow) and replace each job step that runs the apt-get update/install commands with a single "uses: ./.github/actions/bootstrap" (or appropriate workflow call) invocation; ensure the new composite action exposes any needed inputs (e.g., extra packages) and preserves the original behavior of the "Install build dependencies" step so all jobs call the shared action instead of repeating the shell block.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/daily.yml:
- Line 75: The runs-on expression currently sends all workflow_dispatch runs to
self-hosted labels like 'valkey-x64', which can deadlock on forks; update the
expression to only pick the self-hosted label when either event is 'schedule' or
when event is 'workflow_dispatch' AND the run is in the canonical repository
(e.g. check github.repository == 'OWNER/REPO'), otherwise fall back to
'ubuntu-latest' — specifically change the existing runs-on expression (the line
that inspects github.event_name and returns 'valkey-x64') to something like:
(github.event_name == 'schedule' || (github.event_name == 'workflow_dispatch' &&
github.repository == 'OWNER/REPO')) && 'valkey-x64' || 'ubuntu-latest', and
apply the same guarded pattern to the other migrated runs-on expressions (e.g.
'valkey-arm64', 'valkey-x64-largemem', etc.).
---
Nitpick comments:
In @.github/workflows/daily.yml:
- Around line 106-109: The "Install build dependencies" step duplicates an
apt-get bootstrap across jobs; extract that shell block into a reusable unit
(either a composite action under .github/actions/bootstrap or a reusable
workflow) and replace each job step that runs the apt-get update/install
commands with a single "uses: ./.github/actions/bootstrap" (or appropriate
workflow call) invocation; ensure the new composite action exposes any needed
inputs (e.g., extra packages) and preserves the original behavior of the
"Install build dependencies" step so all jobs call the shared action instead of
repeating the shell block.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: d36950e9-ab1e-4a76-9758-f17ba58fa05e
📒 Files selected for processing (1)
.github/workflows/daily.yml
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## unstable #3681 +/- ##
============================================
+ Coverage 76.66% 76.71% +0.05%
============================================
Files 162 162
Lines 80656 80662 +6
============================================
+ Hits 61831 61876 +45
+ Misses 18825 18786 -39 🚀 New features to boost your workflow:
|
sarthakaggarwal97
left a comment
There was a problem hiding this comment.
Thanks @roshkhatri I think this PR is important to support patch and CVE releases as we often see constraints on our github runners. Left some comments on of the steps, but are probably applicable to all.
| with: | ||
| repository: ${{ inputs.use_repo || github.event.inputs.use_repo || github.repository }} | ||
| ref: ${{ inputs.use_git_ref || github.event.inputs.use_git_ref || github.ref }} | ||
| - name: Install build dependencies |
There was a problem hiding this comment.
we don't need this step for ubuntu-latest
| - name: Install build dependencies | ||
| run: | | ||
| sudo apt-get update | ||
| sudo apt-get install -y wget build-essential autoconf automake pkg-config libssl-dev tcl tcl-dev python3 |
There was a problem hiding this comment.
tcl is installed below already
| jobs: | ||
| test-ubuntu-jemalloc: | ||
| runs-on: ubuntu-latest | ||
| runs-on: ${{ (github.event_name == 'schedule' || github.event_name == 'workflow_dispatch') && 'valkey-x64' || 'ubuntu-latest' }} |
There was a problem hiding this comment.
what is the behaviour for run-extra-tests here
There was a problem hiding this comment.
We wouldn't use self hosted runners for testing PR. I still have to make a change where it only runs for release vversions on PR from trusted members which are committers
|
@roshkhatri what is the work required to set these runners up for the valkey project. We talked offline and you had tested this in your personal account |
I would move this repo to valkey, https://github.com/roshkhatri/valkey-arc-runners and run the steps in README for setup for the valkey's AWS account |
3651c12 to
5f7bd0b
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/daily.yml:
- Around line 106-110: The YAML block under the step named "Install build
dependencies" has a misindented line after the run: | scalar (one leading space)
which breaks parsing; fix by aligning all lines inside the block scalar so each
command line is indented consistently (e.g., two spaces or the same indentation
level as the first command) under run: |; apply the same correction to every
identical run: | block in the workflow (the other occurrences of the "Install
build dependencies" run blocks) so each multi-line shell script has uniform
indentation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 4717b73b-53c1-4915-866f-a0afa42a9bf1
📒 Files selected for processing (1)
.github/workflows/daily.yml
5f7bd0b to
e8c871b
Compare
…nly) Use conditional runs-on to route jobs to self-hosted ARC runners only for scheduled (daily/weekly) and manual dispatch runs. PR-triggered runs continue using GitHub-hosted ubuntu-latest. Changes: - Conditional runs-on: schedule/dispatch -> self-hosted, PR -> ubuntu-latest - Add "Install build dependencies" step for self-hosted (minimal image) - Add -y flag to all apt-get install commands for non-interactive CI - Container/emulation jobs (alpine, rpm-distros, freebsd, s390x) always ubuntu-latest Runner pools: - valkey-x64: 4 vCPU, 16 GiB (standard builds, TLS, sanitizers, valgrind) - valkey-x64-largemem: 8 vCPU, 32 GiB (large-memory sanitizer/valgrind) - valkey-arm64: 4 vCPU, 16 GiB (native ARM64) Infra: EKS + Karpenter + ARC 0.14.1 (valkey-io/valkey-arc-runners) Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
09fbc11 to
9cbf5ca
Compare
ci: migrate daily.yml to self-hosted ARC runners (schedule/dispatch only)
Use conditional
runs-onto route CI jobs to self-hosted EKS runners for scheduled (daily/weekly) and manual dispatch runs. PR-triggered runs continue using GitHub-hostedubuntu-latest.Runner selection logic
Runner pools
valkey-x64valkey-x64-largememvalkey-arm64Jobs that always stay on
ubuntu-latesttest-freebsdtest-s390xtest-ubuntu-jemalloc-fortifycontainer: ubuntu:nobletest-rpm-distros-*test-alpine-*container: alpineInfrastructure
gha-runner-scale-set)ghcr.io/actions/actions-runner:latest(no custom image)Scaling performance (observed)