ceph-perf: tighten PR regression gate#2566
Conversation
25b9570 to
18e33c0
Compare
There was a problem hiding this comment.
Pull request overview
Updates the ceph-perf pull request performance job to behave as a stricter release-style regression gate by pinning the benchmark definition to ceph-main and surfacing regression status directly on GitHub PRs.
Changes:
- Parameterize CBT workload selection in the
run-cbtmacro and pin the benchmark YAML toceph-main. - Add/refresh GitHub PR status context strings to make perf regression results visible in the PR UI.
- Re-enable the perf PR job template (
disabled: false) and clarify the job’s purpose via description.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
18e33c0 to
c517923
Compare
Matan-B
left a comment
There was a problem hiding this comment.
Few pieces still missing here. I’m not sure this would work with older compilers and exiting CBT support. I would suggest trying to run some of the tests locally first to see if the numbers make sense before we start blocking PRs with this job.
Is this check required once enabled?
We should probably also add some details to Crimson’s dev guide, how the test works, when a PR is blocked (under which cases) and where the results are stored. This should help devs mitigate failures.
| --cmake-args "-DCMAKE_CXX_COMPILER=$cxx_compiler -DCMAKE_C_COMPILER=$c_compiler -DCMAKE_BUILD_TYPE=Release -DWITH_CRIMSON=ON -DWITH_TESTS=OFF" \ | ||
| vstart-base crimson-osd | ||
| src/script/run-cbt.sh --build-dir $PWD/build --source-dir $PWD --cbt ${{WORKSPACE}}/cbt -a $archive_dir src/test/crimson/cbt/radosbench_4K_read.yaml | ||
| src/script/run-cbt.sh --build-dir $PWD/build --source-dir $PWD --cbt ${{WORKSPACE}}/cbt -a $archive_dir {benchmark-yaml} |
There was a problem hiding this comment.
run-cbt is outdated with crimson usage and still relies on cyanstore, which is no longer relevant for our purposes.
c517923 to
7359ce6
Compare
There was a problem hiding this comment.
I'm not sure I understand if this is ready for review or was tested locally.
If it was, can you please share examples of how the results comparison work?
The clang version that was already mentioned in the previous review would still FTBFS.
CBT is still not updated for the relevant usage with Crimson.
Under which cases a PR is considered blocked (dev docs)?
It is also not clear what the strategy is here - is this intended to run only against Crimson PRs?
Please make sure there is a solid plan for how this is going to be implemented. Starting with a draft PR is also a good approach, but please be clear about the draft stage (what is missing, what is not yet tested/verified and what the next steps are or what is working so far). Otherwise, this PR does not seem ready for review.
My suggestion here would be to enable this check as a non-required PR check and iterate on it until it can be re-enabled as a required one. Alternatively, running parts of the jobs locally to make sure all is working as expected.
I would be happy to help in any way if you have any questions.
| cxx_compiler=g++ | ||
| c_compiler=gcc | ||
| for i in $(seq 15 -1 10); do | ||
| for i in $(seq 18 -1 12); do |
There was a problem hiding this comment.
Crimson would FTBFS with clang less than 19
| fi | ||
| export WITH_CRIMSON=true | ||
| timeout 7200 src/script/run-make.sh \ | ||
| --cmake-args "-DCMAKE_CXX_COMPILER=$cxx_compiler -DCMAKE_C_COMPILER=$c_compiler -DCMAKE_BUILD_TYPE=Release -DWITH_CRIMSON=ON -DWITH_TESTS=OFF" \ |
There was a problem hiding this comment.
We should be able to disable more components if we would like to have a minimal build.
7359ce6 to
94866b0
Compare
This describes the perf PR job as a release regression check, pin the benchmark definition to ceph-main, and surface the perf regression status in GitHub. This also reenables the ceph-perf-pull-request Fixes: https://tracker.ceph.com/issues/75665 Signed-off-by: Kautilya Tripathi <kautilya.tripathi@ibm.com>
94866b0 to
064d528
Compare
| cmake_args="-DCMAKE_CXX_COMPILER=$cxx_compiler -DCMAKE_C_COMPILER=$c_compiler -DCMAKE_BUILD_TYPE=Release -DWITH_CRIMSON=ON -DWITH_TESTS=OFF $CEPH_PERF_EXTRA_CMAKE_ARGS" | ||
| timeout 7200 src/script/run-make.sh \ | ||
| --cmake-args "$cmake_args" \ | ||
| vstart-base |
There was a problem hiding this comment.
why did you drop crimson-osd from the build target list? please note vstart-base alone does not build crimson-osd.
| - run-cbt: | ||
| src-dir: "ceph-pr" | ||
| osd-flavor: '{osd-flavor}' | ||
| benchmark-yaml: "$WORKSPACE/ceph-pr/src/test/crimson/cbt/radosbench_4K_read.yaml" |
There was a problem hiding this comment.
i think both runs should use the same yaml file, so that it's an apple-to-apple comparison.
| timeout 7200 src/script/run-make.sh --cmake-args "-DCMAKE_BUILD_TYPE=Release -DWITH_TESTS=OFF" vstart-base | ||
| src/script/run-cbt.sh --build-dir $PWD/build --source-dir $PWD --cbt ${{WORKSPACE}}/cbt -a $archive_dir src/test/crimson/cbt/radosbench_4K_read.yaml --classical | ||
| fi | ||
| export WITH_CRIMSON=true |
There was a problem hiding this comment.
does crimson require a certain version of gcc? i am asking because, if clang-19..22 is not installed, cxx_compiler stays g++ . and IIRC, seastar is only compatible with the latest two major versions of GCC and Clang.
| case $ID in | ||
| debian|ubuntu) | ||
| sudo env DEBIAN_FRONTEND=noninteractive apt-get install -y python3-yaml python3-lxml python3-prettytable clang-12 | ||
| sudo env DEBIAN_FRONTEND=noninteractive apt-get install -y python3-yaml python3-lxml python3-prettytable |
There was a problem hiding this comment.
clang-12 is dropped, but do we need to install a updated clang here?
This describes the perf PR job as a release regression check, pin the benchmark definition to ceph-main, and surface the perf regression status in GitHub. This also reenables the ceph-perf-pull-request
Fixes: https://tracker.ceph.com/issues/75665