Improvements for das-up-to-nevents Script #50118

Merged
cmsbuild merged 1 commit into cms-sw:master from AdrianoDee:das_n_ev_improvements_feb26
Apr 1, 2026

@AdrianoDee
Contributor

Triggered by #50101, this PR proposes a few improvements to the das-up-to-nevents.py script, such as:

  • a proper logger for debugging;
  • a single das_query method for the queries;
  • limiting outputs to 1000 results when running in Jenkins;
  • a debug flag for debugging printouts (activated also in Jenkins);
  • allowing only runs with more than 40 lumis and skipping the first 20 lumis, to put some distance between the first event used and the very beginning of the run.
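
The lumi selection in the last bullet could be sketched roughly as below. This is only an illustration, not the code in the PR: the function name `select_lumis`, the logger name, the two constants, and the input layout (a dict mapping each run number to its list of lumi numbers) are all assumptions made for the example, as is treating "above 40" as a strict comparison.

```python
import logging

# Hypothetical logger name, used only in this sketch.
logger = logging.getLogger("das_up_to_nevents_sketch")

MIN_LUMIS_PER_RUN = 40  # assumed: a run must have more than this many lumis
SKIP_FIRST_LUMIS = 20   # assumed: lumis dropped from the start of each run


def select_lumis(lumis_by_run, min_lumis=MIN_LUMIS_PER_RUN, skip=SKIP_FIRST_LUMIS):
    """Keep only long-enough runs and drop the first `skip` lumis of each."""
    selected = {}
    for run, lumis in sorted(lumis_by_run.items()):
        ordered = sorted(lumis)
        if len(ordered) <= min_lumis:
            # Short runs are rejected entirely.
            logger.debug("Skipping run %s: only %d lumis", run, len(ordered))
            continue
        # Stay away from the very beginning of the run.
        selected[run] = ordered[skip:]
    return selected


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    example = {1: list(range(1, 31)), 2: list(range(1, 61))}
    print(select_lumis(example))
```

With the example input, run 1 (30 lumis) is dropped and run 2 keeps lumis 21 to 60.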

Changes to data wfs are expected, since the events used will be different.

PR validation:

Ran the data wfs (under the data_highstats matrix).

@AdrianoDee
Contributor Author

cc: @smuzaffar

@cmsbuild
Contributor

cmsbuild commented Feb 11, 2026

cms-bot internal usage

@AdrianoDee
Contributor Author

please test

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50118/48029

  • There are other open Pull requests which might conflict with changes you have proposed:

@cmsbuild
Contributor

A new Pull Request was created by @AdrianoDee for master.

It involves the following packages:

  • Configuration/PyReleaseValidation (pdmv)

@AdrianoDee, @DickyChant, @antoniovagnerini, @miquork can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @fabiocos, @makortel, @slomeo this is something you requested to watch as well.
@ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

cmd = f"dasgoclient"
# For cms-bot deterministic caching see cms-sw#50101
if "JENKINS_PREFIX" in os.environ:
    cmd = f"{cmd} --limit=1000 -unique"
Contributor

@AdrianoDee, I also need to update https://github.com/cms-sw/cms-bot/blob/master/das-utils/das_client so that it can properly cache the results of queries with -unique (such calls should end up with a different checksum). I will open a bot PR to accommodate this.
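
To illustrate the caching concern, here is a minimal sketch of a cache key that takes the query options into account, so that a query run with -unique does not collide with the same query run without it. The function `cache_key`, the whitespace normalization, and the choice of SHA-256 are assumptions for illustration only; this is not how the cms-bot das_client computes its checksum.

```python
import hashlib


def cache_key(query, options=()):
    """Hypothetical cache key: hash the query together with its sorted options."""
    # Collapse repeated whitespace so equivalent queries share a key.
    normalized = " ".join(query.split())
    # Sort options so their order on the command line does not matter.
    payload = "\n".join([normalized, *sorted(options)])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    q = "file dataset=/Example/Dataset/RAW"  # placeholder query, not from the PR
    print(cache_key(q))
    print(cache_key(q, ["-unique"]))
```

The two printed keys differ, so a cache indexed this way would store the -unique result separately.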

@cmsbuild
Contributor

-1

Failed Tests: RelVals-INPUT
Size: This PR adds an extra 24KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2b6b79/51259/summary.html
COMMIT: 0b21779
CMSSW: CMSSW_16_1_X_2026-02-10-2300/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/50118/51259/install.sh to create a dev area with all the needed externals and cmssw changes.

Failed RelVals-INPUT

  • 2023.0000001: 2023.0000001_RunZeroBias2023B_10k/step2_RunZeroBias2023B_10k.log
  • 2023.0010001: 2023.0010001_RunEGamma02023C_10k/step2_RunEGamma02023C_10k.log
  • 2022.0000001: 2022.0000001_RunZeroBias2022B_10k/step2_RunZeroBias2022B_10k.log

Comparison Summary

The workflows 2025.0010001, 2025.0000001, 2024.0070001, 2024.0060001, 2024.0050001, 2024.0040001, 2024.0030001, 2024.0010001, 2024.0000001, 2023.0020001, 2022.0030001 have different files in step1_dasquery.log than the ones found in the baseline. You may want to check and retrigger the tests if necessary. You can check it in the "files" directory in the results of the comparisons

Summary:

  • You potentially added 202 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 38872 differences found in the comparisons
  • DQMHistoTests: Total files compared: 52
  • DQMHistoTests: Total histograms compared: 4031484
  • DQMHistoTests: Total failures: 105582
  • DQMHistoTests: Total nulls: 365
  • DQMHistoTests: Total successes: 3925517
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 259.815 KiB( 51 files compared)
  • DQMHistoSizes: changed ( 2022.0030001 ): 19.594 KiB Hcal/DigiRunHarvesting
  • DQMHistoSizes: changed ( 2022.0030001 ): 0.516 KiB RPC/DCSInfo
  • DQMHistoSizes: changed ( 2022.0030001 ): 0.191 KiB SiStrip/MechanicalView
  • DQMHistoSizes: changed ( 2022.0030001 ): -0.066 KiB JetMET/SUSYDQM
  • DQMHistoSizes: changed ( 2023.0020001 ): 57.445 KiB Hcal/DigiRunHarvesting
  • DQMHistoSizes: changed ( 2023.0020001 ): 1.512 KiB RPC/DCSInfo
  • DQMHistoSizes: changed ( 2023.0020001 ): 0.059 KiB SiStrip/MechanicalView
  • DQMHistoSizes: changed ( 2023.0020001 ): 0.027 KiB JetMET/SUSYDQM
  • DQMHistoSizes: changed ( 2024.0000001 ): 18.906 KiB Hcal/DigiRunHarvesting
  • DQMHistoSizes: changed ( 2024.0000001 ): 0.469 KiB RPC/DCSInfo
  • DQMHistoSizes: changed ( 2024.0000001 ): ...
  • Checked 222 log files, 193 edm output root files, 52 DQM output files
  • TriggerResults: no differences found

Max Memory Comparisons exceeding threshold

@cms-sw/core-l2 , I found 4 workflow step(s) with memory usage exceeding the error threshold:

  • Error: Workflow 2023.0020001_RunJetMET02023D_10k step3 max memory diff 141.2 exceeds +/- 90.0 MiB
  • Error: Workflow 2024.0000001_RunZeroBias2024B_10k step3 max memory diff 218.5 exceeds +/- 90.0 MiB
  • Error: Workflow 2024.0030001_RunDisplacedJet2024E_10k step3 max memory diff 260.6 exceeds +/- 90.0 MiB
  • Error: Workflow 2025.0010001_RunJetMET02025C_10k step3 max memory diff 501.4 exceeds +/- 90.0 MiB

@AdrianoDee
Contributor Author

Eh, this was expected since the new inputs are now on tape. I'll have to create a rule for this.

@cmsbuild
Contributor

REMINDER @ftenchini, @mandrenguyen, @sextonkennedy: This PR was tested with cms-sw/cms-bot#2681, please check if they should be merged together

@AdrianoDee
Contributor Author

please test
(let's give it a retry, the files should be on disk now)

@cmsbuild
Contributor

-1

Failed Tests: RelVals RelVals-INPUT
Size: This PR adds an extra 20KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2b6b79/51595/summary.html
COMMIT: 0b21779
CMSSW: CMSSW_16_1_X_2026-02-25-1100/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/50118/51595/install.sh to create a dev area with all the needed externals and cmssw changes.

DAS Queries: The DAS query tests failed, see the summary page for details.

Failed RelVals

  • 2025.0000001: 2025.0000001_RunZeroBias2025B_10k/step2_RunZeroBias2025B_10k.log
  • 2025.0010001: DAS Error

Failed RelVals-INPUT

  • 2023.0000001: DAS Error
  • 2023.0010001: 2023.0010001_RunEGamma02023C_10k/step2_RunEGamma02023C_10k.log
  • 2022.0000001: DAS Error
Expand to see more relval errors ...

@AdrianoDee
Contributor Author

please test

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50118/48802

@cmsbuild
Contributor

Pull request #50118 was updated. @AdrianoDee, @DickyChant, @antoniovagnerini, @cmsbuild, @miquork can you please check and sign again.

@AdrianoDee
Contributor Author

please test

@cmsbuild
Contributor

cmsbuild commented Apr 1, 2026

+1

Size: This PR adds an extra 40KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2b6b79/52375/summary.html
COMMIT: ce951c7
CMSSW: CMSSW_16_1_X_2026-03-31-2300/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/50118/52375/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

The workflows 2025.0010001, 2025.0000002, 2024.0070001, 2024.0060001, 2024.0050001, 2024.0040001, 2024.0030001, 2024.0020001, 2024.0010001, 2024.0000001, 2023.0020001 have different files in step1_dasquery.log than the ones found in the baseline. You may want to check and retrigger the tests if necessary. You can check it in the "files" directory in the results of the comparisons

Summary:

  • You potentially removed 310 lines from the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 41483 differences found in the comparisons
  • DQMHistoTests: Total files compared: 52
  • DQMHistoTests: Total histograms compared: 3449834
  • DQMHistoTests: Total failures: 67
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3449747
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 40 files compared)
  • Checked 223 log files, 193 edm output root files, 52 DQM output files
  • TriggerResults: no differences found

Max Memory Comparisons exceeding threshold

@cms-sw/core-l2 , I found 9 workflow step(s) with memory usage exceeding the error threshold:

  • Error: Workflow 2023.0020001_RunJetMET02023D_10k step3 max memory diff 338.1 exceeds +/- 90.0 MiB
  • Error: Workflow 2024.0000001_RunZeroBias2024B_10k step3 max memory diff -91.4 exceeds +/- 90.0 MiB
  • Error: Workflow 2024.0010001_RunJetMET02024C_10k step3 max memory diff 96.3 exceeds +/- 90.0 MiB
  • Error: Workflow 2024.0020001_RunEGamma02024D_10k step3 max memory diff 91.7 exceeds +/- 90.0 MiB
  • Error: Workflow 2024.0030001_RunDisplacedJet2024E_10k step3 max memory diff 184.7 exceeds +/- 90.0 MiB
  • Error: Workflow 2024.0050001_RunBTagMu2024G_10k step3 max memory diff 102.0 exceeds +/- 90.0 MiB
  • Error: Workflow 2024.0070001_RunTau2024I_10k step3 max memory diff -125.0 exceeds +/- 90.0 MiB
  • Error: Workflow 2025.0010001_RunJetMET02025C_10k step2 max memory diff 102.7 exceeds +/- 90.0 MiB
  • Error: Workflow 2025.0010001_RunJetMET02025C_10k step3 max memory diff 245.1 exceeds +/- 90.0 MiB

@AdrianoDee
Contributor Author

@cms-sw/core-l2, the changes in memory are due to a different set of events used. I would have expected them to be negligible, but they do not seem worrisome either.

@AdrianoDee
Contributor Author

+pdmv

@cmsbuild
Contributor

cmsbuild commented Apr 1, 2026

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @mandrenguyen, @ftenchini (and backports should be raised in the release meeting by the corresponding L2)

@makortel
Contributor

makortel commented Apr 1, 2026

That's fine, it is what it is. The 100(?) events are still quite a small sample for avoiding fluctuations in the peak memory usage.

@mandrenguyen
Contributor

+1

@cmsbuild cmsbuild merged commit c03ece6 into cms-sw:master Apr 1, 2026
10 checks passed