Particle Flow cluster validation in ECAL #50292

Open
bfonta wants to merge 64 commits into cms-sw:master from cms-ngt-hlt:ParticleFlowECALClusterValidation

@bfonta
Contributor

bfonta commented Mar 3, 2026

PR description:

This PR introduces Particle Flow (PF) cluster validation for ECAL at the HLT. It is the first in a series of efforts to develop a central tool for PF validation at the HLT.

Content
  • Definition of low-level metrics (efficiencies and fake, merge and split rates) and higher-level metrics (responses and resolutions), used to validate the reconstruction of a given clustering algorithm at DQM level.
  • Proposal of new HGCAL-inspired association scores for ECAL
    • SimClusters to RecoClusters and vice-versa
    • CaloParticles to RecoClusters and vice-versa
  • Implementation of a new and efficient view for hits, fractions and energies of simulated hits within a SimCluster, for fast detector-dependent retrieval
  • Large quantity of 1D and 2D histograms, including projections, across a wide range of variables: energy, $p_T$, $\eta$, $\phi$, hit multiplicity in clusters, ...
  • Additional validation plots for digis and hits in ECAL for extra checks
  • Development of new static and interactive event displays
  • Implementation of a zero-copy view for SimCluster energies and fractions
  • Clamping of the LC->SC and SC->LC association scores to the range [0, 1], to avoid floating-point precision errors that lead to scores above 1
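
The clamping in the last item can be illustrated with a minimal sketch. It assumes an energy-weighted score of the kind commonly used in HGCAL validation, where 0 means a perfect match and 1 means no shared energy. The exact formula used in this PR is not stated in the description, and the function name `sim_to_reco_score` and its arguments are hypothetical.

```python
def sim_to_reco_score(sim_fractions, reco_fractions, hit_energies):
    """Energy-weighted association score for one (sim, reco) cluster pair.

    Assumed HGCAL-style definition (not taken from this PR):
        score = sum_h min((f_reco - f_sim)^2, f_sim^2) * E_h^2
                / sum_h f_sim^2 * E_h^2
    The inputs are per-hit lists of equal length.
    """
    numerator = 0.0
    denominator = 0.0
    for f_sim, f_reco, energy in zip(sim_fractions, reco_fractions, hit_energies):
        weight = energy * energy
        numerator += min((f_reco - f_sim) ** 2, f_sim ** 2) * weight
        denominator += f_sim ** 2 * weight
    if denominator == 0.0:
        # No simulated energy to compare against: treat as "not associated".
        return 1.0
    # Clamp so that rounding in the accumulation can never push the score
    # outside [0, 1].
    return min(max(numerator / denominator, 0.0), 1.0)


# Identical fractions, no overlap at all, and one of two equal hits shared.
perfect = sim_to_reco_score([1.0, 0.5], [1.0, 0.5], [2.0, 1.0])
disjoint = sim_to_reco_score([1.0, 1.0], [0.0, 0.0], [2.0, 1.0])
half = sim_to_reco_score([1.0, 1.0], [1.0, 0.0], [1.0, 1.0])
```

Mathematically the `min` term already bounds the ratio by 1; the clamp only guards against the case where numerator and denominator are accumulated with slightly different rounding, which is the situation the item above describes.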
Connection to previous PRs
  • Our work led to the discovery of an important bug in the CloseBy gun producer (#50241)
  • This PR exploits the templating of TICL associators to support PFClusters, on top of CaloClusters (#48995)
Instructions
  • Single electron workflow: use runTheMatrix.py -l 36136.0 -w upgrade -j0 and replace the default DoubleElectronFlatPt1p5To8_cfi by SingleElectronFlatPt2To100_cfi:
cmsDriver.py SingleElectronFlatPt2To100_cfi -s GEN,SIM -n 10 --conditions auto:phase2_realistic_T35 --beamspot DBrealisticHLLHC --datatier GEN-SIM --eventcontent FEVTDEBUG --geometry ExtendedRun4D125 --era Phase2C22I13M9 --relval 9000,100 --fileout file:step1.root

cmsDriver.py step2 -s DIGI:pdigi_valid,L1TrackTrigger,L1,L1P2GT,DIGI2RAW,HLT:@relvalRun4 --conditions auto:phase2_realistic_T35 --datatier GEN-SIM-DIGI-RAW -n 10 --eventcontent FEVTDEBUGHLT --geometry ExtendedRun4D125 --era Phase2C22I13M9 --filein file:step1.root --fileout file:step2.root

cmsDriver.py step3 -s RAW2DIGI,RECO,RECOSIM,PAT,VALIDATION:@phase2Validation+@miniAODValidation,DQM:@phase2+@miniAODDQM --conditions auto:phase2_realistic_T35 --datatier GEN-SIM-RECO,MINIAODSIM,DQMIO -n 10 --eventcontent FEVTDEBUGHLT,MINIAODSIM,DQM --geometry ExtendedRun4D125 --era Phase2C22I13M9 --filein file:step2.root --fileout file:step3.root

cmsDriver.py step4 -s HARVESTING:@phase2Validation+@phase2+@miniAODValidation+@miniAODDQM --conditions auto:phase2_realistic_T35 --mc --geometry ExtendedRun4D125 --scenario pp --filetype DQM --era Phase2C22I13M9 -n 10 --filein file:step3_inDQM.root --fileout file:step4.root

cmsDriver.py step5 -s ALCA:SiPixelCalSingleMuonLoose+SiPixelCalSingleMuonTight+TkAlMuonIsolated+TkAlMinBias+MuAlOverlaps+EcalESAlign+TkAlZMuMu+TkAlDiMuonAndVertex+HcalCalHBHEMuonProducerFilter+TkAlUpsilonMuMu+TkAlJpsiMuMu --conditions auto:phase2_realistic_T35 --datatier ALCARECO -n 10 --eventcontent ALCARECO --geometry ExtendedRun4D125 --era Phase2C22I13M9 --filein file:step3.root --fileout file:step5.root

python3 ${CMSSW_BASE}/src/Validation/RecoParticleFlow/scripts/makeHLTPFValidationPlots.py --file DQM_V0001_R000000001__Global__CMSSW_X_Y_Z__RECO.root --odir ./Plots -l 'SingleEle NoPU' --era Phase2
  • Compare multiple collections with dqm-plot
# here we are comparing two variables present in two DQM files
# the full directory path and the "--require_full_sources" option are given to speed up the directory traversal
# the path in each "-s" ("sources") option is interpreted as a regular expression
dqm-plot -s "DQMData/Run 1/HLT/Run summary/ParticleFlow/MatchByScore/PFClusterValidation/SimClustersEta$" -s "DQMData/Run 1/HLT/Run summary/ParticleFlow/MatchByScore/PFClusterValidation/SimClustersEn$" --require_full_sources --legend "PFlowValidation, TryRebase" -o ./ComparisonPlots/ --pdf --energy-text "SingleEle (no PU)" --legend-title "Phase 2 ECAL PF Clusters" --web --histogram DQM_file1.root DQM_file2.root
  • Event displays
# from the folder containing your step2 file (see above), run:
cmsRun ${CMSSW_BASE}/src/Validation/RecoParticleFlow/test/ecalGeometryAnalyzer_cfg.py --infile step2.root

# this will produce a data.root file, which has all geometry and event info required to plot the events. The geometry information is extracted only once, since it is the same for all events.
# To plot:
python3 ${CMSSW_BASE}/src/Validation/RecoParticleFlow/scripts/showECALcrystals.py -i data.root --sample_label 'Single Electron' --era Run3 --outdir /eos/user/...
# Alternatively, to use the interactive version
python3 ${CMSSW_BASE}/src/Validation/RecoParticleFlow/scripts/showECALcrystals_interactive.py -i data.root --outdir /eos/user/...
# use --help for additional options
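
Since the `-s` sources in the dqm-plot example above are interpreted as regular expressions, the trailing `$` matters. The sketch below shows the effect with Python's `re` module. It assumes search-style matching, which may differ from what dqm-plot does internally, and the histogram name `SimClustersEnFrac` is hypothetical, introduced only to show an unwanted match.

```python
import re


def select_paths(paths, sources):
    """Return the paths matched by at least one source regular expression."""
    patterns = [re.compile(source) for source in sources]
    return [path for path in paths if any(p.search(path) for p in patterns)]


base = "DQMData/Run 1/HLT/Run summary/ParticleFlow/MatchByScore/PFClusterValidation/"
paths = [
    base + "SimClustersEta",
    base + "SimClustersEn",
    base + "SimClustersEnFrac",  # hypothetical name, for illustration only
]

# Anchored sources, as in the command above: only exact histogram names match.
anchored = select_paths(paths, [base + "SimClustersEta$", base + "SimClustersEn$"])

# Without the "$" anchor, "SimClustersEn" also selects longer names.
unanchored = select_paths(paths, [base + "SimClustersEn"])
```

Here `anchored` holds the first two paths, while `unanchored` holds the last two.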
Presentations

PR validation:

Tested with Single and CloseBy electron guns, in Run-3 and Phase-2 conditions.


Co-authored by @elenavernazza.

@bfonta
Contributor Author

bfonta commented Mar 3, 2026

type ngt

@cmsbuild
Contributor

cmsbuild commented Mar 3, 2026

cms-bot internal usage

@cmsbuild
Contributor

cmsbuild commented Mar 3, 2026

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50292/48335

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Contributor

cmsbuild commented Mar 3, 2026

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50292/48336

Code check has found code style and quality issues which could be resolved by applying following patch(s)

Comment thread Configuration/EventContent/python/EventContent_cff.py Outdated
@cmsbuild
Contributor

cmsbuild commented Mar 3, 2026

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50292/48341

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@bfonta force-pushed the ParticleFlowECALClusterValidation branch from fb992c8 to 2fb9d85 on March 3, 2026 14:21
@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50292/49192

@cmsbuild
Contributor

Pull request #50292 was updated. @Moanwar, @civanch, @cmsbuild, @ctarricone, @gabrielmscampos, @jfernan2, @kpedro88, @mandrenguyen, @mdhildreth, @nothingface0, @rseidita, @srimanob can you please check and sign again.

@bfonta
Contributor Author

bfonta commented Apr 30, 2026

@cmsbuild, please test

@makortel I've added the old versions.

@cmsbuild
Contributor

cmsbuild commented May 1, 2026

-1

Failed Tests: RelVals-INPUT
Size: This PR adds an extra 32KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ac5b05/52991/summary.html
COMMIT: 5d24a35
CMSSW: CMSSW_17_0_X_2026-04-30-1100/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/50292/52991/install.sh to create a dev area with all the needed externals and cmssw changes.

Failed RelVals-INPUT

  • 2500.321: 2500.321_EXONANOmc150X/step2_EXONANOmc150X.log
  • 2500.3311: 2500.3311_EXONANOdata150Xrun3/step2_EXONANOdata150Xrun3.log

Comparison Summary

Summary:

  • You potentially added 843 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 32 differences found in the comparisons
  • DQMHistoTests: Total files compared: 66
  • DQMHistoTests: Total histograms compared: 4569622
  • DQMHistoTests: Total failures: 39888
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 4529713
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 337560.564 KiB( 65 files compared)
  • DQMHistoSizes: changed ( 18434.0,... ): 15399.201 KiB HLT/ParticleFlow
  • DQMHistoSizes: changed ( 18434.0,... ): 0.004 KiB MessageLogger/Errors
  • DQMHistoSizes: changed ( 18434.0,... ): 0.004 KiB MessageLogger/Warnings
  • DQMHistoSizes: changed ( 34434.758,... ): 14788.264 KiB HLT/TiclBarrel
  • Checked 276 log files, 236 edm output root files, 66 DQM output files
  • TriggerResults: found differences in 20 / 64 workflows

Max Memory Comparisons exceeding threshold

@cms-sw/core-l2 , I found 26 workflow step(s) with memory usage exceeding the error threshold:

  • Error: Workflow 13034.0_TTbar_14TeV+2024PU step2 max memory diff 137.8 exceeds +/- 90.0 MiB
  • Error: Workflow 17034.0_TTbar_14TeV+2025PU step2 max memory diff 120.9 exceeds +/- 90.0 MiB
  • Error: Workflow 18634.0_TTbar_14TeV+2026PU step3 max memory diff 181.5 exceeds +/- 90.0 MiB
  • Error: Workflow 18634.0_TTbar_14TeV+2026PU step2 max memory diff 127.0 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.0_TTbar_14TeV+Run4D121 step3 max memory diff 128.3 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.75_TTbar_14TeV+Run4D121_HLT75e33Timing step2 max memory diff 144.2 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.751_TTbar_14TeV+Run4D121_HLT75e33TimingAlpaka step2 max memory diff 142.1 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.752_TTbar_14TeV+Run4D121_HLT75e33TimingTiclV5 step2 max memory diff 131.1 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.7521_TTbar_14TeV+Run4D121_HLT75e33TimingTiclV5TrackLinkGNN step2 max memory diff 130.1 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.753_TTbar_14TeV+Run4D121_HLT75e33TimingLegacyTracking step2 max memory diff 144.2 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.754_TTbar_14TeV+Run4D121_HLT75e33TimingLegacyTrackingPatatrackQuads step2 max memory diff 144.4 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.755_TTbar_14TeV+Run4D121_HLT75e33TimingLST step2 max memory diff 144.3 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.756_TTbar_14TeV+Run4D121_HLT75e33TimingTrimmedTracking step2 max memory diff 144.2 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.757_TTbar_14TeV+Run4D121_HLT75e33TimingMkFitFit step2 max memory diff 144.1 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.758_TTbar_14TeV+Run4D121_HLT75e33TimingTiclBarrel step2 max memory diff 154.1 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.758_TTbar_14TeV+Run4D121_HLT75e33TimingTiclBarrel step3 max memory diff 117.2 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.7591_TTbar_14TeV+Run4D121_HLTPhase2WithNanoValid step2 max memory diff 93.4 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.77_TTbar_14TeV+Run4D121_NGTScouting step2 max memory diff 142.0 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.771_TTbar_14TeV+Run4D121_NGTScoutingAll step3 max memory diff 116.9 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.771_TTbar_14TeV+Run4D121_NGTScoutingAll step2 max memory diff 170.7 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.773_TTbar_14TeV+Run4D121_NGTScoutingWithNanoVal step2 max memory diff 94.2 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.775_TTbar_14TeV+Run4D121_NGTScoutingCAExtensionMergeT5 step2 max memory diff 144.5 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.911_TTbar_14TeV+Run4D121_DD4hep step3 max memory diff 182.8 exceeds +/- 90.0 MiB
  • Error: Workflow 34500.0_CloseByPGun_CE_H_Coarse_Scint+Run4D121 step3 max memory diff 97.8 exceeds +/- 90.0 MiB
  • Error: Workflow 34634.999_TTbar_14TeV+Run4D121PU_PMXS1S2PR step4 max memory diff 120.0 exceeds +/- 90.0 MiB
  • Error: Workflow 34634.999_TTbar_14TeV+Run4D121PU_PMXS1S2PR step3 max memory diff 154.5 exceeds +/- 90.0 MiB
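
For triaging reports like the one above, the per-workflow lines can be parsed mechanically. The following is a minimal sketch that relies only on the line format shown in this thread; the helper name `parse_memory_errors` and the dictionary keys are hypothetical, and this is not part of cms-bot.

```python
import re

# Matches lines such as:
#   Error: Workflow 13034.0_TTbar_14TeV+2024PU step2 max memory diff 137.8 exceeds +/- 90.0 MiB
MEMORY_LINE = re.compile(
    r"Workflow (?P<workflow>\S+) (?P<step>step\d+) max memory diff "
    r"(?P<diff>-?\d+(?:\.\d+)?) exceeds \+/- (?P<threshold>\d+(?:\.\d+)?) MiB"
)


def parse_memory_errors(report):
    """Extract (workflow, step, diff, threshold) records from a report text."""
    entries = []
    for match in MEMORY_LINE.finditer(report):
        entries.append(
            {
                "workflow": match.group("workflow"),
                "step": match.group("step"),
                "diff_mib": float(match.group("diff")),
                "threshold_mib": float(match.group("threshold")),
            }
        )
    return entries


report = """\
Error: Workflow 13034.0_TTbar_14TeV+2024PU step2 max memory diff 137.8 exceeds +/- 90.0 MiB
Error: Workflow 18634.0_TTbar_14TeV+2026PU step3 max memory diff 181.5 exceeds +/- 90.0 MiB
Error: Workflow 34434.7591_TTbar_14TeV+Run4D121_HLTPhase2WithNanoValid step2 max memory diff 93.4 exceeds +/- 90.0 MiB
"""

entries = parse_memory_errors(report)
# The threshold is symmetric, so rank by the absolute difference.
worst = max(entries, key=lambda entry: abs(entry["diff_mib"]))
```

Ranking by absolute value keeps the sketch usable for reports where the difference is negative, since the threshold is quoted as +/-.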

@makortel
Contributor

makortel commented May 1, 2026

test parameters:

To work around #50844

@makortel
Contributor

makortel commented May 1, 2026

@cmsbuild, please test

@cmsbuild
Contributor

cmsbuild commented May 1, 2026

-1

Failed Tests: RelVals-INPUT
Size: This PR adds an extra 16KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ac5b05/52995/summary.html
COMMIT: 5d24a35
CMSSW: CMSSW_17_0_X_2026-05-01-1100/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/50292/52995/install.sh to create a dev area with all the needed externals and cmssw changes.

Failed RelVals-INPUT

  • 2500.321: 2500.321_EXONANOmc150X/step2_EXONANOmc150X.log
  • 2500.3311: 2500.3311_EXONANOdata150Xrun3/step2_EXONANOdata150Xrun3.log

Comparison Summary

Summary:

  • You potentially added 845 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 24 differences found in the comparisons
  • DQMHistoTests: Total files compared: 66
  • DQMHistoTests: Total histograms compared: 4569622
  • DQMHistoTests: Total failures: 39873
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 4529728
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 337560.564 KiB( 65 files compared)
  • DQMHistoSizes: changed ( 18434.0,... ): 15399.201 KiB HLT/ParticleFlow
  • DQMHistoSizes: changed ( 18434.0,... ): 0.004 KiB MessageLogger/Errors
  • DQMHistoSizes: changed ( 18434.0,... ): 0.004 KiB MessageLogger/Warnings
  • DQMHistoSizes: changed ( 34434.758,... ): 14788.264 KiB HLT/TiclBarrel
  • Checked 276 log files, 236 edm output root files, 66 DQM output files
  • TriggerResults: found differences in 21 / 64 workflows

Max Memory Comparisons exceeding threshold

@cms-sw/core-l2 , I found 26 workflow step(s) with memory usage exceeding the error threshold:

  • Error: Workflow 13034.0_TTbar_14TeV+2024PU step2 max memory diff 137.7 exceeds +/- 90.0 MiB
  • Error: Workflow 17034.0_TTbar_14TeV+2025PU step2 max memory diff 120.9 exceeds +/- 90.0 MiB
  • Error: Workflow 18434.0_TTbar_14TeV+2026 step3 max memory diff 117.4 exceeds +/- 90.0 MiB
  • Error: Workflow 18634.0_TTbar_14TeV+2026PU step3 max memory diff 251.9 exceeds +/- 90.0 MiB
  • Error: Workflow 18634.0_TTbar_14TeV+2026PU step2 max memory diff 126.9 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.0_TTbar_14TeV+Run4D121 step3 max memory diff 128.3 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.75_TTbar_14TeV+Run4D121_HLT75e33Timing step2 max memory diff 144.2 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.751_TTbar_14TeV+Run4D121_HLT75e33TimingAlpaka step2 max memory diff 142.2 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.752_TTbar_14TeV+Run4D121_HLT75e33TimingTiclV5 step2 max memory diff 124.0 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.7521_TTbar_14TeV+Run4D121_HLT75e33TimingTiclV5TrackLinkGNN step2 max memory diff 176.5 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.753_TTbar_14TeV+Run4D121_HLT75e33TimingLegacyTracking step2 max memory diff 144.2 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.754_TTbar_14TeV+Run4D121_HLT75e33TimingLegacyTrackingPatatrackQuads step2 max memory diff 144.5 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.755_TTbar_14TeV+Run4D121_HLT75e33TimingLST step2 max memory diff 144.4 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.756_TTbar_14TeV+Run4D121_HLT75e33TimingTrimmedTracking step2 max memory diff 144.2 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.757_TTbar_14TeV+Run4D121_HLT75e33TimingMkFitFit step2 max memory diff 144.2 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.758_TTbar_14TeV+Run4D121_HLT75e33TimingTiclBarrel step2 max memory diff 154.2 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.758_TTbar_14TeV+Run4D121_HLT75e33TimingTiclBarrel step3 max memory diff 117.2 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.7591_TTbar_14TeV+Run4D121_HLTPhase2WithNanoValid step2 max memory diff 93.4 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.77_TTbar_14TeV+Run4D121_NGTScouting step2 max memory diff 142.1 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.771_TTbar_14TeV+Run4D121_NGTScoutingAll step3 max memory diff 116.9 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.771_TTbar_14TeV+Run4D121_NGTScoutingAll step2 max memory diff 152.2 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.773_TTbar_14TeV+Run4D121_NGTScoutingWithNanoVal step2 max memory diff 94.3 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.775_TTbar_14TeV+Run4D121_NGTScoutingCAExtensionMergeT5 step2 max memory diff 144.6 exceeds +/- 90.0 MiB
  • Error: Workflow 34500.0_CloseByPGun_CE_H_Coarse_Scint+Run4D121 step3 max memory diff 107.1 exceeds +/- 90.0 MiB
  • Error: Workflow 34634.999_TTbar_14TeV+Run4D121PU_PMXS1S2PR step4 max memory diff 256.1 exceeds +/- 90.0 MiB
  • Error: Workflow 34634.999_TTbar_14TeV+Run4D121PU_PMXS1S2PR step3 max memory diff 154.6 exceeds +/- 90.0 MiB

@makortel
Contributor

makortel commented May 1, 2026

test parameters:

@makortel
Contributor

makortel commented May 1, 2026

@cmsbuild, please test

(sigh)

@cmsbuild
Contributor

cmsbuild commented May 1, 2026

-1

Failed Tests: RelVals
Size: This PR adds an extra 16KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ac5b05/52998/summary.html
COMMIT: 5d24a35
CMSSW: CMSSW_17_0_X_2026-05-01-1100/el8_amd64_gcc13
Additional Tests: PROFILING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/50292/52998/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ac5b05/52998/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ac5b05/52998/git-merge-result

Failed RelVals

  • 135.4: 135.4_ZEEFS_13/step1_ZEEFS_13.log
  • 1000.0: 1000.0_RunMinBias2011A/step2_RunMinBias2011A.log
  • 2025.0000002: 2025.0000002_RunZeroBias2025B_10k/step2_RunZeroBias2025B_10k.log
Expand to see more relval errors ...

@makortel
Contributor

makortel commented May 1, 2026

Ok... The failures are now

We have determined that this is simulation (if not, rerun cmsDriver.py with --data)
Traceback (most recent call last):
  File "/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02939/el8_amd64_gcc13/cms/cmssw/CMSSW_17_0_X_2026-04-30-2300/bin/el8_amd64_gcc13/cmsDriver.py", line 40, in <module>
    run()
  File "/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02939/el8_amd64_gcc13/cms/cmssw/CMSSW_17_0_X_2026-04-30-2300/bin/el8_amd64_gcc13/cmsDriver.py", line 11, in run
    options = OptionsFromCommandLine()
              ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02939/el8_amd64_gcc13/cms/cmssw/CMSSW_17_0_X_2026-04-30-2300/src/Configuration/Applications/python/cmsDriverOptions.py", line 37, in OptionsFromCommandLine
    options=OptionsFromItems(sys.argv[1:])
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02939/el8_amd64_gcc13/cms/cmssw/CMSSW_17_0_X_2026-04-30-2300/src/Configuration/Applications/python/cmsDriverOptions.py", line 259, in OptionsFromItems
    raise Exception("--maxmem_profile and --prefix are incompatible")
Exception: --maxmem_profile and --prefix are incompatible

These seem to be related to cms-sw/cms-bot#2733

@makortel
Contributor

makortel commented May 1, 2026

test parameters:

@makortel
Contributor

makortel commented May 1, 2026

@cmsbuild, please test

@cmsbuild
Contributor

cmsbuild commented May 1, 2026

+1

Size: This PR adds an extra 16KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ac5b05/53001/summary.html
COMMIT: 5d24a35
CMSSW: CMSSW_17_0_X_2026-05-01-1600/el8_amd64_gcc13
Additional Tests: PROFILING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/50292/53001/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

There are some workflows for which there are errors in the baseline:
1000.0 step 2
1001.0 step 2
101.0 step 1
10224.0 step 1
11634.0 step 1
12434.0 step 1
12834.0 step 1
12846.0 step 1
13034.0 step 1
1306.0 step 1
13234.0 step 1
1330.0 step 1
135.4 step 1
136.731 step 2
136.793 step 2
136.874 step 2
139.001 step 2
140.56 step 2
14034.0 step 1
14234.0 step 1
16834.0 step 1
17034.0 step 1
18434.0 step 1
18634.0 step 1
2022.0010001 step 2
2023.0020001 step 2
2024.0000001 step 2
2024.0010001 step 2
2024.0020001 step 2
2024.0030001 step 2
2024.0040001 step 2
2024.0050001 step 2
2024.0060001 step 2
2024.0070001 step 2
2025.0000002 step 2
2025.0010001 step 2
25.0 step 1
2500.3001 step 2
250202.181 step 1
25202.0 step 1
312.0 step 1
34434.0 step 1
34434.75 step 1
34434.911 step 1
34496.0 step 1
34500.0 step 1
34634.999 step 1
4.22 step 2
4.53 step 2
5.1 step 1
7.3 step 1
8.0 step 1
9.0 step 1
The results for the comparisons for these workflows could be incomplete
This most likely means that the IB is having errors in the relvals. The error does NOT come from this pull request

Summary:

  • You potentially added 21772 lines to the logs
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 1
  • DQMHistoTests: Total histograms compared: 0
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 0
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0 KiB( 0 files compared)
  • Checked 74 log files, 0 edm output root files, 1 DQM output files

@jfernan2
Contributor

jfernan2 commented May 4, 2026

please test
To recover workflows for which there were errors in the baseline

@cmsbuild
Contributor

cmsbuild commented May 4, 2026

-1

Failed Tests: RelVals-INPUT
Size: This PR adds an extra 16KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ac5b05/53040/summary.html
COMMIT: 5d24a35
CMSSW: CMSSW_17_0_X_2026-05-03-2300/el8_amd64_gcc13
Additional Tests: PROFILING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/50292/53040/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ac5b05/53040/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ac5b05/53040/git-merge-result

Failed RelVals-INPUT

  • 2500.4301: DAS Error
  • 2023.0020002: 2023.0020002_RunJetMET02023D_10k/step1_dasquery.log
  • 2023.0020001: 2023.0020001_RunJetMET02023D_10k/step1_dasquery.log
Expand to see more relval errors ...

Comparison Summary

Summary:

  • You potentially added 206 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 31 differences found in the comparisons
  • DQMHistoTests: Total files compared: 53
  • DQMHistoTests: Total histograms compared: 4187168
  • DQMHistoTests: Total failures: 15180
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 4171967
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 123193.624 KiB( 52 files compared)
  • DQMHistoSizes: changed ( 34434.75,... ): 15399.201 KiB HLT/ParticleFlow
  • DQMHistoSizes: changed ( 18434.0,... ): 0.004 KiB MessageLogger/Errors
  • DQMHistoSizes: changed ( 18434.0,... ): 0.004 KiB MessageLogger/Warnings
  • Checked 227 log files, 197 edm output root files, 53 DQM output files
  • TriggerResults: found differences in 8 / 51 workflows

Max Memory Comparisons exceeding threshold

@cms-sw/core-l2 , I found 40 workflow step(s) with memory usage exceeding the error threshold:

  • Error: Workflow 4.53_RunPhoton2012B step3 max memory diff -201.2 exceeds +/- 90.0 MiB
  • Error: Workflow 9.0_Higgs200ChargedTaus step3 max memory diff -126.9 exceeds +/- 90.0 MiB
  • Error: Workflow 25.0_TTbar step3 max memory diff -126.8 exceeds +/- 90.0 MiB
  • Error: Workflow 135.4_ZEEFS_13 step3 max memory diff -224.1 exceeds +/- 90.0 MiB
  • Error: Workflow 136.731_RunSinglePh2016B step3 max memory diff -135.5 exceeds +/- 90.0 MiB
  • Error: Workflow 136.793_RunDoubleEG2017C step3 max memory diff -195.2 exceeds +/- 90.0 MiB
  • Error: Workflow 136.874_RunEGamma2018C step3 max memory diff -201.4 exceeds +/- 90.0 MiB
  • Error: Workflow 139.001_RunMinimumBias2021 step3 max memory diff -187.9 exceeds +/- 90.0 MiB
  • Error: Workflow 1306.0_SingleMuPt1_UP15 step3 max memory diff -174.3 exceeds +/- 90.0 MiB
  • Error: Workflow 1330.0_ZMM_13 step3 max memory diff -157.8 exceeds +/- 90.0 MiB
  • Error: Workflow 2022.0010001_RunTau2022D_10k step3 max memory diff -172.3 exceeds +/- 90.0 MiB
  • Error: Workflow 2023.0020001_RunJetMET02023D_10k step3 max memory diff -164.1 exceeds +/- 90.0 MiB
  • Error: Workflow 2024.0000001_RunZeroBias2024B_10k step3 max memory diff -258.0 exceeds +/- 90.0 MiB
  • Error: Workflow 2024.0010001_RunJetMET02024C_10k step3 max memory diff -172.3 exceeds +/- 90.0 MiB
  • Error: Workflow 2024.0020001_RunEGamma02024D_10k step3 max memory diff -242.5 exceeds +/- 90.0 MiB
  • Error: Workflow 2024.0030001_RunDisplacedJet2024E_10k step3 max memory diff -220.8 exceeds +/- 90.0 MiB
  • Error: Workflow 2024.0040001_RunPark2MuonLowMass02024F_10k step3 max memory diff -177.8 exceeds +/- 90.0 MiB
  • Error: Workflow 2024.0050001_RunBTagMu2024G_10k step3 max memory diff -231.1 exceeds +/- 90.0 MiB
  • Error: Workflow 2024.0060001_RunMuon02024H_10k step3 max memory diff -149.3 exceeds +/- 90.0 MiB
  • Error: Workflow 2024.0070001_RunTau2024I_10k step3 max memory diff -238.3 exceeds +/- 90.0 MiB
  • Error: Workflow 2025.0010001_RunJetMET02025C_10k step3 max memory diff -243.9 exceeds +/- 90.0 MiB
  • Error: Workflow 2025.0010001_RunJetMET02025C_10k step2 max memory diff -100.4 exceeds +/- 90.0 MiB
  • Error: Workflow 10224.0_TTbar_13+2017PU step3 max memory diff -141.2 exceeds +/- 90.0 MiB
  • Error: Workflow 11634.0_TTbar_14TeV+2022 step3 max memory diff -232.9 exceeds +/- 90.0 MiB
  • Error: Workflow 12434.0_TTbar_14TeV+2023 step3 max memory diff -179.3 exceeds +/- 90.0 MiB
  • Error: Workflow 12834.0_TTbar_14TeV+2024 step3 max memory diff -247.4 exceeds +/- 90.0 MiB
  • Error: Workflow 12846.0_ZEE_14+2024 step3 max memory diff -256.6 exceeds +/- 90.0 MiB
  • Error: Workflow 13034.0_TTbar_14TeV+2024PU step2 max memory diff 135.6 exceeds +/- 90.0 MiB
  • Error: Workflow 13034.0_TTbar_14TeV+2024PU step3 max memory diff -239.0 exceeds +/- 90.0 MiB
  • Error: Workflow 13234.0_TTbar_14TeV+2022FS step2 max memory diff -185.7 exceeds +/- 90.0 MiB
  • Error: Workflow 14034.0_TTbar_14TeV+2023FS step2 max memory diff -172.0 exceeds +/- 90.0 MiB
  • Error: Workflow 14234.0_TTbar_14TeV+2023FSPU step2 max memory diff -186.4 exceeds +/- 90.0 MiB
  • Error: Workflow 16834.0_TTbar_14TeV+2025 step3 max memory diff -173.1 exceeds +/- 90.0 MiB
  • Error: Workflow 17034.0_TTbar_14TeV+2025PU step3 max memory diff -246.4 exceeds +/- 90.0 MiB
  • Error: Workflow 17034.0_TTbar_14TeV+2025PU step2 max memory diff 120.5 exceeds +/- 90.0 MiB
  • Error: Workflow 25202.0_TTbar_13 step3 max memory diff -203.3 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.75_TTbar_14TeV+Run4D121_HLT75e33Timing step2 max memory diff 143.1 exceeds +/- 90.0 MiB
  • Error: Workflow 34500.0_CloseByPGun_CE_H_Coarse_Scint+Run4D121 step3 max memory diff -128.6 exceeds +/- 90.0 MiB
  • Error: Workflow 34634.999_TTbar_14TeV+Run4D121PU_PMXS1S2PR step3 max memory diff 150.5 exceeds +/- 90.0 MiB
  • Error: Workflow 250202.181_TTbar13TeVPUppmx2018 step4 max memory diff -145.4 exceeds +/- 90.0 MiB

@makortel
Contributor

makortel commented May 4, 2026

The ModuleAllocMonitor diff reports continue to point to hltPFClusterCaloParticleAssociationProducerECAL constructor allocating 90-160 MB. I'll take a deeper look.

I think it is not worth holding this PR any further just because of the header parsing (I haven't checked whether the review has otherwise concluded).

@bfonta
Contributor Author

bfonta commented May 6, 2026

With the help of @yashmehra028 we've tested this PR on the new 17_0_0_pre1 release with a single pion workflow, and everything ran successfully. Please feel free to review the PR, and potentially approve it.

As a side note, we had to manually keep *_hltParticleFlowTmp_*_HLT (Phase-2 specific, needed between the HLT and the VALIDATION steps). Please let us know if we should persist the collection.

@mmusich
Contributor

mmusich commented May 7, 2026

release with a single pion workflow, and everything ran successfully. Please feel free to review the PR, and potentially approve it.

Can you share the results of such validation in this thread?

As a side note, we had to manually keep *_hltParticleFlowTmp_*_HLT (Phase-2 specific, needed between the HLT and the VALIDATION steps). Please let us know if we should persist the collection.

If this product needs to be consumed in a "step3-like" job and is produced in the earlier step-2 HLT job, then yes, it needs to be persisted in the FEVTDEBUGHLT event content in the presence of the phase2 modifier.
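
To make the effect of the extra keep statement concrete, here is a simplified, framework-free model of keep/drop selection. It assumes that commands are applied in order, that the last matching command wins, and that the list starts from `drop *`; this is an illustration, not the framework's actual selection code. The branch name and the `base_commands` list are hypothetical; only the pattern `*_hltParticleFlowTmp_*_HLT` comes from this thread.

```python
from fnmatch import fnmatchcase


def is_kept(branch_name, output_commands):
    """Return True if the branch survives a list of 'keep'/'drop' commands.

    Simplified model: each command is '<action> <pattern>', patterns use
    '*' wildcards, and later commands override earlier ones.
    """
    kept = False  # nothing is stored until a "keep" command matches
    for command in output_commands:
        action, pattern = command.split(None, 1)
        if action not in ("keep", "drop"):
            raise ValueError(f"unknown action in command: {command!r}")
        if fnmatchcase(branch_name, pattern):
            kept = action == "keep"
    return kept


# Illustrative branch name in the usual type_module_instance_process layout.
branch = "recoPFCandidates_hltParticleFlowTmp__HLT"

# Hypothetical baseline event content that does not mention the product.
base_commands = ["drop *", "keep *_hltEcalRecHit_*_*"]

# Same content with the keep statement discussed above appended.
phase2_commands = base_commands + ["keep *_hltParticleFlowTmp_*_HLT"]

kept_before = is_kept(branch, base_commands)
kept_after = is_kept(branch, phase2_commands)
```

In this model the product is dropped with the baseline list and retained once the keep statement is appended, which is the behaviour needed for a step-3 job to read what the step-2 HLT job produced.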

@mmusich
Contributor

mmusich commented May 7, 2026

@cmsbuild, please test

  • to get cleaner results

@cmsbuild
Contributor

cmsbuild commented May 8, 2026

+1

Size: This PR adds an extra 16KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ac5b05/53107/summary.html
COMMIT: 5d24a35
CMSSW: CMSSW_17_0_X_2026-05-06-2300/el8_amd64_gcc13
Additional Tests: PROFILING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/50292/53107/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 187 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 22 differences found in the comparisons
  • DQMHistoTests: Total files compared: 53
  • DQMHistoTests: Total histograms compared: 4187168
  • DQMHistoTests: Total failures: 15210
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 4171937
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 123193.624 KiB( 52 files compared)
  • DQMHistoSizes: changed ( 18434.0,... ): 15399.201 KiB HLT/ParticleFlow
  • DQMHistoSizes: changed ( 18434.0,... ): 0.004 KiB MessageLogger/Errors
  • DQMHistoSizes: changed ( 18434.0,... ): 0.004 KiB MessageLogger/Warnings
  • Checked 227 log files, 197 edm output root files, 53 DQM output files
  • TriggerResults: found differences in 8 / 51 workflows

Max Memory Comparisons exceeding threshold

@cms-sw/core-l2 , I found 12 workflow step(s) with memory usage exceeding the error threshold:

  • Error: Workflow 13034.0_TTbar_14TeV+2024PU step2 max memory diff 124.1 exceeds +/- 90.0 MiB
  • Error: Workflow 17034.0_TTbar_14TeV+2025PU step2 max memory diff 160.1 exceeds +/- 90.0 MiB
  • Error: Workflow 18434.0_TTbar_14TeV+2026 step3 max memory diff 117.9 exceeds +/- 90.0 MiB
  • Error: Workflow 18634.0_TTbar_14TeV+2026PU step3 max memory diff 185.0 exceeds +/- 90.0 MiB
  • Error: Workflow 18634.0_TTbar_14TeV+2026PU step2 max memory diff 145.4 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.0_TTbar_14TeV+Run4D121 step3 max memory diff 125.1 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.75_TTbar_14TeV+Run4D121_HLT75e33Timing step2 max memory diff 144.4 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.911_TTbar_14TeV+Run4D121_DD4hep step3 max memory diff 119.1 exceeds +/- 90.0 MiB
  • Error: Workflow 34496.0_CloseByPGun_CE_E_Front_120um+Run4D121 step3 max memory diff 106.9 exceeds +/- 90.0 MiB
  • Error: Workflow 34500.0_CloseByPGun_CE_H_Coarse_Scint+Run4D121 step3 max memory diff 106.7 exceeds +/- 90.0 MiB
  • Error: Workflow 34634.999_TTbar_14TeV+Run4D121PU_PMXS1S2PR step4 max memory diff 183.4 exceeds +/- 90.0 MiB
  • Error: Workflow 34634.999_TTbar_14TeV+Run4D121PU_PMXS1S2PR step3 max memory diff 171.5 exceeds +/- 90.0 MiB

@Moanwar
Contributor

Moanwar commented May 8, 2026

Thanks @bfonta. I think this will need some modifications based on cms-ngt-hlt#6

@mmusich
Contributor

mmusich commented May 8, 2026

So I think this would need some modifications based on cms-ngt-hlt#6

and addressing #50292 (comment)
