Set up HGCAL GPU vs CPU DQM #50974

Open
fiemmi wants to merge 1 commit into cms-sw:master from fiemmi:ticl_dqm_GPUvsCPU_CMSSW_17_0_0_pre1

fiemmi commented May 19, 2026

PR description:

This PR implements a new DQMEDAnalyzer to monitor TICL GPU and CPU reconstruction for HGCAL. It further schedules the analyzer through the alpakaValidationHLT procModifier. The output consists of TH1Fs and TH2Fs storing $x_{\textrm{GPU}} - x_{\textrm{CPU}}$ and $x_{\textrm{GPU}}:x_{\textrm{CPU}}$ respectively, where $x$ is a given TICL observable.
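
As an illustration of the first kind of plot, the following is a minimal standalone sketch of filling a fixed-bin histogram with $x_{\textrm{GPU}} - x_{\textrm{CPU}}$. `Hist1D` and `fillDifference` are hypothetical stand-ins written for this example; they are not the DQM or ROOT classes used in the PR, and the binning is an assumption.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a booked 1D histogram: equal-width bins on [lo, hi).
struct Hist1D {
  double lo;
  double hi;
  std::vector<unsigned> bins;
  unsigned underflow = 0;
  unsigned overflow = 0;

  Hist1D(std::size_t nBins, double lo_, double hi_) : lo(lo_), hi(hi_), bins(nBins, 0) {}

  void fill(double v) {
    if (v < lo) {
      ++underflow;
      return;
    }
    if (v >= hi) {
      ++overflow;
      return;
    }
    auto i = static_cast<std::size_t>((v - lo) / (hi - lo) * static_cast<double>(bins.size()));
    if (i >= bins.size()) {
      i = bins.size() - 1;  // guard against floating-point rounding at the upper edge
    }
    ++bins[i];
  }
};

// Fill the difference between the GPU and CPU value of one matched observable.
void fillDifference(Hist1D& h, double xGpu, double xCpu) { h.fill(xGpu - xCpu); }
```

With this layout, identical GPU and CPU values give a difference of zero and land in the bin that contains 0, so any spread around that bin indicates a discrepancy between the two reconstructions.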

This PR is not dependent on any other PR.

PR validation:

The PR has been validated through the following pipeline:

cmsenv
mkdir testMatrix
cd testMatrix
runTheMatrix.py -w upgrade -l 36034.7503 -j 0
cd 36034.7503_TTbar_14TeV+Run4D125_HLTHeterogeneousValid
cmsRun TTbar_14TeV_TuneCP5_cfi_GEN_SIM.py
cmsRun step2_DIGI_L1TrackTrigger_L1_L1P2GT_DIGI2RAW_HLT_VALIDATION.py
cmsRun step3_HARVESTING.py

After running it, the histograms described above can be found by opening the ROOT file DQM_V0001_R000000001__Global__CMSSW_X_Y_Z__RECO.root and inspecting the directory DQMData/Run 1/HGCAL/Run summary.

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

This PR is not a backport.

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50974/49380

@cmsbuild
Contributor

A new Pull Request was created by @fiemmi for master.

It involves the following packages:

  • Configuration/EventContent (operations)
  • DQM/HGCAL (****)
  • HLTrigger/Configuration (hlt)

The following packages do not have a category, yet:

DQM/HGCAL
Please create a PR for https://github.com/cms-sw/cms-bot/blob/master/categories_map.py to assign category

@Martin-Grunewald, @cmsbuild, @davidlange6, @fabiocos, @ftenchini, @mandrenguyen, @mmusich can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @SohamBhattacharya, @VourMa, @fabiocos, @missirol, @mmusich, @rovere this is something you requested to watch as well.
@ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

#include "FWCore/Framework/interface/MakerMacros.h"
#include "DataFormats/Candidate/interface/Candidate.h"
#include "DataFormats/CaloRecHit/interface/CaloClusterCollection.h"
#include "DataFormats/CaloRecHit/interface/CaloCluster.h"
Contributor

picky but can you alpha-order?

Comment on lines +35 to +36
edm::EDGetTokenT<reco::CaloClusterCollection> tokenMonitoredLayerClusters_;
edm::EDGetTokenT<reco::CaloClusterCollection> tokenReferenceLayerClusters_;
Contributor

Suggested change
- edm::EDGetTokenT<reco::CaloClusterCollection> tokenMonitoredLayerClusters_;
- edm::EDGetTokenT<reco::CaloClusterCollection> tokenReferenceLayerClusters_;
+ const edm::EDGetTokenT<reco::CaloClusterCollection> tokenMonitoredLayerClusters_;
+ const edm::EDGetTokenT<reco::CaloClusterCollection> tokenReferenceLayerClusters_;

tokenReferenceLayerClusters_(
consumes<reco::CaloClusterCollection>(iConfig.getParameter<edm::InputTag>("referenceLayerClusters"))) {}

HGCALGPUvsCPUComparisonHists::~HGCALGPUvsCPUComparisonHists() {}
Contributor

Suggested change
- HGCALGPUvsCPUComparisonHists::~HGCALGPUvsCPUComparisonHists() {}

provided the method declaration is declared override

void HGCALGPUvsCPUComparisonHists::beginJob(const edm::EventSetup& iSetup) {}

void HGCALGPUvsCPUComparisonHists::bookHistograms(DQMStore::IBooker& iBooker, edm::Run const&, edm::EventSetup const&) {
iBooker.setCurrentFolder("HGCAL");
Contributor

Shouldn't this be configurable?
At the very least I'd like the new plot to appear under HLT...

Comment on lines +87 to +88
const std::vector<reco::CaloCluster>& monitoredLayerClusters = iEvent.get(tokenMonitoredLayerClusters_);
const std::vector<reco::CaloCluster>& referenceLayerClusters = iEvent.get(tokenReferenceLayerClusters_);
Contributor

what if the product is not available?
We don't want to crash processing because of missing input in DQM.

@mmusich
Contributor

mmusich commented May 19, 2026

@fiemmi in addition to the review above: this relatively simple PR has 15 commits with sometimes not very useful comments; please consider squashing them to a minimum. Also:

DQM/HGCAL
Please create a PR for https://github.com/cms-sw/cms-bot/blob/master/categories_map.py to assign category

protected:
void beginJob(const edm::EventSetup& iSetup);
void analyze(const edm::Event& iEvent, const edm::EventSetup& iSetup) override;
void bookHistograms(DQMStore::IBooker& iBooker, edm::Run const& iRun, edm::EventSetup const& iSetup) override;
Contributor

if this module has to run in the HLT, it must have a fillDescriptions. Please provide one.

@cmsbuild
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50974/49421

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50974/49423

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50974/49424

@cmsbuild
Contributor

-1

Failed Tests: UnitTests
Size: This PR adds an extra 48KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-f5699a/53390/summary.html
COMMIT: dadb719
CMSSW: CMSSW_17_0_X_2026-05-20-1100/el8_amd64_gcc13
Additional Tests: GPU,AMD_MI300X,AMD_W7900,NVIDIA_H100,NVIDIA_L40S
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/50974/53390/install.sh to create a dev area with all the needed externals and cmssw changes.

Failed Unit Tests

I found 1 errors in the following unit tests:

---> test test_check_phase2_hlt_duplicates had ERRORS

Comparison Summary

Summary:

  • You potentially removed 3 lines from the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 14 differences found in the comparisons
  • DQMHistoTests: Total files compared: 66
  • DQMHistoTests: Total histograms compared: 4596347
  • DQMHistoTests: Total failures: 68
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4596259
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 808.695 KiB( 65 files compared)
  • DQMHistoSizes: changed ( 34434.7503 ): 808.695 KiB HLT/HeterogeneousComparisons
  • Checked 276 log files, 236 edm output root files, 66 DQM output files
  • TriggerResults: found differences in 1 / 64 workflows

AMD_MI300X Comparison Summary

There are some workflows for which there are errors in the baseline:
34634.402 step 2
The results for the comparisons for these workflows could be incomplete
This most likely means that the IB is having errors in the relvals. The error does NOT come from this pull request.

Summary:

  • You potentially removed 80 lines from the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 299 differences found in the comparisons
  • DQMHistoTests: Total files compared: 12
  • DQMHistoTests: Total histograms compared: 203010
  • DQMHistoTests: Total failures: 32404
  • DQMHistoTests: Total nulls: 38
  • DQMHistoTests: Total successes: 170568
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 808.695 KiB( 11 files compared)
  • DQMHistoSizes: changed ( 34634.7503 ): 808.695 KiB HLT/HeterogeneousComparisons
  • Checked 47 log files, 48 edm output root files, 12 DQM output files
  • TriggerResults: found differences in 2 / 11 workflows

AMD_W7900 Comparison Summary

Summary:

  • You potentially removed 34 lines from the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 364 differences found in the comparisons
  • DQMHistoTests: Total files compared: 13
  • DQMHistoTests: Total histograms compared: 218719
  • DQMHistoTests: Total failures: 40799
  • DQMHistoTests: Total nulls: 31
  • DQMHistoTests: Total successes: 177889
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 808.695 KiB( 12 files compared)
  • DQMHistoSizes: changed ( 34634.7503 ): 808.695 KiB HLT/HeterogeneousComparisons
  • Checked 49 log files, 50 edm output root files, 13 DQM output files
  • TriggerResults: found differences in 4 / 12 workflows

NVIDIA_H100 Comparison Summary

Summary:

  • You potentially removed 13 lines from the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 349 differences found in the comparisons
  • DQMHistoTests: Total files compared: 13
  • DQMHistoTests: Total histograms compared: 218719
  • DQMHistoTests: Total failures: 28892
  • DQMHistoTests: Total nulls: 32
  • DQMHistoTests: Total successes: 189795
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 808.695 KiB( 12 files compared)
  • DQMHistoSizes: changed ( 34634.7503 ): 808.695 KiB HLT/HeterogeneousComparisons
  • Checked 49 log files, 50 edm output root files, 13 DQM output files
  • TriggerResults: found differences in 2 / 12 workflows

NVIDIA_L40S Comparison Summary

Summary:

  • You potentially removed 13 lines from the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 363 differences found in the comparisons
  • DQMHistoTests: Total files compared: 13
  • DQMHistoTests: Total histograms compared: 218719
  • DQMHistoTests: Total failures: 27816
  • DQMHistoTests: Total nulls: 37
  • DQMHistoTests: Total successes: 190866
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 808.695 KiB( 12 files compared)
  • DQMHistoSizes: changed ( 34634.7503 ): 808.695 KiB HLT/HeterogeneousComparisons
  • Checked 49 log files, 50 edm output root files, 13 DQM output files
  • TriggerResults: found differences in 2 / 12 workflows

Max Memory Comparisons exceeding threshold NVIDIA_H100

@cms-sw/core-l2 , I found 1 workflow step(s) with memory usage exceeding the error threshold:

  • Error: Workflow 34634.7503_TTbar_14TeV+Run4D121PU_HLTHeterogeneousValid step2 max memory diff 157.4 exceeds +/- 90.0 MiB

Max Memory Comparisons exceeding threshold NVIDIA_L40S

@cms-sw/core-l2 , I found 1 workflow step(s) with memory usage exceeding the error threshold:

  • Error: Workflow 34634.7503_TTbar_14TeV+Run4D121PU_HLTHeterogeneousValid step2 max memory diff 150.3 exceeds +/- 90.0 MiB

@cmsbuild
Contributor

Pull request #50974 was updated. @Martin-Grunewald, @ctarricone, @davidlange6, @fabiocos, @ftenchini, @gabrielmscampos, @mandrenguyen, @mmusich, @rseidita can you please check and sign again.

Comment on lines +23 to +24
layerClusters = cms.VInputTag("hltHgcalLayerClustersEE", *ceh_layerClusters),
time_layerclusters = cms.VInputTag("hltHgcalLayerClustersEE:timeLayerCluster", *ceh_time_layerClusters),
Contributor

I think we should define hltMergeLayerClustersSerialSync directly with the SerialSync collections. Otherwise, at definition time it is identical to hltMergeLayerClusters, and the Phase-2 HLT duplicate-check unit test fails.

Suggested change
- layerClusters = cms.VInputTag("hltHgcalLayerClustersEE", *ceh_layerClusters),
- time_layerclusters = cms.VInputTag("hltHgcalLayerClustersEE:timeLayerCluster", *ceh_time_layerClusters),
+ layerClusters = cms.VInputTag("hltHgCalLayerClustersFromSoAProducerSerialSync", *ceh_layerClusters),
+ time_layerclusters = cms.VInputTag("hltHgCalLayerClustersFromSoAProducerSerialSync:timeLayerCluster", *ceh_time_layerClusters)

Comment on lines +47 to +50
alpakaValidationHLT.toModify(hltMergeLayerClustersSerialSync,
layerClusters = ["hltHgCalLayerClustersFromSoAProducerSerialSync", *ceh_layerClusters],
time_layerclusters = ["hltHgCalLayerClustersFromSoAProducerSerialSync:timeLayerCluster", *ceh_time_layerClusters]
)
Contributor

Given the comment above, this should no longer be needed

Suggested change
- alpakaValidationHLT.toModify(hltMergeLayerClustersSerialSync,
- layerClusters = ["hltHgCalLayerClustersFromSoAProducerSerialSync", *ceh_layerClusters],
- time_layerclusters = ["hltHgCalLayerClustersFromSoAProducerSerialSync:timeLayerCluster", *ceh_time_layerClusters]
- )

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50974/49431

@cmsbuild
Contributor

Pull request #50974 was updated. @Martin-Grunewald, @cmsbuild, @ctarricone, @davidlange6, @fabiocos, @ftenchini, @gabrielmscampos, @mandrenguyen, @mmusich, @rseidita can you please check and sign again.

@mmusich
Contributor

mmusich commented May 21, 2026

@cmsbuild, please test

)

hltHgcalSoARecHitsProducerSerialSync = makeSerialClone(hltHgcalSoARecHitsProducer
)
Contributor

can you avoid going into the next line?

# Process modifiers: ticl_barrel and alpaka
from Configuration.ProcessModifiers.alpaka_cff import alpaka
from Configuration.ProcessModifiers.ticl_barrel_cff import ticl_barrel
from Configuration.ProcessModifiers.alpakaValidationHLT_cff import alpakaValidationHLT
Contributor

what do you need this for here?

Author

Good catch, this is just a leftover from the previous update, where we removed the invocation of alpakaValidationHLT.toModify(). It is not needed in the current version of the code. Will be removed in the next update.

@@ -1,4 +1,5 @@
import FWCore.ParameterSet.Config as cms
from HeterogeneousCore.AlpakaCore.functions import makeSerialClone
Contributor

what do you need this for here?

Author

Along the same lines as the comment above, this is a leftover from a previous version of the code. It will be removed in the next update. Thanks for spotting it.

hltHgcalLayerClustersHSci+
hltHgcalLayerClustersHSi+
hltMergeLayerClustersSerialSync)
alpakaValidationHLT.toReplaceWith(HLTTICLLocalRecoSequence, _HLTTICLLocalRecoSequence_heterogeneousGPUCPU)
Contributor

the modifier should be explicitly imported in this file.

@@ -1,4 +1,5 @@
import FWCore.ParameterSet.Config as cms
from HeterogeneousCore.AlpakaCore.functions import makeSerialClone
Contributor

what is this needed for here?

@cmsbuild
Contributor

+1

Size: This PR adds an extra 56KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-f5699a/53407/summary.html
COMMIT: cd5ccdb
CMSSW: CMSSW_17_0_X_2026-05-20-2300/el8_amd64_gcc13
Additional Tests: GPU,AMD_MI300X,AMD_W7900,NVIDIA_H100,NVIDIA_L40S
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/50974/53407/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 16 differences found in the comparisons
  • DQMHistoTests: Total files compared: 66
  • DQMHistoTests: Total histograms compared: 4596347
  • DQMHistoTests: Total failures: 6
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4596321
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 808.695 KiB( 65 files compared)
  • DQMHistoSizes: changed ( 34434.7503 ): 808.695 KiB HLT/HeterogeneousComparisons
  • Checked 276 log files, 236 edm output root files, 66 DQM output files
  • TriggerResults: found differences in 1 / 64 workflows

AMD_MI300X Comparison Summary

Summary:

  • You potentially removed 3 lines from the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 375 differences found in the comparisons
  • DQMHistoTests: Total files compared: 13
  • DQMHistoTests: Total histograms compared: 218719
  • DQMHistoTests: Total failures: 32349
  • DQMHistoTests: Total nulls: 38
  • DQMHistoTests: Total successes: 186332
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 808.695 KiB( 12 files compared)
  • DQMHistoSizes: changed ( 34634.7503 ): 808.695 KiB HLT/HeterogeneousComparisons
  • Checked 49 log files, 50 edm output root files, 13 DQM output files
  • TriggerResults: found differences in 3 / 12 workflows

AMD_W7900 Comparison Summary

Summary:

  • You potentially added 7 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 378 differences found in the comparisons
  • DQMHistoTests: Total files compared: 13
  • DQMHistoTests: Total histograms compared: 218719
  • DQMHistoTests: Total failures: 31368
  • DQMHistoTests: Total nulls: 33
  • DQMHistoTests: Total successes: 187318
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 808.695 KiB( 12 files compared)
  • DQMHistoSizes: changed ( 34634.7503 ): 808.695 KiB HLT/HeterogeneousComparisons
  • Checked 49 log files, 50 edm output root files, 13 DQM output files
  • TriggerResults: found differences in 3 / 12 workflows

NVIDIA_H100 Comparison Summary

Summary:

  • You potentially removed 7 lines from the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 362 differences found in the comparisons
  • DQMHistoTests: Total files compared: 13
  • DQMHistoTests: Total histograms compared: 218719
  • DQMHistoTests: Total failures: 31661
  • DQMHistoTests: Total nulls: 32
  • DQMHistoTests: Total successes: 187026
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 808.695 KiB( 12 files compared)
  • DQMHistoSizes: changed ( 34634.7503 ): 808.695 KiB HLT/HeterogeneousComparisons
  • Checked 49 log files, 50 edm output root files, 13 DQM output files
  • TriggerResults: found differences in 2 / 12 workflows

NVIDIA_L40S Comparison Summary

Summary:

  • You potentially removed 5 lines from the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 377 differences found in the comparisons
  • DQMHistoTests: Total files compared: 13
  • DQMHistoTests: Total histograms compared: 218719
  • DQMHistoTests: Total failures: 28591
  • DQMHistoTests: Total nulls: 27
  • DQMHistoTests: Total successes: 190101
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 808.695 KiB( 12 files compared)
  • DQMHistoSizes: changed ( 34634.7503 ): 808.695 KiB HLT/HeterogeneousComparisons
  • Checked 49 log files, 50 edm output root files, 13 DQM output files
  • TriggerResults: found differences in 2 / 12 workflows

Max Memory Comparisons exceeding threshold NVIDIA_H100

@cms-sw/core-l2 , I found 1 workflow step(s) with memory usage exceeding the error threshold:

  • Error: Workflow 34634.7503_TTbar_14TeV+Run4D121PU_HLTHeterogeneousValid step2 max memory diff 157.1 exceeds +/- 90.0 MiB

Max Memory Comparisons exceeding threshold NVIDIA_L40S

@cms-sw/core-l2 , I found 1 workflow step(s) with memory usage exceeding the error threshold:

  • Error: Workflow 34634.7503_TTbar_14TeV+Run4D121PU_HLTHeterogeneousValid step2 max memory diff 164.5 exceeds +/- 90.0 MiB

Contributor

mmusich left a comment

few other comments.

private:
const edm::EDGetTokenT<reco::CaloClusterCollection> tokenMonitoredLayerClusters_;
const edm::EDGetTokenT<reco::CaloClusterCollection> tokenReferenceLayerClusters_;
const std::string topFolderName;
Contributor

Suggested change
- const std::string topFolderName;
+ const std::string topFolderName_;

to be consistent.

//2D
hLayerCluster2D_x = iBooker.book2D("hLayerCluster2D_x", "hLayerCluster2D_x", 200, -50, 50, 200, -50, 50);
hLayerCluster2D_y = iBooker.book2D("hLayerCluster2D_y", "hLayerCluster2D_y", 200, -50, 50, 200, -50, 50);
hLayerCluster2D_z = iBooker.book2D("hLayerCluster2D_z", "hLayerCluster2D_z", 250, -500, 500, 250, -500, 500);
Contributor

do we really need this amount of bins in the 2D histograms?
The memory footprint of this PR in terms of DQM memory is on the high-ish side. See #50974 (comment)

DQMHistoSizes: Histogram memory added: 808.695 KiB( 65 files compared)
DQMHistoSizes: changed ( 34434.7503 ): 808.695 KiB HLT/HeterogeneousComparisons

Author

Will halve the number of bins on both axes in the next update.
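
To make the effect of halving the binning concrete, here is a small sketch of the arithmetic. It assumes the 2D histograms are stored as ROOT TH2F objects, so that each cell is a 4-byte float and each axis carries one underflow and one overflow bin; `th2fBinBytes` is a hypothetical helper written for this example and it counts only the bin contents, not any per-object overhead.

```cpp
#include <cstddef>

// Bytes used by the bin contents of a TH2F-like histogram:
// (nBinsX + 2) * (nBinsY + 2) cells (under/overflow included), one float per cell.
constexpr std::size_t th2fBinBytes(std::size_t nBinsX, std::size_t nBinsY) {
  return (nBinsX + 2) * (nBinsY + 2) * sizeof(float);
}
```

Under these assumptions, `th2fBinBytes(200, 200)` is 163216 bytes (about 159 KiB), while `th2fBinBytes(100, 100)` is 41616 bytes (about 41 KiB), so halving the bins on both axes reduces the bin storage of each such histogram by roughly a factor of four.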

auto seed = (*referenceLayerClusters)[idx].seed();
if (seedToIdx.find(seed) != seedToIdx.end()) {
edm::LogWarning("HGCALGPUvsCPUComparisonHists") << "Duplicate seed in reference collection.";
return;
Contributor

do you really want to return here, or just continue in the loop?

Comment on lines +98 to +106
edm::Handle<reco::CaloClusterCollection> monitoredLayerClusters_, referenceLayerClusters_;
iEvent.getByToken(tokenMonitoredLayerClusters_, monitoredLayerClusters_);
iEvent.getByToken(tokenReferenceLayerClusters_, referenceLayerClusters_);
if (!(monitoredLayerClusters_.isValid()) || !(referenceLayerClusters_.isValid())) {
edm::LogWarning("HGCALGPUvsCPUComparisonHists") << "Monitored or reference collection is invalid.";
return;
}
const std::vector<reco::CaloCluster>* monitoredLayerClusters = monitoredLayerClusters_.product();
const std::vector<reco::CaloCluster>* referenceLayerClusters = referenceLayerClusters_.product();
Contributor

mmusich, May 21, 2026

Suggested change
- edm::Handle<reco::CaloClusterCollection> monitoredLayerClusters_, referenceLayerClusters_;
- iEvent.getByToken(tokenMonitoredLayerClusters_, monitoredLayerClusters_);
- iEvent.getByToken(tokenReferenceLayerClusters_, referenceLayerClusters_);
- if (!(monitoredLayerClusters_.isValid()) || !(referenceLayerClusters_.isValid())) {
- edm::LogWarning("HGCALGPUvsCPUComparisonHists") << "Monitored or reference collection is invalid.";
- return;
- }
- const std::vector<reco::CaloCluster>* monitoredLayerClusters = monitoredLayerClusters_.product();
- const std::vector<reco::CaloCluster>* referenceLayerClusters = referenceLayerClusters_.product();
+ const auto& monitoredHandle = iEvent.getHandle(tokenMonitoredLayerClusters_);
+ const auto& referenceHandle = iEvent.getHandle(tokenReferenceLayerClusters_);
+ if (!monitoredHandle.isValid() || !referenceHandle.isValid()) {
+ edm::LogWarning("HGCALGPUvsCPUComparisonHists")
+ << "Monitored or reference LayerCluster collection is invalid.";
+ return;
+ }
+ const reco::CaloClusterCollection& monitoredLayerClusters = *monitoredHandle;
+ const reco::CaloClusterCollection& referenceLayerClusters = *referenceHandle;

  1. Use edm::Handle with auto, and getHandle() instead of getByToken()
  2. Validity check -> same logic, cleaner syntax
  3. Prefer a const reference over a raw pointer
  4. do not use trailing underscores in locals.


//look for GPU and CPU LayerClusters whose seeds match
//map LC seeds to LC indices for the reference collection
std::unordered_map<uint32_t, std::pair<unsigned, bool>>
Contributor

I think the corresponding header file is missing for this.

auto it = seedToIdx.find(monitored.seed());
if (it != seedToIdx.end() && it->second.second == false) {
it->second.second = true; //establish a match
const auto& reference = (*referenceLayerClusters)[it->second.first];
Contributor

I think the whole code block L110 to L127 can be rewritten slightly more efficiently:

std::unordered_map<uint32_t, unsigned> seedToIdx;
seedToIdx.reserve(referenceLayerClusters->size());

for (unsigned idx = 0; idx < referenceLayerClusters->size(); idx++) {
    auto [it, inserted] = seedToIdx.try_emplace((*referenceLayerClusters)[idx].seed(), idx);
    if (!inserted) {
        edm::LogWarning("HGCALGPUvsCPUComparisonHists") << "Duplicate seed in reference collection.";
        return;  // continue?
    }
}

std::unordered_set<uint32_t> matched;
matched.reserve(referenceLayerClusters->size());

for (const auto& monitored : *monitoredLayerClusters) {
    auto it = seedToIdx.find(monitored.seed());
    // insert() returns {iterator, bool}; the bool is false if this seed was already matched
    if (it != seedToIdx.end() && matched.insert(it->first).second) {
        const auto& reference = (*referenceLayerClusters)[it->second];
        // fill histograms...
    }
}

Author

I really like the idea of using try_emplace. The current version of the code performs two hash-table operations (find(seed) and operator[](seed)) while try_emplace merges them to just one.
I propose to change

for (unsigned idx = 0; idx < referenceLayerClusters->size(); idx++) {
  auto seed = (*referenceLayerClusters)[idx].seed();
  if (seedToIdx.find(seed) != seedToIdx.end()) {
    edm::LogWarning("HGCALGPUvsCPUComparisonHists") << "Duplicate seed in reference collection.";
    return;
  }
  seedToIdx[seed] = {idx, false};  // initialize all reference LCs as unmatched
}

to

for (unsigned idx = 0; idx < referenceLayerClusters.size(); idx++) {
  // initialize all reference LCs as unmatched
  auto [it, inserted] = seedToIdx.try_emplace(referenceLayerClusters[idx].seed(), idx, false);
  if (!inserted) {
    edm::LogWarning("HGCALGPUvsCPUComparisonHists") << "Duplicate seed in reference collection.";
    continue;
  }
}

At the same time, this proposal still uses one container (a std::unordered_map<uint32_t, std::pair<unsigned, bool>>) instead of two. Let me know if you would find this acceptable.
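
For reference, the single-container approach discussed here can be sketched as standalone code outside CMSSW. This is only an illustration under stated assumptions: plain `uint32_t` seed vectors stand in for the cluster collections, and `MatchResult` and `matchBySeed` are hypothetical names introduced for this example. It skips duplicates and unmatched entries with counters instead of returning early, which is one possible answer to the "return or continue" question raised in the review.

```cpp
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical result type: (reference index, monitored index) pairs plus counters.
struct MatchResult {
  std::vector<std::pair<unsigned, unsigned>> pairs;
  unsigned duplicateSeeds = 0;
  unsigned unmatched = 0;
};

MatchResult matchBySeed(const std::vector<uint32_t>& referenceSeeds,
                        const std::vector<uint32_t>& monitoredSeeds) {
  // seed -> (index in reference collection, already matched?)
  std::unordered_map<uint32_t, std::pair<unsigned, bool>> seedToIdx;
  seedToIdx.reserve(referenceSeeds.size());

  MatchResult result;
  for (unsigned idx = 0; idx < referenceSeeds.size(); ++idx) {
    // try_emplace performs a single lookup and inserts only if the seed is new
    auto [it, inserted] = seedToIdx.try_emplace(referenceSeeds[idx], idx, false);
    if (!inserted) {
      ++result.duplicateSeeds;  // keep the first occurrence, skip the duplicate
    }
  }

  for (unsigned m = 0; m < monitoredSeeds.size(); ++m) {
    auto it = seedToIdx.find(monitoredSeeds[m]);
    if (it != seedToIdx.end() && !it->second.second) {
      it->second.second = true;  // establish a match
      result.pairs.emplace_back(it->second.first, m);
    } else {
      ++result.unmatched;  // no reference seed, or the reference was already matched
    }
  }
  return result;
}
```

For example, with reference seeds {10, 20, 20, 30} and monitored seeds {30, 10, 40, 10}, the sketch reports one duplicate seed, two matched pairs, and two unmatched monitored entries (seed 40 and the second occurrence of seed 10).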

hLayerCluster2D_nRecHits->Fill(reference.size(), monitored.size());
} else {
edm::LogWarning("HGCALGPUvsCPUComparisonHists") << "No match or duplicate match to reference collection found.";
return;
Contributor

again do you really want to return here?

hltHgcalLayerClustersEE+
hltHgcalLayerClustersHSci+
hltHgcalLayerClustersHSi+
hltMergeLayerClustersSerialSync)
Contributor

Overall I am a bit confused by how this sequence is written. Why are the producer instances hltHGCalUncalibRecHit, hltHGCalRecHit and the block hltHgcalLayerClustersEE+hltHgcalLayerClustersHSci+hltHgcalLayerClustersHSi repeated twice? Even if the framework elides the duplication, it is confusing to see them twice in the same sequence.
