Add reduced memory runtime toggle for LST by GNiendorf · Pull Request #50925 · cms-sw/cmssw

GNiendorf · 2026-05-12T15:43:38Z

This PR adds a reduceMemByFullPrecompute runtime flag that enables exact buffer sizing for all LST objects (LS, T3, T5, T4) in each counting kernel, reducing average memory usage from ~80 MB to ~33 MB per event. When the flag is off (default), behavior is identical to master with negligible timing overhead, because the new kernel launches are gated behind host-side if (reduceMemByFullPrecompute_) checks and use templated kernel variants. The flag is exposed as --reduce_mem_by_full_precompute in standalone and as a reduceMemByFullPrecompute config parameter in the CMSSW EDProducer. Enabling the flag increases LST time per event by roughly 10-20% on CPU and GPU (depending on stream count; the increase is smaller when running multiple streams) in exchange for a 60-70% reduction in total memory. The table below shows the average and maximum decreases in memory per event over 100 events.
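For readers unfamiliar with the pattern, the idea of a host-side gate combined with templated kernel variants can be sketched in plain C++ as follows. This is only an illustration under stated assumptions, not the code in this PR: `pairKernel`, `buildPairs`, the `Pair` type, the integer "hits", and the worst-case bound are all hypothetical, and no alpaka types are used. Only the flag name is taken from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <utility>
#include <vector>

using Pair = std::pair<int, int>;

// Hypothetical stand-in for an object-building kernel (not the actual LST code).
// kCountOnly = true:  count the candidates that pass the selection, write nothing.
// kCountOnly = false: also append each passing candidate to `out`.
template <bool kCountOnly>
std::size_t pairKernel(const std::vector<int>& hits, int maxDelta, std::vector<Pair>* out) {
  std::size_t nPassed = 0;
  for (std::size_t i = 0; i < hits.size(); ++i) {
    for (std::size_t j = i + 1; j < hits.size(); ++j) {
      if (std::abs(hits[j] - hits[i]) > maxDelta)
        continue;
      if constexpr (!kCountOnly) {
        out->emplace_back(hits[i], hits[j]);
      }
      ++nPassed;
    }
  }
  return nPassed;
}

// Fills `out` and returns the capacity that was requested for it.
std::size_t buildPairs(const std::vector<int>& hits,
                       int maxDelta,
                       bool reduceMemByFullPrecompute,
                       std::vector<Pair>& out) {
  const std::size_t n = hits.size();
  // Default path: size the buffer from an assumed worst-case bound (every pair passes).
  std::size_t capacity = n * (n - 1) / 2;
  if (reduceMemByFullPrecompute) {
    // Host-side gate: the extra counting pass runs only when the flag is set,
    // so the default path pays nothing for it at run time.
    capacity = pairKernel<true>(hits, maxDelta, nullptr);
  }
  out.clear();
  out.reserve(capacity);
  pairKernel<false>(hits, maxDelta, &out);
  assert(out.size() <= capacity);
  return capacity;
}
```

In this sketch both settings produce the same output; only the requested capacity differs. The price of the exact size is that the selection runs twice when the flag is on, which is the same kind of time-for-memory trade the description reports.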

cc @slava77

cmsbuild · 2026-05-12T15:44:09Z

cms-bot internal usage

cmsbuild · 2026-05-12T15:46:28Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50925/49310

There are other open Pull requests which might conflict with changes you have proposed:
- File RecoTracker/LST/plugins/alpaka/LSTProducer.cc modified in PR(s): LST: add LSTGeometry package and associated ESProducer #50679
- File RecoTracker/LSTCore/src/alpaka/Quintuplet.h modified in PR(s): LST: add LSTGeometry package and associated ESProducer #50679
- File RecoTracker/LSTCore/standalone/bin/lst.cc modified in PR(s): LST: add LSTGeometry package and associated ESProducer #50679

cmsbuild · 2026-05-12T15:46:57Z

A new Pull Request was created by @GNiendorf for master.

It involves the following packages:

RecoTracker/LST (reconstruction)
RecoTracker/LSTCore (reconstruction)

@Moanwar, @cmsbuild, @jfernan2, @mandrenguyen, @srimanob can you please review it and eventually sign? Thanks.
@GiacomoSguazzoni, @VinInn, @VourMa, @dgulhan, @elusian, @felicepantaleo, @gpetruc, @mmasciov, @mmusich, @mtosi, @rovere this is something you requested to watch as well.
@ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

mmusich · 2026-05-12T16:14:24Z

test parameters:

enable = hlt_p2_timing

mmusich · 2026-05-12T16:14:39Z

@cmsbuild, please test with cms-sw/cms-bot#2740

cmsbuild · 2026-05-12T18:50:43Z

+1

Size: This PR adds an extra 80KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-052a34/53207/summary.html
COMMIT: 99fb77a
CMSSW: CMSSW_17_0_X_2026-05-12-1100/el8_amd64_gcc13
Additional Tests: HLT_P2_TIMING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/50925/53207/install.sh to create a dev area with all the needed externals and cmssw changes.

HLT P2 Timing: chart

Comparison Summary

Summary:

You potentially added 1 lines to the logs
Reco comparison results: 4 differences found in the comparisons
DQMHistoTests: Total files compared: 55
DQMHistoTests: Total histograms compared: 4420967
DQMHistoTests: Total failures: 0
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 4420947
DQMHistoTests: Total skipped: 20
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 54 files compared)
Checked 235 log files, 207 edm output root files, 55 DQM output files
TriggerResults: no differences found

jfernan2 · 2026-05-13T08:19:15Z

assign heterogeneous

cmsbuild · 2026-05-13T08:19:39Z

New categories assigned: heterogeneous

@fwyzard,@makortel you have been requested to review this Pull request/Issue and eventually sign? Thanks

makortel · 2026-05-13T13:49:53Z

> assign heterogeneous

Do you have specific question(s) in mind?

jfernan2 · 2026-05-13T16:11:49Z

Sorry to bother, I was not fully sure about the use of the kernel in RecoTracker/LSTCore/src/alpaka/LSTEvent.dev.cc within Alpaka.
Please forgive my ignorance about this type of structures. The rest of the PR looks ok to me. Thank you

makortel · 2026-05-14T20:22:35Z

test parameters:

enable = hlt_p2_timing,gpu

makortel · 2026-05-14T20:22:41Z

@cmsbuild, please test

makortel · 2026-05-14T20:23:34Z

I'm not seeing anything obviously concerning (beyond the presumable code size and compilation time increase from the two instantiations of the kernel class templates).

cmsbuild · 2026-05-17T10:48:03Z

-1

Failed Tests: RelVals-AMD_MI300X
Size: This PR adds an extra 16KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-052a34/53259/summary.html
COMMIT: 99fb77a
CMSSW: CMSSW_17_0_X_2026-05-14-1700/el8_amd64_gcc13
Additional Tests: HLT_P2_TIMING,GPU,AMD_MI300X,AMD_W7900,NVIDIA_H100,NVIDIA_L40S
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/50925/53259/install.sh to create a dev area with all the needed externals and cmssw changes.

HLT P2 Timing: chart

Failed RelVals-AMD_MI300X

34634.403: 34634.403_TTbar_14TeV+Run4D121PU_Patatrack_PixelOnlyAlpaka_Validation/step2_TTbar_14TeV+Run4D121PU_Patatrack_PixelOnlyAlpaka_Validation.log

Comparison Summary

Summary:

You potentially removed 9 lines from the logs
ROOTFileChecks: Some differences in event products or their sizes found
Reco comparison results: 9 differences found in the comparisons
DQMHistoTests: Total files compared: 55
DQMHistoTests: Total histograms compared: 4420967
DQMHistoTests: Total failures: 16
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 4420931
DQMHistoTests: Total skipped: 20
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 54 files compared)
Checked 235 log files, 207 edm output root files, 55 DQM output files
TriggerResults: no differences found

AMD_W7900 Comparison Summary

Summary:

You potentially removed 11 lines from the logs
Reco comparison results: 371 differences found in the comparisons
DQMHistoTests: Total files compared: 13
DQMHistoTests: Total histograms compared: 216259
DQMHistoTests: Total failures: 29426
DQMHistoTests: Total nulls: 36
DQMHistoTests: Total successes: 186797
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 12 files compared)
Checked 49 log files, 50 edm output root files, 13 DQM output files
TriggerResults: found differences in 2 / 12 workflows

NVIDIA_H100 Comparison Summary

Summary:

You potentially removed 15 lines from the logs
Reco comparison results: 328 differences found in the comparisons
DQMHistoTests: Total files compared: 13
DQMHistoTests: Total histograms compared: 216259
DQMHistoTests: Total failures: 35220
DQMHistoTests: Total nulls: 37
DQMHistoTests: Total successes: 181002
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 12 files compared)
Checked 49 log files, 50 edm output root files, 13 DQM output files
TriggerResults: no differences found

NVIDIA_L40S Comparison Summary

Summary:

You potentially added 1 lines to the logs
Reco comparison results: 343 differences found in the comparisons
DQMHistoTests: Total files compared: 13
DQMHistoTests: Total histograms compared: 216259
DQMHistoTests: Total failures: 25384
DQMHistoTests: Total nulls: 32
DQMHistoTests: Total successes: 190843
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 12 files compared)
Checked 49 log files, 50 edm output root files, 13 DQM output files
TriggerResults: found differences in 1 / 12 workflows

jfernan2 · 2026-05-18T10:12:02Z

+1

fwyzard · 2026-05-18T10:56:33Z

+heterogeneous

Code changes look OK.

MI300X failure is a recurring problem, and seems unrelated to these changes.

fwyzard · 2026-05-18T10:56:46Z

ignore tests-rejected with ib-failure

cmsbuild · 2026-05-18T10:57:01Z

This pull request is fully signed and it will be integrated in one of the next master IBs (test failures were overridden). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @ftenchini, @mandrenguyen (and backports should be raised in the release meeting by the corresponding L2)

Add option to reduce LST memory through precompute

99fb77a

cmsbuild added this to the CMSSW_17_0_X milestone May 12, 2026

cmsbuild added reconstruction-pending pending-signatures tests-pending orp-pending code-checks-pending tracking labels May 12, 2026

cmsbuild added code-checks-approved and removed code-checks-pending labels May 12, 2026

cmsbuild added tests-started requires-external and removed tests-pending labels May 12, 2026

cmsbuild added tests-approved and removed tests-started labels May 12, 2026

cmsbuild added the heterogeneous-pending label May 13, 2026

cmsbuild removed requires-external tests-approved labels May 14, 2026

cmsbuild added the tests-started label May 14, 2026

cmsbuild added tests-rejected and removed tests-started labels May 17, 2026

cmsbuild added reconstruction-approved and removed reconstruction-pending labels May 18, 2026

cmsbuild added fully-signed tests-approved heterogeneous-approved tests-ib-failure and removed pending-signatures tests-rejected heterogeneous-pending labels May 18, 2026
