Implement `stream::FixedQueueEDProducer` by fwyzard · Pull Request #50627 · cms-sw/cmssw

fwyzard · 2026-04-01T21:05:24Z

PR description:

This PR builds on top of #50675.

Implement a new kind of alpaka stream::EDProducer with a fixed association of device queues (e.g. CUDA streams) to framework streams.
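To make "fixed association" concrete, here is a minimal sketch of the idea, not the code in this PR. `FakeQueue` and `FixedQueueTable` are hypothetical names introduced only for illustration, and the sketch is single-threaded; the real module works with alpaka queues and the framework's own stream bookkeeping.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a device queue (e.g. a CUDA stream); not an alpaka type.
struct FakeQueue {
  explicit FakeQueue(std::size_t index) : id(index) {}
  std::size_t id;
};

// Hypothetical table that pins one queue to each framework stream.
class FixedQueueTable {
public:
  explicit FixedQueueTable(std::size_t nStreams) : queues_(nStreams) {}

  // Return the queue bound to `streamId`, creating it on first use.
  // Every later call with the same stream ID returns the same queue.
  std::shared_ptr<FakeQueue> get(std::size_t streamId) {
    auto& slot = queues_.at(streamId);  // throws std::out_of_range for an invalid stream ID
    if (!slot) {
      slot = std::make_shared<FakeQueue>(created_++);
    }
    return slot;
  }

  // Number of distinct queues created so far; never exceeds the number of streams.
  std::size_t created() const { return created_; }

private:
  std::vector<std::shared_ptr<FakeQueue>> queues_;
  std::size_t created_ = 0;
};
```

With such a table, the number of queues an external library can ever observe is bounded by the number of framework streams, regardless of how many events are processed.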

This is useful for using external software that associates resources to the device queues, for example the PyTorch device memory caching allocator.

Migrating the PyTorch alpaka modules from stream::EDProducer to stream::FixedQueueEDProducer ensures that PyTorch sees only a limited number of device queues, reducing the overall device memory utilisation.
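The effect on memory can be illustrated with a toy model. This is a sketch under a simplifying assumption, namely that a caching allocator keeps a separate pool of freed blocks per queue and reuses a block only on the queue that released it; `ToyCachingAllocator` and `reservedAfter` are hypothetical names and this is not the PyTorch implementation.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Toy model of a per-queue caching allocator: freed blocks stay reserved and
// are kept in a pool keyed by queue, to be reused only by that same queue.
class ToyCachingAllocator {
public:
  // "Allocate" `bytes` on `queue`: reuse a cached block of the same size from
  // that queue's pool if one exists, otherwise reserve new device memory.
  void allocate(int queue, std::size_t bytes) {
    auto& pool = freeBlocks_[queue];
    for (auto it = pool.begin(); it != pool.end(); ++it) {
      if (*it == bytes) {
        pool.erase(it);
        return;
      }
    }
    reserved_ += bytes;
  }

  // "Free" a block: it goes back to the queue's pool, the memory stays reserved.
  void release(int queue, std::size_t bytes) { freeBlocks_[queue].push_back(bytes); }

  // Total memory reserved from the device so far.
  std::size_t reserved() const { return reserved_; }

private:
  std::map<int, std::vector<std::size_t>> freeBlocks_;
  std::size_t reserved_ = 0;
};

// Process `nEvents` one after the other; each event allocates and frees one
// buffer of `bytes` on queue (event % nQueues).
std::size_t reservedAfter(int nEvents, int nQueues, std::size_t bytes) {
  ToyCachingAllocator alloc;
  for (int event = 0; event < nEvents; ++event) {
    int queue = event % nQueues;
    alloc.allocate(queue, bytes);
    alloc.release(queue, bytes);
  }
  return alloc.reserved();
}
```

In this model, `reservedAfter(100, 1, 1024)` reserves a single 1024-byte block, while `reservedAfter(100, 8, 1024)` reserves eight of them, even though only one buffer is in use at any time. The reserved memory grows with the number of distinct queues the allocator sees, which is the quantity a fixed queue-to-stream association keeps small.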

For more background information see the presentation "ML inference on GPUs in CMSSW with PyTorch" by @EmanueleCoradin at the "CMS developments with GPUs" meeting on March 30th, 2026.

PR validation:

All unit tests pass.

cmsbuild · 2026-04-01T21:05:50Z

cms-bot internal usage

fwyzard · 2026-04-01T21:08:09Z

enable gpu

fwyzard · 2026-04-01T21:08:12Z

please test

cmsbuild · 2026-04-01T21:08:16Z

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50627/48825

ERROR: Build errors found during clang-tidy run.

Suppressed 1322 warnings (1318 in non-user code, 4 NOLINT).
--
src/HeterogeneousCore/AlpakaCore/interface/alpaka/stream/FixedQueueEDProducer.h:32:11: error: 'maybe_unused' attribute cannot be applied to a statement [clang-diagnostic-error]
   32 |         [[maybe_unused]] ev.queue();
      |           ^              ~~
Suppressed 2968 warnings (2964 in non-user code, 4 NOLINT).
--
src/HeterogeneousCore/AlpakaCore/interface/alpaka/stream/FixedQueueEDProducer.h:32:11: error: 'maybe_unused' attribute cannot be applied to a statement [clang-diagnostic-error]
   32 |         [[maybe_unused]] ev.queue();
      |           ^              ~~
Suppressed 2966 warnings (2962 in non-user code, 4 NOLINT).
--
src/HeterogeneousCore/AlpakaCore/interface/alpaka/stream/FixedQueueEDProducer.h:32:11: error: 'maybe_unused' attribute cannot be applied to a statement [clang-diagnostic-error]
   32 |         [[maybe_unused]] ev.queue();
      |           ^              ~~
Suppressed 2974 warnings (2970 in non-user code, 4 NOLINT).
--
src/HeterogeneousCore/AlpakaCore/interface/alpaka/stream/FixedQueueEDProducer.h:32:11: error: 'maybe_unused' attribute cannot be applied to a statement [clang-diagnostic-error]
   32 |         [[maybe_unused]] ev.queue();
      |           ^              ~~
Suppressed 2966 warnings (2962 in non-user code, 4 NOLINT).
--
gmake: *** [config/SCRAM/GMake/Makefile.coderules:129: code-checks] Error 2
gmake: *** [There are compilation/build errors. Please see the detail log above.] Error 2
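For context on the diagnostic above: in C++, `[[maybe_unused]]` may appear on declarations (variables, parameters, functions, types), but not on an expression statement such as `ev.queue();`. The sketch below shows two standard ways to call a function and discard its result without that error. `FakeQueue`, `FakeEvent` and the `touch...` helpers are hypothetical stand-ins, and nothing here is meant to describe how the PR was actually fixed.

```cpp
// Hypothetical stand-ins; not the CMSSW or alpaka types.
struct FakeQueue {
  int id = 0;
};

struct FakeEvent {
  // Counts the calls so that the side effect of queue() is observable.
  FakeQueue& queue() {
    ++calls;
    return q;
  }
  FakeQueue q;
  int calls = 0;
};

// Option 1: cast the result to void to discard it explicitly.
void touchWithVoidCast(FakeEvent& ev) {
  (void)ev.queue();
}

// Option 2: bind the result to a variable; [[maybe_unused]] is valid on a
// variable declaration and silences the unused-variable warning.
void touchWithDeclaration(FakeEvent& ev) {
  [[maybe_unused]] auto& queue = ev.queue();
}
```

Both helpers still invoke `queue()`, so any side effect of the call is preserved; only the returned value is ignored.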

fwyzard · 2026-04-01T21:15:51Z

please test

cmsbuild · 2026-04-01T21:17:56Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50627/48826

There are other open Pull requests which might conflict with changes you have proposed:
- File HeterogeneousCore/AlpakaCore/README.md modified in PR(s): Various fixes for heterogeneous utilities #47605
- File HeterogeneousCore/AlpakaCore/interface/alpaka/EDMetadataSentry.h modified in PR(s): Add an MPISenderPortable and MPIReceiverPortable modules to send/receive arbitrary device collections #50503
- File HeterogeneousCore/AlpakaCore/src/alpaka/EDMetadataSentry.cc modified in PR(s): [hack] Force one GPU queue per framework stream #49547

cmsbuild · 2026-04-01T21:18:18Z

A new Pull Request was created by @fwyzard for master.

It involves the following packages:

HeterogeneousCore/AlpakaCore (heterogeneous)
HeterogeneousCore/AlpakaTest (heterogeneous)
PhysicsTools/PyTorchAlpakaTest (heterogeneous, ml)

@fwyzard, @hjkwon260, @makortel, @valsdav, @y19y19 can you please review it and eventually sign? Thanks.
@makortel, @rovere this is something you requested to watch as well.
@ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

cmsbuild · 2026-04-02T01:42:38Z

+1

Size: This PR adds an extra 44KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-35ac0e/52411/summary.html
COMMIT: 8b5045e
CMSSW: CMSSW_16_1_X_2026-04-01-1100/el8_amd64_gcc13
Additional Tests: GPU,AMD_MI300X,AMD_W7900,NVIDIA_H100,NVIDIA_L40S
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/50627/52411/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-35ac0e/52411/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-35ac0e/52411/git-merge-result

Comparison Summary

The workflows 2025.0010001, 2025.0000002, 2024.0070001, 2024.0060001, 2024.0050001, 2024.0040001, 2024.0030001, 2024.0020001, 2024.0010001, 2024.0000001, 2023.0020001 have different files in step1_dasquery.log than the ones found in the baseline. You may want to check and retrigger the tests if necessary. You can check it in the "files" directory in the results of the comparisons

Summary:

You potentially removed 299 lines from the logs
ROOTFileChecks: Some differences in event products or their sizes found
Reco comparison results: 41482 differences found in the comparisons
DQMHistoTests: Total files compared: 52
DQMHistoTests: Total histograms compared: 3449714
DQMHistoTests: Total failures: 162
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 3449532
DQMHistoTests: Total skipped: 20
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 1.876 KiB( 40 files compared)
DQMHistoSizes: changed ( 18434.0,... ): 0.938 KiB HLT/ScoutingOffline
Checked 223 log files, 193 edm output root files, 52 DQM output files
TriggerResults: no differences found

AMD_MI300X Comparison Summary

Summary:

You potentially removed 26 lines from the logs
Reco comparison results: 351 differences found in the comparisons
DQMHistoTests: Total files compared: 13
DQMHistoTests: Total histograms compared: 216539
DQMHistoTests: Total failures: 39130
DQMHistoTests: Total nulls: 29
DQMHistoTests: Total successes: 177380
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 12 files compared)
Checked 49 log files, 50 edm output root files, 13 DQM output files
TriggerResults: found differences in 3 / 12 workflows

AMD_W7900 Comparison Summary

Summary:

You potentially added 29 lines to the logs
Reco comparison results: 367 differences found in the comparisons
DQMHistoTests: Total files compared: 13
DQMHistoTests: Total histograms compared: 216539
DQMHistoTests: Total failures: 30336
DQMHistoTests: Total nulls: 39
DQMHistoTests: Total successes: 186164
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 12 files compared)
Checked 49 log files, 50 edm output root files, 13 DQM output files
TriggerResults: found differences in 1 / 12 workflows

NVIDIA_H100 Comparison Summary

Summary:

You potentially removed 17 lines from the logs
Reco comparison results: 367 differences found in the comparisons
DQMHistoTests: Total files compared: 13
DQMHistoTests: Total histograms compared: 216539
DQMHistoTests: Total failures: 30937
DQMHistoTests: Total nulls: 35
DQMHistoTests: Total successes: 185567
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 12 files compared)
Checked 49 log files, 50 edm output root files, 13 DQM output files
TriggerResults: no differences found

NVIDIA_L40S Comparison Summary

Summary:

You potentially added 20 lines to the logs
Reco comparison results: 366 differences found in the comparisons
DQMHistoTests: Total files compared: 13
DQMHistoTests: Total histograms compared: 216539
DQMHistoTests: Total failures: 29805
DQMHistoTests: Total nulls: 28
DQMHistoTests: Total successes: 186706
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 12 files compared)
Checked 49 log files, 50 edm output root files, 13 DQM output files
TriggerResults: no differences found

Max Memory Comparisons exceeding threshold

@cms-sw/core-l2 , I found 6 workflow step(s) with memory usage exceeding the error threshold:

Error: Workflow 2023.0020001_RunJetMET02023D_10k step3 max memory diff 329.9 exceeds +/- 90.0 MiB
Error: Workflow 2024.0000001_RunZeroBias2024B_10k step3 max memory diff -96.5 exceeds +/- 90.0 MiB
Error: Workflow 2024.0010001_RunJetMET02024C_10k step3 max memory diff 111.9 exceeds +/- 90.0 MiB
Error: Workflow 2024.0030001_RunDisplacedJet2024E_10k step3 max memory diff 184.7 exceeds +/- 90.0 MiB
Error: Workflow 2024.0050001_RunBTagMu2024G_10k step3 max memory diff 110.3 exceeds +/- 90.0 MiB
Error: Workflow 2025.0010001_RunJetMET02025C_10k step3 max memory diff 179.0 exceeds +/- 90.0 MiB

cmsbuild · 2026-04-06T15:55:38Z

Pull request #50627 was updated. @fwyzard, @hjkwon260, @makortel, @valsdav, @y19y19 can you please check and sign again.

cmsbuild · 2026-04-06T21:02:48Z

+1

Size: This PR adds an extra 20KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-35ac0e/52492/summary.html
COMMIT: fdc8269
CMSSW: CMSSW_17_0_X_2026-04-06-1100/el8_amd64_gcc13
Additional Tests: GPU,AMD_MI300X,AMD_W7900,NVIDIA_H100,NVIDIA_L40S
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/50627/52492/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

You potentially added 5 lines to the logs
Reco comparison results: 3 differences found in the comparisons
DQMHistoTests: Total files compared: 53
DQMHistoTests: Total histograms compared: 4180749
DQMHistoTests: Total failures: 47
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 4180682
DQMHistoTests: Total skipped: 20
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 52 files compared)
Checked 227 log files, 197 edm output root files, 53 DQM output files
TriggerResults: no differences found

AMD_MI300X Comparison Summary

Summary:

You potentially removed 9 lines from the logs
Reco comparison results: 368 differences found in the comparisons
DQMHistoTests: Total files compared: 13
DQMHistoTests: Total histograms compared: 216539
DQMHistoTests: Total failures: 33650
DQMHistoTests: Total nulls: 37
DQMHistoTests: Total successes: 182852
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 12 files compared)
Checked 49 log files, 50 edm output root files, 13 DQM output files
TriggerResults: found differences in 3 / 12 workflows

AMD_W7900 Comparison Summary

Summary:

You potentially added 8 lines to the logs
Reco comparison results: 385 differences found in the comparisons
DQMHistoTests: Total files compared: 13
DQMHistoTests: Total histograms compared: 216539
DQMHistoTests: Total failures: 31880
DQMHistoTests: Total nulls: 34
DQMHistoTests: Total successes: 184625
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 12 files compared)
Checked 49 log files, 50 edm output root files, 13 DQM output files
TriggerResults: found differences in 1 / 12 workflows

NVIDIA_H100 Comparison Summary

Summary:

You potentially removed 13 lines from the logs
Reco comparison results: 352 differences found in the comparisons
DQMHistoTests: Total files compared: 13
DQMHistoTests: Total histograms compared: 216539
DQMHistoTests: Total failures: 31174
DQMHistoTests: Total nulls: 29
DQMHistoTests: Total successes: 185336
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 12 files compared)
Checked 49 log files, 50 edm output root files, 13 DQM output files
TriggerResults: found differences in 2 / 12 workflows

NVIDIA_L40S Comparison Summary

Summary:

You potentially added 30 lines to the logs
Reco comparison results: 378 differences found in the comparisons
DQMHistoTests: Total files compared: 13
DQMHistoTests: Total histograms compared: 216539
DQMHistoTests: Total failures: 30765
DQMHistoTests: Total nulls: 33
DQMHistoTests: Total successes: 185741
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 12 files compared)
Checked 49 log files, 50 edm output root files, 13 DQM output files
TriggerResults: no differences found

fwyzard · 2026-04-07T10:28:52Z

@makortel I've split the first part of this PR, rewritten to allocate device queues only as needed, into #50675.

If those changes look good I will rebase and update this PR on top of them.

stream::FixedQueueEDProducer is a stream EDProducer with a fixed association of device queues to framework streams.

This ensures that PyTorch sees only a limited number of device streams, reducing the overall device memory utilisation.

cmsbuild · 2026-04-07T15:56:13Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50627/48906

There are other open Pull requests which might conflict with changes you have proposed:
- File DataFormats/AlpakaCommon/interface/alpaka/EDMetadata.h modified in PR(s): Add an MPISenderPortable and MPIReceiverPortable modules to send/receive arbitrary device collections #50503, Rewrite the interface of EDMetadata #50675
- File DataFormats/AlpakaCommon/src/alpaka/EDMetadata.cc modified in PR(s): Add an MPISenderPortable and MPIReceiverPortable modules to send/receive arbitrary device collections #50503, Rewrite the interface of EDMetadata #50675
- File HeterogeneousCore/AlpakaCore/README.md modified in PR(s): Various fixes for heterogeneous utilities #47605, Rewrite the interface of EDMetadata #50675
- File HeterogeneousCore/AlpakaCore/interface/CopyToDeviceCache.h modified in PR(s): Rewrite the interface of EDMetadata #50675
- File HeterogeneousCore/AlpakaCore/interface/alpaka/EDMetadataSentry.h modified in PR(s): Add an MPISenderPortable and MPIReceiverPortable modules to send/receive arbitrary device collections #50503, Rewrite the interface of EDMetadata #50675
- File HeterogeneousCore/AlpakaCore/interface/alpaka/Event.h modified in PR(s): Add an MPISenderPortable and MPIReceiverPortable modules to send/receive arbitrary device collections #50503, Rewrite the interface of EDMetadata #50675
- File HeterogeneousCore/AlpakaCore/interface/alpaka/ProducerBase.h modified in PR(s): [hack] Force one GPU queue per framework stream #49547, Add an MPISenderPortable and MPIReceiverPortable modules to send/receive arbitrary device collections #50503, Rewrite the interface of EDMetadata #50675
- File HeterogeneousCore/AlpakaCore/interface/alpaka/Record.h modified in PR(s): Rewrite the interface of EDMetadata #50675
- File HeterogeneousCore/AlpakaCore/src/alpaka/EDMetadataAcquireSentry.cc modified in PR(s): [hack] Force one GPU queue per framework stream #49547, Add an MPISenderPortable and MPIReceiverPortable modules to send/receive arbitrary device collections #50503, Rewrite the interface of EDMetadata #50675
- File HeterogeneousCore/AlpakaCore/src/alpaka/EDMetadataSentry.cc modified in PR(s): [hack] Force one GPU queue per framework stream #49547, Rewrite the interface of EDMetadata #50675
- File HeterogeneousCore/AlpakaCore/src/alpaka/ESProducer.cc modified in PR(s): Rewrite the interface of EDMetadata #50675
- File HeterogeneousCore/AlpakaInterface/interface/EventCache.h modified in PR(s): Rewrite the interface of EDMetadata #50675
- File HeterogeneousCore/AlpakaCore/interface/EventCache.h modified in PR(s): Rewrite the interface of EDMetadata #50675
- File HeterogeneousCore/AlpakaInterface/interface/QueueCache.h modified in PR(s): Rewrite the interface of EDMetadata #50675
- File HeterogeneousCore/AlpakaCore/interface/QueueCache.h modified in PR(s): [hack] Force one GPU queue per framework stream #49547, Rewrite the interface of EDMetadata #50675
- File HeterogeneousCore/AlpakaServices/src/alpaka/AlpakaService.cc modified in PR(s): [hack] Force one GPU queue per framework stream #49547, Rewrite the interface of EDMetadata #50675
- File HeterogeneousCore/TrivialSerialisation/interface/alpaka/Serialiser.h modified in PR(s): Add an MPISenderPortable and MPIReceiverPortable modules to send/receive arbitrary device collections #50503, Rewrite the interface of EDMetadata #50675
- File HeterogeneousCore/TrivialSerialisation/interface/alpaka/SerialiserBase.h modified in PR(s): Add an MPISenderPortable and MPIReceiverPortable modules to send/receive arbitrary device collections #50503, Rewrite the interface of EDMetadata #50675
- File HeterogeneousCore/TrivialSerialisation/test/alpaka/test_catch2_portableCollectionsSerialiserPluginFactory.dev.cc modified in PR(s): Add an MPISenderPortable and MPIReceiverPortable modules to send/receive arbitrary device collections #50503, Rewrite the interface of EDMetadata #50675

cmsbuild · 2026-04-07T15:56:37Z

Pull request #50627 was updated. @cmsbuild, @fwyzard, @hjkwon260, @makortel, @valsdav, @y19y19 can you please check and sign again.

fwyzard · 2026-04-07T16:04:39Z

@cms-sw/ml-l2 do you have any comments or suggestions?

fwyzard · 2026-04-07T16:05:03Z

enable gpu

fwyzard · 2026-04-07T16:05:07Z

please test

fwyzard · 2026-04-07T16:05:42Z

type ngt

cmsbuild · 2026-04-08T01:35:38Z

+1

Size: This PR adds an extra 36KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-35ac0e/52512/summary.html
COMMIT: 00541fe
CMSSW: CMSSW_17_0_X_2026-04-07-1100/el8_amd64_gcc13
Additional Tests: GPU,AMD_MI300X,AMD_W7900,NVIDIA_H100,NVIDIA_L40S
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/50627/52512/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

You potentially added 1 lines to the logs
ROOTFileChecks: Some differences in event products or their sizes found
Reco comparison results: 2 differences found in the comparisons
DQMHistoTests: Total files compared: 53
DQMHistoTests: Total histograms compared: 4180749
DQMHistoTests: Total failures: 3
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 4180726
DQMHistoTests: Total skipped: 20
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 52 files compared)
Checked 227 log files, 197 edm output root files, 53 DQM output files
TriggerResults: no differences found

AMD_MI300X Comparison Summary

Summary:

You potentially removed 11 lines from the logs
Reco comparison results: 332 differences found in the comparisons
DQMHistoTests: Total files compared: 13
DQMHistoTests: Total histograms compared: 216539
DQMHistoTests: Total failures: 39250
DQMHistoTests: Total nulls: 34
DQMHistoTests: Total successes: 177255
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 12 files compared)
Checked 49 log files, 50 edm output root files, 13 DQM output files
TriggerResults: found differences in 5 / 12 workflows

AMD_W7900 Comparison Summary

Summary:

You potentially removed 29 lines from the logs
Reco comparison results: 373 differences found in the comparisons
DQMHistoTests: Total files compared: 13
DQMHistoTests: Total histograms compared: 216539
DQMHistoTests: Total failures: 32062
DQMHistoTests: Total nulls: 35
DQMHistoTests: Total successes: 184442
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 12 files compared)
Checked 49 log files, 50 edm output root files, 13 DQM output files
TriggerResults: found differences in 2 / 12 workflows

NVIDIA_H100 Comparison Summary

Summary:

You potentially added 2 lines to the logs
Reco comparison results: 336 differences found in the comparisons
DQMHistoTests: Total files compared: 13
DQMHistoTests: Total histograms compared: 216539
DQMHistoTests: Total failures: 37208
DQMHistoTests: Total nulls: 36
DQMHistoTests: Total successes: 179295
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 12 files compared)
Checked 49 log files, 50 edm output root files, 13 DQM output files
TriggerResults: found differences in 1 / 12 workflows

NVIDIA_L40S Comparison Summary

Summary:

You potentially removed 18 lines from the logs
Reco comparison results: 374 differences found in the comparisons
DQMHistoTests: Total files compared: 13
DQMHistoTests: Total histograms compared: 216539
DQMHistoTests: Total failures: 32586
DQMHistoTests: Total nulls: 36
DQMHistoTests: Total successes: 183917
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 12 files compared)
Checked 49 log files, 50 edm output root files, 13 DQM output files
TriggerResults: found differences in 2 / 12 workflows

Max Memory Comparisons exceeding threshold NVIDIA_L40S

@cms-sw/core-l2 , I found 1 workflow step(s) with memory usage exceeding the error threshold:

Error: Workflow 34634.7503_TTbar_14TeV+Run4D121PU_HLTHeterogeneousValid step2 max memory diff 118.4 exceeds +/- 90.0 MiB

makortel · 2026-04-08T14:16:47Z

Looks ok to me

fwyzard · 2026-04-08T14:27:42Z

+heterogeneous

hjkwon260 · 2026-04-10T13:30:45Z

+ml

cmsbuild · 2026-04-10T13:31:09Z

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @mandrenguyen, @sextonkennedy, @ftenchini (and backports should be raised in the release meeting by the corresponding L2)

mandrenguyen · 2026-04-11T05:57:22Z

+1

cmsbuild added this to the CMSSW_16_1_X milestone Apr 1, 2026

cmsbuild added pending-signatures tests-pending orp-pending code-checks-pending heterogeneous-pending ml-pending labels Apr 1, 2026

cmsbuild added tests-started code-checks-rejected and removed tests-pending code-checks-pending labels Apr 1, 2026

fwyzard force-pushed the FixedQueueEDProducer branch from 0dff13b to 8b5045e Compare April 1, 2026 21:14

cmsbuild added tests-pending code-checks-pending and removed tests-started code-checks-rejected labels Apr 1, 2026

cmsbuild added tests-started and removed tests-pending labels Apr 1, 2026

cmsbuild added code-checks-approved and removed code-checks-pending labels Apr 1, 2026

cmsbuild added tests-approved and removed tests-started labels Apr 2, 2026

cmsbuild added code-checks-approved and removed code-checks-pending labels Apr 6, 2026

cmsbuild added tests-approved and removed tests-started labels Apr 6, 2026

cmsbuild mentioned this pull request Apr 7, 2026

Rewrite the interface of EDMetadata #50675

Merged

fwyzard added 3 commits April 7, 2026 17:26

Implement stream::FixedQueueEDProducer

6b96a4b

stream::FixedQueueEDProducer is a stream EDProducer with a fixed association of device queues to framework streams.

Add a unit test for stream::FixedQueueEDProducer

59aa7ac

Migrate PyTorch alpaka modules to stream::FixedQueueEDProducer

00541fe

This ensures that PyTorch sees only a limited number of device streams, reducing the overall device memory utilisation.

fwyzard force-pushed the FixedQueueEDProducer branch from fdc8269 to 00541fe Compare April 7, 2026 15:54

cmsbuild added tests-pending and removed tests-approved code-checks-approved labels Apr 7, 2026

This was referenced Apr 7, 2026

Implement a GenericClonerDevice.cc as a test for the device TrivialSerialisation mechanism #50685

Closed

Add an MPISenderPortable and MPIReceiverPortable modules to send/receive arbitrary device collections #50503

Open

cmsbuild mentioned this pull request Apr 11, 2026

Revision of the PSimHit type storage, revert trackId to its basic meaning #49969

Merged
