
SoA backend cleanup #50131

Merged
cmsbuild merged 3 commits into cms-sw:master from Electricks94:SoAClean
Mar 11, 2026


@Electricks94
Contributor

This PR introduces various cleanups to the SoA backend:

  1. Removal of TupleOrPointerType in SoAParametersImpl. TupleOrPointerType is replaced by member functions that return the data address (and the stride in the case of an Eigen column).
  2. Removal of the checkAlignment member function from SoAParametersImpl. It was mostly duplicated across the four specializations of SoAParametersImpl and is now a free function in the cms::soa::detail namespace.
  3. AccumulateColumnByteSizes and computePitch were essentially duplicates. AccumulateColumnByteSizes is renamed to ComputePitch, which replaces the computePitch function entirely.
  4. printColumn is now implemented as a PrintColumn struct.
  5. Manual alignment checks are replaced by calls to the checkAlignment function.
  6. Removal of the unused _DECLARE_VIEW_CONSTRUCTION_BYCOLUMN_PARAMETERS and _DECLARE_VIEW_MEMBER_INITIALIZERS_BYCOLUMN macros.
  7. Replacement of a large comment containing a dumped SoA in SoALayout with a test case that actually dumps an SoA and checks the output string.
  8. _deepCopy was implemented in both PortableDeviceCollection and PortableHostCollection with exactly the same functionality. It is now moved to the portablecollection namespace and called from both places.
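
To illustrate item 3, here is a minimal sketch of accumulating padded column byte sizes into a single pitch. It is not the CMSSW implementation: paddedSize, ColumnSpec and computePitchSketch are hypothetical names, and only the round-up arithmetic is taken from the scalar-offset code touched by this PR.

```cpp
#include <cstddef>

// Hypothetical helper: round `bytes` up to the next multiple of `alignment`.
// Same arithmetic as ((sizeof(T) - 1) / alignment + 1) * alignment in the diff.
// Assumes bytes > 0 and alignment > 0.
constexpr std::size_t paddedSize(std::size_t bytes, std::size_t alignment) {
  return ((bytes - 1) / alignment + 1) * alignment;
}

// Hypothetical column descriptor: element size in bytes and element count.
struct ColumnSpec {
  std::size_t elementSize;
  std::size_t elements;
};

// Sum the padded byte size of every column; the result is the total
// number of bytes needed to store all columns back to back.
template <std::size_t N>
constexpr std::size_t computePitchSketch(const ColumnSpec (&columns)[N], std::size_t alignment) {
  std::size_t total = 0;
  for (std::size_t i = 0; i < N; ++i) {
    total += paddedSize(columns[i].elementSize * columns[i].elements, alignment);
  }
  return total;
}
```

With a 128-byte alignment, a 4-byte scalar, ten 8-byte elements and forty 4-byte elements occupy 128, 128 and 256 bytes respectively, so the sketch returns 512.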

FYI @felicepantaleo

@Electricks94
Contributor Author

type ngt

@cmsbuild
Contributor

cmsbuild commented Feb 13, 2026

cms-bot internal usage

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50131/48054

@cmsbuild
Contributor

A new Pull Request was created by @Electricks94 for master.

It involves the following packages:

  • DataFormats/Portable (heterogeneous)
  • DataFormats/SoATemplate (heterogeneous)
  • PhysicsTools/PyTorchAlpaka (heterogeneous, ml)

@cmsbuild, @fwyzard, @hjkwon260, @makortel, @valsdav, @y19y19 can you please review it and eventually sign? Thanks.
@missirol, @mmusich, @rovere this is something you requested to watch as well.
@ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

soa_impl_os << " Scalar " << name << " at offset " << soa_impl_offset << " has size " << sizeof(T)
<< " and padding " << ((sizeof(T) - 1) / alignment + 1) * alignment - sizeof(T) << std::endl;
soa_impl_offset += ((sizeof(T) - 1) / alignment + 1) * alignment;
SOA_INLINE bool checkAlignment(const T* addr, byte_size_type alignment) {
Contributor

I just noticed that all SOA_INLINE I could find in cmssw are preceded by a SOA_HOST_DEVICE. Is there any reason why SOA_HOST_DEVICE is not used here?

Contributor Author

Inlining functions that are also executed on the device is very common in the SoA backend. Here, however, I don't really see a use case for checking the alignment on the device side. Hence, I left out SOA_HOST_DEVICE.


namespace cms::soa::detail {
// Helper function for streaming column
// Helper function to check alignment of a pointer. Returns true if the pointer is not aligned to the specified alignment.
Contributor

Since checkAlignment returns true if the pointer is not aligned, should it better be named checkMisaligned, checkMisalignment, isMisaligned, etc.?

or it could return true when the pointer is aligned instead.

Contributor Author

I think isMisaligned is the most suitable name. I have updated the code accordingly.
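
For reference, a minimal sketch of a free function with the semantics discussed in this thread (it returns true when the pointer is not aligned). The name isMisalignedSketch and the plain std::size_t parameter are assumptions for illustration; the merged function uses SOA_INLINE and byte_size_type and may differ in detail.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the renamed helper.
// Returns true if `addr` is NOT a multiple of `alignment` bytes.
// Assumes alignment > 0.
template <typename T>
inline bool isMisalignedSketch(const T* addr, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(addr) % alignment != 0;
}
```

With this naming, a call site reads naturally as a guard: if the function returns true for a column address, the caller reports the error.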

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50131/48106

@cmsbuild
Contributor

Pull request #50131 was updated. @cmsbuild, @fwyzard, @hjkwon260, @makortel, @valsdav, @y19y19 can you please check and sign again.

Contributor

please add

#include <alpaka/alpaka.hpp>

when you directly use alpaka functions.

, \
/* Eigen column */ \
if (not readyToSet) { \
if (!readyToSet) { \
Contributor

Please keep the not naming.

Contributor Author

Is that because it is clearer to read, or is there a reason I am overlooking?

Contributor

Because it is more expressive and clearer to read.
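
As a small illustration of the point above: in standard C++, not is an alternative token for !, so the two spellings compile to the same thing and the choice is purely about readability. The function names below are hypothetical.

```cpp
// Both functions negate their argument; `not` is the standard
// alternative spelling of the `!` operator.
constexpr bool viaKeyword(bool readyToSet) { return not readyToSet; }
constexpr bool viaSymbol(bool readyToSet) { return !readyToSet; }

// The two spellings agree for both possible inputs.
static_assert(viaKeyword(false) == viaSymbol(false), "same result for false");
static_assert(viaKeyword(true) == viaSymbol(true), "same result for true");
```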

} \
constexpr static cms::soa::SoAColumnType BOOST_PP_CAT(ColumnTypeOf_, NAME) = cms::soa::SoAColumnType::eigen; \
) \
) \
Contributor

please keep the indentation lined up

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50131/48108

@slava77
Contributor

slava77 commented Feb 23, 2026

is it practical to add these definitions in lst.cc file in this PR

// Dummy implementation of edm::typeDemangle (without extra replacements)
// to avoid having to link extra libraries
namespace edm {
std::string typeDemangle(char const *mangledName) { return boost::core::demangle(mangledName); }
} // namespace edm

or should we follow up with a separate PR?

@makortel
Contributor

that it is preferable to throw exceptions in code that is compiled only once, instead of having them in header files that are included in many other files and packages.

Right, throwing an exception generates quite a bit of machine code, so this way the dependent shared libraries can be kept a little bit smaller.
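
A minimal sketch of the pattern described here, under the assumption that the header only declares a throwing helper and a single source file defines it. The namespace sketch and the functions throwMisaligned and requireAligned are hypothetical and are not taken from SoACommon.cc.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// --- Part that would live in a widely included header ---
namespace sketch {
  // Declaration only: every includer emits just a call, not the
  // machine code for constructing and throwing the exception.
  [[noreturn]] void throwMisaligned(const char* columnName);

  inline void requireAligned(const void* addr, std::size_t alignment, const char* columnName) {
    if (reinterpret_cast<std::uintptr_t>(addr) % alignment != 0) {
      throwMisaligned(columnName);
    }
  }
}  // namespace sketch

// --- Part that would live in one .cc file, compiled only once ---
namespace sketch {
  void throwMisaligned(const char* columnName) {
    throw std::runtime_error(std::string("misaligned column: ") + columnName);
  }
}  // namespace sketch
```

The trade-off is that any program including the header must also link the object file that defines the helper, which is the issue raised for the standalone lst.cc build in this thread.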

@fwyzard
Contributor

fwyzard commented Feb 26, 2026

While my preference would be to link the library from CMSSW,

is it practical to add these definitions in lst.cc file in this PR

yes, this is also a solution.

@Electricks94 can you copy-paste the content of DataFormats/SoATemplate/src/SoACommon.cc at the bottom of RecoTracker/LSTCore/standalone/bin/lst.cc, with a comment saying where it comes from and why it is added ?

@cmsbuild
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50131/48300

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50131/48301

@cmsbuild
Contributor

Pull request #50131 was updated. @Moanwar, @civanch, @cmsbuild, @fwyzard, @hjkwon260, @jfernan2, @kpedro88, @makortel, @mandrenguyen, @mdhildreth, @srimanob, @valsdav, @y19y19 can you please check and sign again.

@fwyzard
Contributor

fwyzard commented Feb 27, 2026

+heterogenous

@fwyzard
Contributor

fwyzard commented Feb 27, 2026

test parameters:

  • enable = gpu
  • gpu = nvidia,amd,nvidia_t4

@fwyzard
Contributor

fwyzard commented Feb 27, 2026

please test

@cmsbuild
Contributor

-1

Failed Tests: nvidia_l40sUnitTests
Size: This PR adds an extra 28KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a64254/51661/summary.html
COMMIT: 9c162c3
CMSSW: CMSSW_16_1_X_2026-02-26-2300/el8_amd64_gcc13
Additional Tests: GPU,AMD_MI300X,AMD_W7900,NVIDIA_H100,NVIDIA_L40S,NVIDIA_T4
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/50131/51661/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 4 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 5 differences found in the comparisons
  • DQMHistoTests: Total files compared: 55
  • DQMHistoTests: Total histograms compared: 4406167
  • DQMHistoTests: Total failures: 628
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4405519
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 54 files compared)
  • Checked 235 log files, 208 edm output root files, 55 DQM output files
  • TriggerResults: no differences found

AMD_MI300X Comparison Summary

Summary:

AMD_W7900 Comparison Summary

Summary:

NVIDIA_H100 Comparison Summary

Summary:

NVIDIA_L40S Comparison Summary

Summary:

NVIDIA_T4 Comparison Summary

Summary:

@fwyzard
Contributor

fwyzard commented Mar 1, 2026

@Electricks94
Contributor Author

@Electricks94 any idea why a unit test failed on the L40S ?

https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a64254/51661/nvidia_l40sUnitTests/src/PhysicsTools/PyTorchAlpaka/test/testSoADataTypesCudaAsync/testing.log

I tried to reproduce the error locally on the NGT Farm but the test case did not produce any errors. Could you please rerun the test case so that I can see the error message in the pipeline?

@fwyzard
Contributor

fwyzard commented Mar 9, 2026

sure

@fwyzard
Contributor

fwyzard commented Mar 9, 2026

please test

@cmsbuild
Contributor

+1

Size: This PR adds an extra 16KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-a64254/51851/summary.html
COMMIT: 9c162c3
CMSSW: CMSSW_16_1_X_2026-03-09-1100/el8_amd64_gcc13
Additional Tests: GPU,AMD_MI300X,AMD_W7900,NVIDIA_H100,NVIDIA_L40S,NVIDIA_T4
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/50131/51851/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 2 lines from the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 55
  • DQMHistoTests: Total histograms compared: 4414537
  • DQMHistoTests: Total failures: 243
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4414274
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 54 files compared)
  • Checked 235 log files, 208 edm output root files, 55 DQM output files
  • TriggerResults: no differences found

AMD_MI300X Comparison Summary

Summary:

AMD_W7900 Comparison Summary

Summary:

NVIDIA_H100 Comparison Summary

Summary:

NVIDIA_L40S Comparison Summary

Summary:

NVIDIA_T4 Comparison Summary

Summary:

@fwyzard
Contributor

fwyzard commented Mar 10, 2026

+heterogeneous

@fwyzard
Contributor

fwyzard commented Mar 11, 2026

urgent

I would like to have this in 16.1.0-pre3, if possible.

@cms-sw/ml-l2 @cms-sw/reconstruction-l2 @cms-sw/simulation-l2 could you review the changes, and eventually sign them off, please ?

@hjkwon260
Contributor

+ml

@civanch
Contributor

civanch commented Mar 11, 2026

+1

@jfernan2
Contributor

+1

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @ftenchini, @sextonkennedy, @mandrenguyen (and backports should be raised in the release meeting by the corresponding L2)

@ftenchini

+1


10 participants