[WIP] `MASS3DPA` element batching by michaelmckinsey1 · Pull Request #693 · llnl/RAJAPerf

michaelmckinsey1 · 2026-06-19T21:00:34Z

Summary

This PR updates MASS3DPA to use element batching, as was done in the reference implementation a few months ago: "Add element batching capabilities to 3D MassIntegrator" (mfem/mfem#5299).
Testing RAJA_CUDA against Laghos with mfem+raja, the runtimes of MASS3DPA and SmemPAMassApply3D_Element are within 0.8% of each other; however, MASS3DPA executes 10% fewer instructions. Maybe it is my configuration of Q1D/D1D?
Need to update/test the HIP and CPU versions
Test Base_CUDA?
Update MASSVEC3DPA?
Batch size tunings
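
For readers unfamiliar with element batching, the index arithmetic can be sketched on the CPU as follows. This is only an illustration: `BatchSlot`, `num_blocks`, `map_batch_slot`, and `count_visits` are hypothetical names, and the mapping `e = block * tbatch + slot` is an assumption about how a batched kernel assigns elements, not code taken from this PR or from MFEM. The `valid` flag plays the role of `valid_e` in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: one "slot" of a batch, i.e. one element
// handled by a thread block that processes several elements at once.
struct BatchSlot {
  std::size_t element;  // global element index assigned to this slot
  bool valid;           // false when the slot lies past the last element
};

// Blocks needed to cover num_elements with tbatch elements per block
// (ceiling division).
inline std::size_t num_blocks(std::size_t num_elements, std::size_t tbatch) {
  assert(tbatch > 0);
  return (num_elements + tbatch - 1) / tbatch;
}

// Assumed mapping from (block, slot) to a global element index.
inline BatchSlot map_batch_slot(std::size_t block, std::size_t slot,
                                std::size_t tbatch, std::size_t num_elements) {
  assert(slot < tbatch);
  const std::size_t e = block * tbatch + slot;
  return {e, e < num_elements};
}

// Counts how often each element is visited when every slot of every
// block is walked; a correct mapping visits each element exactly once.
inline std::vector<int> count_visits(std::size_t num_elements,
                                     std::size_t tbatch) {
  std::vector<int> visits(num_elements, 0);
  for (std::size_t b = 0; b < num_blocks(num_elements, tbatch); ++b) {
    for (std::size_t s = 0; s < tbatch; ++s) {
      const BatchSlot slot = map_batch_slot(b, s, tbatch, num_elements);
      if (slot.valid) {
        ++visits[slot.element];
      }
    }
  }
  return visits;
}
```

With 33 elements and a batch size of 16, the last block has one valid slot and 15 invalid ones, which is why the kernel needs a validity check at all.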

michaelmckinsey1 · 2026-06-19T21:14:01Z

+// linear
+//   constexpr RAJA::Index_type D1D = 2;
+//   constexpr RAJA::Index_type Q1D = 2;
+// }
+// quadratic
+//   constexpr RAJA::Index_type D1D = 4;
+//   constexpr RAJA::Index_type Q1D = 4;
+// cubic
+// constexpr RAJA::Index_type D1D = 6;
+//   constexpr RAJA::Index_type Q1D = 6;
+  constexpr RAJA::Index_type D1D = 2;
+  constexpr RAJA::Index_type Q1D = 2;
+
+  constexpr RAJA::Index_type TBATCH = 16; // linear
+  // constexpr RAJA::Index_type TBATCH = 2; // quadratic
+  // constexpr RAJA::Index_type TBATCH = 1; // cubic


These are the values I observed via Laghos. I am looking for feedback here, and on whether we should try to have different configurations of MASS3DPA for different orders.

First, what do we want to cover? If we want to do more, I have thought about templating entire kernels to handle things like different orders, and maybe using explicit template instantiation to keep the instantiations in separate files.

It would also be helpful to see how the performance of quadratic compares with and without RAJA since that's what MARBL mainly uses.
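
The templating idea mentioned above could look roughly like the sketch below. The D1D/Q1D/TBATCH triples are copied from the diff comments in this thread; everything else (`KernelConfig`, the alias names, and `dofs_per_batch`) is hypothetical and stands in for a real kernel body. In a real code base each explicit instantiation could live in its own source file.

```cpp
#include <cassert>

// Hypothetical compile-time configuration, one per polynomial order.
template <int D1D_, int Q1D_, int TBATCH_>
struct KernelConfig {
  static constexpr int D1D = D1D_;        // dofs per dimension
  static constexpr int Q1D = Q1D_;        // quadrature points per dimension
  static constexpr int TBATCH = TBATCH_;  // elements per thread block
};

// Values taken from the commented-out settings in the diff above.
using LinearConfig = KernelConfig<2, 2, 16>;
using QuadraticConfig = KernelConfig<4, 4, 2>;
using CubicConfig = KernelConfig<6, 6, 1>;

// Stand-in for a templated kernel: returns the number of dofs that one
// thread block touches (D1D^3 per element times TBATCH elements).
template <typename Config>
int dofs_per_batch() {
  static_assert(Config::TBATCH > 0, "batch size must be positive");
  return Config::D1D * Config::D1D * Config::D1D * Config::TBATCH;
}

// Explicit instantiation definitions. Each line could be placed in a
// separate .cpp file so the orders are compiled independently.
template int dofs_per_batch<LinearConfig>();
template int dofs_per_batch<QuadraticConfig>();
template int dofs_per_batch<CubicConfig>();
```

With these settings the linear and quadratic configurations both touch 128 dofs per block, while the cubic one touches 216.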

artv3 · 2026-06-19T22:06:27Z

Nice @michaelmckinsey1! It would be neat to add a batching parameter and allow for different batch sizes; maybe 4 is good for GPU X and 7 is good for GPU Y, that type of thing.

MrBurmark · 2026-06-19T22:38:25Z

> Nice @michaelmckinsey1! It would be neat to add a batching parameter and allow for different batch sizes; maybe 4 is good for GPU X and 7 is good for GPU Y, that type of thing.

I think he means different batch sizes as tunings.

artv3 · 2026-06-19T22:53:59Z

> > Nice @michaelmckinsey1! It would be neat to add a batching parameter and allow for different batch sizes; maybe 4 is good for GPU X and 7 is good for GPU Y, that type of thing.
>
> I think he means different batch sizes as tunings.

Yes, sounds good to me!
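
One way to expose batch sizes as selectable tunings is to map a runtime value onto a fixed set of compile-time batch sizes. The sketch below is an assumption about how that could be structured; `blocks_for` and `blocks_for_tuning` are hypothetical names, the supported sizes are arbitrary, and none of this uses the actual RAJAPerf tuning interface.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Stand-in for a kernel launch templated on the batch size: returns the
// number of thread blocks needed for num_elements (ceiling division).
template <std::size_t TBATCH>
std::size_t blocks_for(std::size_t num_elements) {
  static_assert(TBATCH > 0, "batch size must be positive");
  return (num_elements + TBATCH - 1) / TBATCH;
}

// Hypothetical dispatcher: picks the instantiation that matches the
// requested tuning and rejects batch sizes that were not compiled in.
inline std::size_t blocks_for_tuning(std::size_t tbatch,
                                     std::size_t num_elements) {
  switch (tbatch) {
    case 1:  return blocks_for<1>(num_elements);
    case 2:  return blocks_for<2>(num_elements);
    case 4:  return blocks_for<4>(num_elements);
    case 8:  return blocks_for<8>(num_elements);
    case 16: return blocks_for<16>(num_elements);
    default: throw std::invalid_argument("unsupported batch size");
  }
}
```

Keeping the batch size a template parameter preserves compile-time loop bounds and shared-memory sizes, at the cost of one instantiation per supported value.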

helloworld922 · 2026-06-22T17:47:05Z

+      if (valid_e) {
+        MASS3DPA_1
+      }


Instead of checking valid_e inside every loop, what happens if you do a single `if (!valid_e) return;` check at the beginning? That's how we have it implemented in MFEM.

See efaa790
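
The two forms discussed in this thread can be compared with a small CPU emulation. This is a sketch under stated assumptions: the three "stages" are made-up arithmetic standing in for the kernel's loop bodies, `slot_guarded`, `slot_early_return`, and `run_batched` are hypothetical names, and block-level barriers such as `__syncthreads()` are not modeled. It only checks that both forms compute the same values; it says nothing about GPU performance or barrier behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Form A: every stage is wrapped in its own valid_e check.
inline void slot_guarded(std::size_t e, std::size_t num_elements,
                         std::vector<int>& out) {
  const bool valid_e = e < num_elements;
  if (valid_e) { out[e] += 1; }  // stage 1
  if (valid_e) { out[e] *= 2; }  // stage 2
  if (valid_e) { out[e] += 3; }  // stage 3
}

// Form B: a single early return for slots past the last element.
inline void slot_early_return(std::size_t e, std::size_t num_elements,
                              std::vector<int>& out) {
  const bool valid_e = e < num_elements;
  if (!valid_e) return;
  out[e] += 1;  // stage 1
  out[e] *= 2;  // stage 2
  out[e] += 3;  // stage 3
}

// Walks every slot of every batch and applies the chosen form.
inline std::vector<int> run_batched(bool early_return,
                                    std::size_t num_elements,
                                    std::size_t tbatch) {
  assert(tbatch > 0);
  std::vector<int> out(num_elements, 0);
  const std::size_t blocks = (num_elements + tbatch - 1) / tbatch;
  for (std::size_t b = 0; b < blocks; ++b) {
    for (std::size_t s = 0; s < tbatch; ++s) {
      const std::size_t e = b * tbatch + s;
      if (early_return) {
        slot_early_return(e, num_elements, out);
      } else {
        slot_guarded(e, num_elements, out);
      }
    }
  }
  return out;
}
```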

helloworld922 · 2026-06-22T17:54:51Z

+    }
+  }
  __syncthreads();
  GPU_FOREACH_THREAD_INC(dy, y, mpa::D1D, mpa::Q1D) {


It would be interesting to compare having this be a for loop vs. doing a simple if statement check (blockDim.x and blockDim.y should always be >= Q1D).

This seems to help with the HIP compiler.
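
The equivalence behind this suggestion can be checked with a CPU emulation of one thread dimension. The sketch assumes the loop macro expands to a thread-strided loop whose stride equals the block dimension, which is an assumption about `GPU_FOREACH_THREAD_INC` and not taken from its definition; `visit_with_loop` and `visit_with_if` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Strided loop: thread tid visits tid, tid + block_dim, ... below bound.
inline std::vector<int> visit_with_loop(int bound, int block_dim) {
  assert(bound >= 0 && block_dim > 0);
  std::vector<int> visits(bound, 0);
  for (int tid = 0; tid < block_dim; ++tid) {
    for (int i = tid; i < bound; i += block_dim) {
      ++visits[i];
    }
  }
  return visits;
}

// Single check: thread tid visits index tid only, if it is below bound.
inline std::vector<int> visit_with_if(int bound, int block_dim) {
  assert(bound >= 0 && block_dim > 0);
  std::vector<int> visits(bound, 0);
  for (int tid = 0; tid < block_dim; ++tid) {
    if (tid < bound) {
      ++visits[tid];
    }
  }
  return visits;
}
```

When the block dimension is at least the loop bound, both forms visit every index exactly once. When it is smaller, the `if` form skips the indices beyond the block dimension, so the replacement is only valid under the `blockDim >= Q1D` condition stated in the comment.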

Use batching (5e9ba94)

michaelmckinsey1 self-assigned this Jun 19, 2026

helloworld922 reviewed Jun 22, 2026

Refactor check (efaa790)

michaelmckinsey1 wants to merge 2 commits into develop from mass3dpa-batching

michaelmckinsey1 Jun 24, 2026

4 participants
