Add env-based decomposer overloads for DeviceRadixSort descending variants #8099

Open
gonidelis wants to merge 17 commits into NVIDIA:main from gonidelis:radix_decomposer_descending

@gonidelis
Member

closes #7998

@gonidelis gonidelis requested a review from a team as a code owner March 19, 2026 03:24
@gonidelis gonidelis requested a review from elstehle March 19, 2026 03:24
@github-project-automation github-project-automation bot moved this to Todo in CCCL Mar 19, 2026
@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Review in CCCL Mar 19, 2026
@gonidelis gonidelis marked this pull request as draft March 19, 2026 03:30
@copy-pr-bot
Contributor

copy-pr-bot bot commented Mar 19, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

@cccl-authenticator-app cccl-authenticator-app bot moved this from In Review to In Progress in CCCL Mar 19, 2026
@gonidelis gonidelis force-pushed the radix_decomposer_descending branch from 36988ec to c4ea000 Compare March 25, 2026 11:26
@gonidelis gonidelis marked this pull request as ready for review March 25, 2026 11:26
@cccl-authenticator-app cccl-authenticator-app bot moved this from In Progress to In Review in CCCL Mar 25, 2026
@gonidelis gonidelis force-pushed the radix_decomposer_descending branch from d359c22 to 7ef72e8 Compare March 25, 2026 23:03
@gonidelis gonidelis enabled auto-merge (squash) March 25, 2026 23:06
@gonidelis gonidelis force-pushed the radix_decomposer_descending branch from 7ef72e8 to e2c26b0 Compare March 25, 2026 23:52
@bernhardmgruber bernhardmgruber force-pushed the radix_decomposer_descending branch from 5c3320f to c19c993 Compare April 9, 2026 07:48
@bernhardmgruber
Contributor

I rebased the PR to resolve conflicts.

"DecomposerT must be a callable object returning a tuple of references to "
"arithmetic types");

if constexpr (decomposer_check_t::value)
Member Author

Why is the static_assert(check); if constexpr (check) {} practice considered good? nvhpc hates the unreachable path after the if constexpr. I'm pushing a nit fix, but why not drop the if constexpr altogether, since the static_assert guards it anyway?

Suggested change
if constexpr (decomposer_check_t::value)
static_assert(decomposer_check_t::value,
"DecomposerT must be a callable object returning a tuple of references to "
"arithmetic types");
return detail::radix_sort::dispatch<Order>(

is that wrong?

Contributor

@bernhardmgruber bernhardmgruber Apr 9, 2026

why is the static_assert(check); if constexpr(check){} practice good?

It isn't IMO.

why not drop the if constexpr through

The answer to that question is 2m in front of you once you are back in HQ ;)

That person explained to me that by branching on the static assert condition, we can prevent the compiler from emitting follow-up errors, in case the assertion fails. This is supposed to reduce the output in case of compilation errors and should thus be user friendly. I don't buy this argument, because so far it has confused every developer and the error message in case of a compilation failure is long regardless.
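The pattern described above can be sketched in a small host-only example. Everything named here (`decomposer_check`, `custom_key`, `good_decomposer`, `bad_decomposer`, `set_first_field`) is hypothetical and not CUB code, and the check only accepts `std::tuple<int&>`, an assumption made to keep the sketch short.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>
#include <utility>

struct custom_key { int value; };  // hypothetical key type

// Hypothetical stand-in for the PR's decomposer_check_t. Assumption: a valid
// decomposer returns exactly std::tuple<int&>; the real check is more general.
template <class DecomposerT, class KeyT, class = void>
struct decomposer_check : std::false_type {};

template <class DecomposerT, class KeyT>
struct decomposer_check<
    DecomposerT, KeyT,
    std::void_t<decltype(std::declval<DecomposerT&>()(std::declval<KeyT&>()))>>
    : std::is_same<decltype(std::declval<DecomposerT&>()(std::declval<KeyT&>())),
                   std::tuple<int&>> {};

struct good_decomposer {
  std::tuple<int&> operator()(custom_key& k) const { return std::tie(k.value); }
};

struct bad_decomposer {  // returns a plain int, not a tuple of references
  int operator()(custom_key& k) const { return k.value; }
};

template <class KeyT, class DecomposerT>
void set_first_field(KeyT& key, DecomposerT decomposer, int new_value)
{
  constexpr bool ok = decomposer_check<DecomposerT, KeyT>::value;
  static_assert(ok,
                "DecomposerT must be a callable object returning a tuple of "
                "references to arithmetic types");
  // Branching on the same condition discards the body when the check fails,
  // so the std::get below is never instantiated for a bad decomposer and
  // cannot add a second error on top of the static_assert.
  if constexpr (ok)
  {
    std::get<0>(decomposer(key)) = new_value;
  }
}

// set_first_field(k, bad_decomposer{}, 1);  // would trigger only the static_assert
```

Calling `set_first_field(k, good_decomposer{}, 7)` on a `custom_key k` writes 7 through the returned reference. Without the `if constexpr`, the commented-out call would report the failed assertion plus an error from `std::get` on an `int`.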

I think the best course of action is to place a requires decomposer_check_t::value at the public API. But that's for a separate PR.
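A rough sketch of constraining the entry point instead is shown here. It uses `std::enable_if_t` as a C++17 stand-in for the `requires` clause mentioned above (in C++20 the constraint would be spelled `requires returns_int_ref_tuple<DecomposerT, KeyT>::value`). All names are hypothetical, and the trait again assumes a decomposer returning `std::tuple<int&>`.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>
#include <utility>

struct custom_key { int value; };  // hypothetical key type

// Hypothetical check, standing in for decomposer_check_t.
template <class D, class K, class = void>
struct returns_int_ref_tuple : std::false_type {};

template <class D, class K>
struct returns_int_ref_tuple<
    D, K, std::void_t<decltype(std::declval<D&>()(std::declval<K&>()))>>
    : std::is_same<decltype(std::declval<D&>()(std::declval<K&>())),
                   std::tuple<int&>> {};

// Constrained "public API": the overload drops out of overload resolution
// when the check fails, so the body needs neither static_assert nor if constexpr.
template <class KeyT, class DecomposerT,
          std::enable_if_t<returns_int_ref_tuple<DecomposerT, KeyT>::value, int> = 0>
void set_first_field(KeyT& key, DecomposerT decomposer, int new_value)
{
  std::get<0>(decomposer(key)) = new_value;
}

struct good_decomposer {
  std::tuple<int&> operator()(custom_key& k) const { return std::tie(k.value); }
};

struct bad_decomposer {
  int operator()(custom_key& k) const { return k.value; }
};

// Detects whether set_first_field is callable with a given decomposer type.
template <class D, class = void>
struct can_call : std::false_type {};

template <class D>
struct can_call<D, std::void_t<decltype(set_first_field(
                       std::declval<custom_key&>(), std::declval<D>(), 0))>>
    : std::true_type {};

static_assert(can_call<good_decomposer>::value, "accepted by the constraint");
static_assert(!can_call<bad_decomposer>::value, "rejected by the constraint");
```

One trade-off of this design is that a rejected call is reported as "no matching function" rather than with the custom assertion message.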

Feel free to remove the if constexpr or leave it. I have learned to ignore it.

@bernhardmgruber bernhardmgruber force-pushed the radix_decomposer_descending branch from c439d4b to 4980cc5 Compare April 10, 2026 22:17
@bernhardmgruber
Contributor

I rebased to resolve the conflict.

Comment on lines +460 to +473
auto error = cub::DeviceRadixSort::SortKeysDescending(
keys_in.data().get(),
keys_out.data().get(),
static_cast<int>(keys_in.size()),
keys_decomposer_t{},
0,
sizeof(int) * 8,
stream_ref);

// example-end radix-sort-keys-descending-decomposer-bits-env
stream.sync();

REQUIRE(error == cudaSuccess);
thrust::device_vector<custom_key_t> expected{{9}, {8}, {7}, {6}, {5}, {3}, {0}};
Contributor

Important: I think we should show the expected outcome in the documentation example:

Suggested change
- auto error = cub::DeviceRadixSort::SortKeysDescending(
- keys_in.data().get(),
- keys_out.data().get(),
- static_cast<int>(keys_in.size()),
- keys_decomposer_t{},
- 0,
- sizeof(int) * 8,
- stream_ref);
- // example-end radix-sort-keys-descending-decomposer-bits-env
- stream.sync();
- REQUIRE(error == cudaSuccess);
- thrust::device_vector<custom_key_t> expected{{9}, {8}, {7}, {6}, {5}, {3}, {0}};
+ auto error = cub::DeviceRadixSort::SortKeysDescending(
+ keys_in.data().get(),
+ keys_out.data().get(),
+ static_cast<int>(keys_in.size()),
+ keys_decomposer_t{},
+ 0,
+ sizeof(int) * 8,
+ stream_ref);
+ thrust::device_vector<custom_key_t> expected{{9}, {8}, {7}, {6}, {5}, {3}, {0}};
+ // example-end radix-sort-keys-descending-decomposer-bits-env
+ stream.sync();
+ REQUIRE(error == cudaSuccess);
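For readers unfamiliar with decomposers, here is a host-only analogy of what the example computes. It does not use CUB or the GPU. The layout of `custom_key_t` and `keys_decomposer_t`, the helper names, and the input order are assumptions made for illustration; only the expected output sequence is taken from the snippet above.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

// Assumed layout: a key wrapping a single int.
struct custom_key_t { int key; };

// Assumed decomposer: exposes the key's fields as a tuple of references.
struct keys_decomposer_t {
  std::tuple<int&> operator()(custom_key_t& k) const { return std::tie(k.key); }
};

// Host-side stand-in for a descending sort driven by a decomposer: keys are
// ordered by comparing their decomposed tuples, largest first.
template <class KeyT, class DecomposerT>
void sort_descending_by_decomposer(std::vector<KeyT>& keys, DecomposerT decomposer)
{
  // The lambda takes copies so the decomposer can bind non-const references.
  std::sort(keys.begin(), keys.end(), [&](KeyT a, KeyT b) {
    return decomposer(a) > decomposer(b);
  });
}

// Sorts a hypothetical input and returns the resulting key values.
inline std::vector<int> run_example()
{
  std::vector<custom_key_t> keys{{5}, {0}, {9}, {3}, {8}, {6}, {7}};
  sort_descending_by_decomposer(keys, keys_decomposer_t{});
  std::vector<int> values;
  for (const custom_key_t& k : keys) { values.push_back(k.key); }
  return values;  // 9, 8, 7, 6, 5, 3, 0
}
```

The device version differs in that the radix sort reads the decomposed fields bit by bit rather than comparing tuples, but the resulting order for a single-`int` key is the same.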

@github-actions
Contributor

🥳 CI Workflow Results

🟩 Finished in 1h 31m: Pass: 100%/269 | Total: 8d 11h | Max: 1h 23m | Hits: 74%/177147

See results here.

Labels

None yet

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

Add env-based API for decomposer overloads of cub::DeviceRadixSort

2 participants