Export gputhrow #3106

Closed
termi-official wants to merge 1 commit into JuliaGPU:master from termi-official:do/export-gputhrow-v6


@termi-official
Contributor

@termi-official termi-official commented Apr 16, 2026

@gputhrow was exported on v5

@maleadt
Member

maleadt commented Apr 16, 2026

Contributor

@github-actions github-actions bot left a comment

CUDA.jl Benchmarks

| Benchmark suite | Current: 45b627c | Previous: af6961a | Ratio |
|---|---|---|---|
| array/accumulate/Float32/1d | 101055 ns | 101502.5 ns | 1.00 |
| array/accumulate/Float32/dims=1 | 76936 ns | 76629.5 ns | 1.00 |
| array/accumulate/Float32/dims=1L | 1585505 ns | 1585691.5 ns | 1.00 |
| array/accumulate/Float32/dims=2 | 144151 ns | 143840 ns | 1.00 |
| array/accumulate/Float32/dims=2L | 658023 ns | 657866.5 ns | 1.00 |
| array/accumulate/Int64/1d | 118692 ns | 119085 ns | 1.00 |
| array/accumulate/Int64/dims=1 | 79686 ns | 79723 ns | 1.00 |
| array/accumulate/Int64/dims=1L | 1694741 ns | 1694532 ns | 1.00 |
| array/accumulate/Int64/dims=2 | 156062 ns | 156114 ns | 1.00 |
| array/accumulate/Int64/dims=2L | 961855 ns | 961892 ns | 1.00 |
| array/broadcast | 20347 ns | 20629 ns | 0.99 |
| array/construct | 1287.6 ns | 1256.5 ns | 1.02 |
| array/copy | 18146 ns | 17886 ns | 1.01 |
| array/copyto!/cpu_to_gpu | 214789 ns | 212623 ns | 1.01 |
| array/copyto!/gpu_to_cpu | 285256 ns | 283142 ns | 1.01 |
| array/copyto!/gpu_to_gpu | 10932 ns | 10702 ns | 1.02 |
| array/iteration/findall/bool | 135084 ns | 134713 ns | 1.00 |
| array/iteration/findall/int | 150895 ns | 149574 ns | 1.01 |
| array/iteration/findfirst/bool | 82571 ns | 81484 ns | 1.01 |
| array/iteration/findfirst/int | 84149.5 ns | 83536 ns | 1.01 |
| array/iteration/findmin/1d | 85286.5 ns | 85967.5 ns | 0.99 |
| array/iteration/findmin/2d | 117254 ns | 116482 ns | 1.01 |
| array/iteration/logical | 200524.5 ns | 199753 ns | 1.00 |
| array/iteration/scalar | 67866 ns | 67105.5 ns | 1.01 |
| array/permutedims/2d | 52533 ns | 52568 ns | 1.00 |
| array/permutedims/3d | 52716.5 ns | 53057 ns | 0.99 |
| array/permutedims/4d | 51875 ns | 51868.5 ns | 1.00 |
| array/random/rand/Float32 | 13217 ns | 12842 ns | 1.03 |
| array/random/rand/Int64 | 25251 ns | 25396 ns | 0.99 |
| array/random/rand!/Float32 | 9853.5 ns | 8414.666666666666 ns | 1.17 |
| array/random/rand!/Int64 | 21914 ns | 21981 ns | 1.00 |
| array/random/randn/Float32 | 43758 ns | 37382.5 ns | 1.17 |
| array/random/randn!/Float32 | 31002 ns | 30580 ns | 1.01 |
| array/reductions/mapreduce/Float32/1d | 35214.5 ns | 34346 ns | 1.03 |
| array/reductions/mapreduce/Float32/dims=1 | 39305 ns | 39918.5 ns | 0.98 |
| array/reductions/mapreduce/Float32/dims=1L | 51269 ns | 51509 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2 | 56724.5 ns | 56563 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2L | 69541 ns | 69236 ns | 1.00 |
| array/reductions/mapreduce/Int64/1d | 43334 ns | 42783 ns | 1.01 |
| array/reductions/mapreduce/Int64/dims=1 | 43236.5 ns | 42869 ns | 1.01 |
| array/reductions/mapreduce/Int64/dims=1L | 87164 ns | 87545 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2 | 59640 ns | 59325 ns | 1.01 |
| array/reductions/mapreduce/Int64/dims=2L | 85148 ns | 85131 ns | 1.00 |
| array/reductions/reduce/Float32/1d | 35062 ns | 34338 ns | 1.02 |
| array/reductions/reduce/Float32/dims=1 | 42837 ns | 48895 ns | 0.88 |
| array/reductions/reduce/Float32/dims=1L | 51460 ns | 51567 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2 | 56801 ns | 56622 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2L | 70088 ns | 69625 ns | 1.01 |
| array/reductions/reduce/Int64/1d | 43395 ns | 42697 ns | 1.02 |
| array/reductions/reduce/Int64/dims=1 | 42548 ns | 47176.5 ns | 0.90 |
| array/reductions/reduce/Int64/dims=1L | 87073 ns | 87361 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2 | 59960.5 ns | 59337 ns | 1.01 |
| array/reductions/reduce/Int64/dims=2L | 85238 ns | 84717 ns | 1.01 |
| array/reverse/1d | 17777 ns | 18051 ns | 0.98 |
| array/reverse/1dL | 68336 ns | 68628 ns | 1.00 |
| array/reverse/1dL_inplace | 65822 ns | 65835 ns | 1.00 |
| array/reverse/1d_inplace | 10346.166666666668 ns | 8676.333333333334 ns | 1.19 |
| array/reverse/2d | 20928 ns | 20541 ns | 1.02 |
| array/reverse/2dL | 72881 ns | 72496 ns | 1.01 |
| array/reverse/2dL_inplace | 65775 ns | 65804 ns | 1.00 |
| array/reverse/2d_inplace | 10414 ns | 10104 ns | 1.03 |
| array/sorting/1d | 2736888 ns | 2734890.5 ns | 1.00 |
| array/sorting/2d | 1069374 ns | 1069333 ns | 1.00 |
| array/sorting/by | 3306761 ns | 3305442 ns | 1.00 |
| cuda/synchronization/context/auto | 1174 ns | 1142.4 ns | 1.03 |
| cuda/synchronization/context/blocking | 945.3255813953489 ns | 918.9428571428572 ns | 1.03 |
| cuda/synchronization/context/nonblocking | 7649 ns | 7080.9 ns | 1.08 |
| cuda/synchronization/stream/auto | 974.7777777777778 ns | 983.3846153846154 ns | 0.99 |
| cuda/synchronization/stream/blocking | 791.6862745098039 ns | 819.060606060606 ns | 0.97 |
| cuda/synchronization/stream/nonblocking | 7291.200000000001 ns | 7951.1 ns | 0.92 |
| integration/byval/reference | 143814 ns | 143938 ns | 1.00 |
| integration/byval/slices=1 | 145593 ns | 145713 ns | 1.00 |
| integration/byval/slices=2 | 284337 ns | 284653 ns | 1.00 |
| integration/byval/slices=3 | 422732 ns | 422978 ns | 1.00 |
| integration/cudadevrt | 102289 ns | 102487 ns | 1.00 |
| integration/volumerhs | 23451596 ns | 23490453 ns | 1.00 |
| kernel/indexing | 13244 ns | 13409 ns | 0.99 |
| kernel/indexing_checked | 14046 ns | 14040 ns | 1.00 |
| kernel/launch | 2248.8888888888887 ns | 2267.4444444444443 ns | 0.99 |
| kernel/occupancy | 702.8958333333334 ns | 828.1463414634146 ns | 0.85 |
| kernel/rand | 14343 ns | 14466 ns | 0.99 |
| latency/import | 3824776930.5 ns | 3809615751 ns | 1.00 |
| latency/precompile | 4599001518 ns | 4560767959.5 ns | 1.01 |
| latency/ttfp | 4432304043.5 ns | 4401798424 ns | 1.01 |

This comment was automatically generated by workflow using github-action-benchmark.

@termi-official
Contributor Author

Oh, right. Sorry. I really just took CUDA: @gputhrow from somewhere and assumed it was part of the public API.

We would like to do the following in our package:

@device_override @noinline Ferrite.throw_detJ_not_pos(detJ) = @gputhrow("ArgumentError", "det(J) is not positive. Please check the value on CPU.")

to still give reasonable error messages. Would it be possible to add @gputhrow to the public API, or should we do something else?

@maleadt
Member

maleadt commented Apr 17, 2026

The functionality is too closely tied to internal state for it to become public, I think, especially because I plan to rework exception handling, albeit at some undetermined point in the future. Would it be fine if you lose some of the reporting accuracy and simply do what we used to do before:

@cuprintln "ERROR: " $(args...) "."
throw(nothing)
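
Applied to the use case from earlier in this thread, that suggestion might look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the recommended or official pattern: `throw_detJ_not_pos` here is a hypothetical stand-in defined locally (not the actual Ferrite function), its CPU error message is made up, and `@device_override` is an internal CUDA.jl macro that may change.

```julia
using CUDA
using CUDA: @device_override  # internal macro, not documented public API (assumption: still available)

# Hypothetical stand-in for Ferrite.throw_detJ_not_pos: the regular CPU method.
@noinline throw_detJ_not_pos(detJ) =
    throw(ArgumentError("det(J) is not positive: det(J) = $detJ"))

# GPU-only replacement: print a message from the device, then abort.
# throw(nothing) avoids constructing an exception object in device code.
@device_override @noinline function throw_detJ_not_pos(detJ)
    @cuprintln("ERROR: det(J) is not positive. Please check the value on CPU.")
    throw(nothing)
end
```

On the CPU the original method is still the one that runs and still throws an `ArgumentError`; the override should only be picked up when the function is compiled for the GPU, where the printed line is the only detail reported beyond a generic kernel exception.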

@termi-official
Contributor Author

I see. We will stick with whatever is currently recommended.

@termi-official termi-official changed the title from "Export gputhrow again" to "Export gputhrow" on Apr 17, 2026