Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## master #3107 +/- ##
=======================================
Coverage 90.64% 90.64%
=======================================
Files 143 143
Lines 12191 12191
=======================================
Hits 11051 11051
Misses 1140 1140
☔ View full report in Codecov by Sentry.
CUDA.jl Benchmarks
| Benchmark suite | Current: ecdfc1a | Previous: af6961a | Ratio |
|---|---|---|---|
| array/accumulate/Float32/1d | 101226 ns | 101502.5 ns | 1.00 |
| array/accumulate/Float32/dims=1 | 76805 ns | 76629.5 ns | 1.00 |
| array/accumulate/Float32/dims=1L | 1584907 ns | 1585691.5 ns | 1.00 |
| array/accumulate/Float32/dims=2 | 143689.5 ns | 143840 ns | 1.00 |
| array/accumulate/Float32/dims=2L | 657764.5 ns | 657866.5 ns | 1.00 |
| array/accumulate/Int64/1d | 118674 ns | 119085 ns | 1.00 |
| array/accumulate/Int64/dims=1 | 80363 ns | 79723 ns | 1.01 |
| array/accumulate/Int64/dims=1L | 1694284 ns | 1694532 ns | 1.00 |
| array/accumulate/Int64/dims=2 | 156487 ns | 156114 ns | 1.00 |
| array/accumulate/Int64/dims=2L | 962187 ns | 961892 ns | 1.00 |
| array/broadcast | 20316 ns | 20629 ns | 0.98 |
| array/construct | 1258.4 ns | 1256.5 ns | 1.00 |
| array/copy | 18088 ns | 17886 ns | 1.01 |
| array/copyto!/cpu_to_gpu | 213698 ns | 212623 ns | 1.01 |
| array/copyto!/gpu_to_cpu | 282892 ns | 283142 ns | 1.00 |
| array/copyto!/gpu_to_gpu | 10914 ns | 10702 ns | 1.02 |
| array/iteration/findall/bool | 134062 ns | 134713 ns | 1.00 |
| array/iteration/findall/int | 148763 ns | 149574 ns | 0.99 |
| array/iteration/findfirst/bool | 80671 ns | 81484 ns | 0.99 |
| array/iteration/findfirst/int | 82634 ns | 83536 ns | 0.99 |
| array/iteration/findmin/1d | 84122.5 ns | 85967.5 ns | 0.98 |
| array/iteration/findmin/2d | 116483 ns | 116482 ns | 1.00 |
| array/iteration/logical | 197879.5 ns | 199753 ns | 0.99 |
| array/iteration/scalar | 66035 ns | 67105.5 ns | 0.98 |
| array/permutedims/2d | 51838 ns | 52568 ns | 0.99 |
| array/permutedims/3d | 52111 ns | 53057 ns | 0.98 |
| array/permutedims/4d | 51081 ns | 51868.5 ns | 0.98 |
| array/random/rand/Float32 | 13512 ns | 12842 ns | 1.05 |
| array/random/rand/Int64 | 24884 ns | 25396 ns | 0.98 |
| array/random/rand!/Float32 | 9984.333333333334 ns | 8414.666666666666 ns | 1.19 |
| array/random/rand!/Int64 | 21681 ns | 21981 ns | 0.99 |
| array/random/randn/Float32 | 42783 ns | 37382.5 ns | 1.14 |
| array/random/randn!/Float32 | 30640 ns | 30580 ns | 1.00 |
| array/reductions/mapreduce/Float32/1d | 34017 ns | 34346 ns | 0.99 |
| array/reductions/mapreduce/Float32/dims=1 | 40346.5 ns | 39918.5 ns | 1.01 |
| array/reductions/mapreduce/Float32/dims=1L | 51223 ns | 51509 ns | 0.99 |
| array/reductions/mapreduce/Float32/dims=2 | 56366 ns | 56563 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2L | 68860 ns | 69236 ns | 0.99 |
| array/reductions/mapreduce/Int64/1d | 41464 ns | 42783 ns | 0.97 |
| array/reductions/mapreduce/Int64/dims=1 | 42332.5 ns | 42869 ns | 0.99 |
| array/reductions/mapreduce/Int64/dims=1L | 87456 ns | 87545 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2 | 59674 ns | 59325 ns | 1.01 |
| array/reductions/mapreduce/Int64/dims=2L | 84576 ns | 85131 ns | 0.99 |
| array/reductions/reduce/Float32/1d | 34268 ns | 34338 ns | 1.00 |
| array/reductions/reduce/Float32/dims=1 | 39919 ns | 48895 ns | 0.82 |
| array/reductions/reduce/Float32/dims=1L | 51431 ns | 51567 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2 | 56418 ns | 56622 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2L | 69423 ns | 69625 ns | 1.00 |
| array/reductions/reduce/Int64/1d | 41659 ns | 42697 ns | 0.98 |
| array/reductions/reduce/Int64/dims=1 | 42000 ns | 47176.5 ns | 0.89 |
| array/reductions/reduce/Int64/dims=1L | 87119 ns | 87361 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2 | 59568 ns | 59337 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2L | 84307.5 ns | 84717 ns | 1.00 |
| array/reverse/1d | 17870 ns | 18051 ns | 0.99 |
| array/reverse/1dL | 68457 ns | 68628 ns | 1.00 |
| array/reverse/1dL_inplace | 65709 ns | 65835 ns | 1.00 |
| array/reverse/1d_inplace | 8469.333333333334 ns | 8676.333333333334 ns | 0.98 |
| array/reverse/2d | 20545 ns | 20541 ns | 1.00 |
| array/reverse/2dL | 72608 ns | 72496 ns | 1.00 |
| array/reverse/2dL_inplace | 65808 ns | 65804 ns | 1.00 |
| array/reverse/2d_inplace | 10005 ns | 10104 ns | 0.99 |
| array/sorting/1d | 2735266 ns | 2734890.5 ns | 1.00 |
| array/sorting/2d | 1068236 ns | 1069333 ns | 1.00 |
| array/sorting/by | 3303740 ns | 3305442 ns | 1.00 |
| cuda/synchronization/context/auto | 1120.8 ns | 1142.4 ns | 0.98 |
| cuda/synchronization/context/blocking | 882.7142857142857 ns | 918.9428571428572 ns | 0.96 |
| cuda/synchronization/context/nonblocking | 7518.8 ns | 7080.9 ns | 1.06 |
| cuda/synchronization/stream/auto | 985.6666666666666 ns | 983.3846153846154 ns | 1.00 |
| cuda/synchronization/stream/blocking | 806.5888888888888 ns | 819.060606060606 ns | 0.98 |
| cuda/synchronization/stream/nonblocking | 7448.1 ns | 7951.1 ns | 0.94 |
| integration/byval/reference | 143790 ns | 143938 ns | 1.00 |
| integration/byval/slices=1 | 145763 ns | 145713 ns | 1.00 |
| integration/byval/slices=2 | 284558 ns | 284653 ns | 1.00 |
| integration/byval/slices=3 | 423114 ns | 422978 ns | 1.00 |
| integration/cudadevrt | 102372 ns | 102487 ns | 1.00 |
| integration/volumerhs | 23426831 ns | 23490453 ns | 1.00 |
| kernel/indexing | 13147 ns | 13409 ns | 0.98 |
| kernel/indexing_checked | 13847 ns | 14040 ns | 0.99 |
| kernel/launch | 2011.5 ns | 2267.4444444444443 ns | 0.89 |
| kernel/occupancy | 666.2919254658385 ns | 828.1463414634146 ns | 0.80 |
| kernel/rand | 14327 ns | 14466 ns | 0.99 |
| latency/import | 3835314320.5 ns | 3809615751 ns | 1.01 |
| latency/precompile | 4581629321 ns | 4560767959.5 ns | 1.00 |
| latency/ttfp | 4393995913 ns | 4401798424 ns | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
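
For readers skimming the benchmark comment, the Ratio column appears to be the current timing divided by the previous one, rounded to two decimals, so values below 1.00 mean the current commit was faster. The sketch below reproduces that arithmetic on a few rows copied from the table and lists the ones that move by more than a tolerance. The helper names `ratio` and `outliers` and the 10% tolerance are illustrative assumptions, not part of github-action-benchmark or the CUDA.jl CI configuration.

```python
# Rows copied from the benchmark table: (name, current_ns, previous_ns).
ROWS = [
    ("array/accumulate/Float32/1d", 101226.0, 101502.5),
    ("array/random/rand!/Float32", 9984.333333333334, 8414.666666666666),
    ("array/reductions/reduce/Float32/dims=1", 39919.0, 48895.0),
    ("kernel/occupancy", 666.2919254658385, 828.1463414634146),
]


def ratio(current_ns, previous_ns):
    """Current time divided by previous time; below 1.0 means faster."""
    return current_ns / previous_ns


def outliers(rows, tolerance=0.10):
    """Return (name, rounded ratio) for rows whose ratio is more than
    `tolerance` away from 1.0. The tolerance is a hypothetical threshold."""
    result = []
    for name, current_ns, previous_ns in rows:
        r = ratio(current_ns, previous_ns)
        if abs(r - 1.0) > tolerance:
            result.append((name, round(r, 2)))
    return result


if __name__ == "__main__":
    for name, r in outliers(ROWS):
        print(f"{name}: {r:.2f}")
```

Single-run differences of this size on microsecond-scale benchmarks may be noise, so a flagged row is a prompt to rerun the benchmark and not evidence of a regression on its own.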
Also serves as a CI run with CUDA 13.2 Update 1
x-ref JuliaPackaging/Yggdrasil#13526