|
Your PR no longer requires formatting changes. Thank you for your contribution! |
CUDA.jl Benchmarks
| Benchmark suite | Current: f0f26be | Previous: d223d48 | Ratio |
|---|---|---|---|
| latency/precompile | 45875773921 ns | 45641120785 ns | 1.01 |
| latency/ttfp | 6494221270.5 ns | 6558560140 ns | 0.99 |
| latency/import | 3171040795 ns | 3191866733 ns | 0.99 |
| integration/volumerhs | 9628852.5 ns | 9611544 ns | 1.00 |
| integration/byval/slices=1 | 147055 ns | 146796 ns | 1.00 |
| integration/byval/slices=3 | 425122 ns | 425150 ns | 1.00 |
| integration/byval/reference | 145016 ns | 145127 ns | 1.00 |
| integration/byval/slices=2 | 286279 ns | 286038 ns | 1.00 |
| integration/cudadevrt | 103422 ns | 103471 ns | 1.00 |
| kernel/indexing | 14256 ns | 14169 ns | 1.01 |
| kernel/indexing_checked | 14669 ns | 14963 ns | 0.98 |
| kernel/occupancy | 673.1834319526628 ns | 657.8086419753087 ns | 1.02 |
| kernel/launch | 2082.95 ns | 2072.7 ns | 1.00 |
| kernel/rand | 14702 ns | 16123 ns | 0.91 |
| array/reverse/1d | 19141 ns | 19951 ns | 0.96 |
| array/reverse/2d | 25088 ns | 25142 ns | 1.00 |
| array/reverse/1d_inplace | 11327 ns | 11456 ns | 0.99 |
| array/reverse/2d_inplace | 12925 ns | 13247.5 ns | 0.98 |
| array/copy | 20990 ns | 21071 ns | 1.00 |
| array/iteration/findall/int | 158173 ns | 157380 ns | 1.01 |
| array/iteration/findall/bool | 139025 ns | 138848.5 ns | 1.00 |
| array/iteration/findfirst/int | 154294.5 ns | 154329 ns | 1.00 |
| array/iteration/findfirst/bool | 155161 ns | 154809.5 ns | 1.00 |
| array/iteration/scalar | 72703 ns | 70606 ns | 1.03 |
| array/iteration/logical | 213855.5 ns | 214919.5 ns | 1.00 |
| array/iteration/findmin/1d | 41651 ns | 41444 ns | 1.00 |
| array/iteration/findmin/2d | 93829.5 ns | 94472 ns | 0.99 |
| array/reductions/reduce/1d | 43150.5 ns | 35629 ns | 1.21 |
| array/reductions/reduce/2d | 51433 ns | 50066 ns | 1.03 |
| array/reductions/mapreduce/1d | 39035 ns | 33606 ns | 1.16 |
| array/reductions/mapreduce/2d | 48288 ns | 43968.5 ns | 1.10 |
| array/broadcast | 20797 ns | 20847 ns | 1.00 |
| array/copyto!/gpu_to_gpu | 11879 ns | 11929 ns | 1.00 |
| array/copyto!/cpu_to_gpu | 208979 ns | 209164.5 ns | 1.00 |
| array/copyto!/gpu_to_cpu | 243581 ns | 243460 ns | 1.00 |
| array/accumulate/1d | 108404 ns | 109349 ns | 0.99 |
| array/accumulate/2d | 79788.5 ns | 80109.5 ns | 1.00 |
| array/construct | 1310.5 ns | 1250 ns | 1.05 |
| array/random/randn/Float32 | 44854 ns | 43560 ns | 1.03 |
| array/random/randn!/Float32 | 26417 ns | 26514 ns | 1.00 |
| array/random/rand!/Int64 | 27264 ns | 27315 ns | 1.00 |
| array/random/rand!/Float32 | 8727.666666666666 ns | 8798 ns | 0.99 |
| array/random/rand/Int64 | 29920 ns | 30086 ns | 0.99 |
| array/random/rand/Float32 | 13011 ns | 13099 ns | 0.99 |
| array/permutedims/4d | 60894 ns | 61425.5 ns | 0.99 |
| array/permutedims/2d | 55591 ns | 55688 ns | 1.00 |
| array/permutedims/3d | 56320 ns | 56210.5 ns | 1.00 |
| array/sorting/1d | 2776432 ns | 2777197.5 ns | 1.00 |
| array/sorting/by | 3367294 ns | 3368328 ns | 1.00 |
| array/sorting/2d | 1084782 ns | 1084987 ns | 1.00 |
| cuda/synchronization/stream/auto | 1044.6 ns | 1036.1 ns | 1.01 |
| cuda/synchronization/stream/nonblocking | 6568.8 ns | 6554.8 ns | 1.00 |
| cuda/synchronization/stream/blocking | 841.1627906976744 ns | 807.505376344086 ns | 1.04 |
| cuda/synchronization/context/auto | 1186.5 ns | 1182.7 ns | 1.00 |
| cuda/synchronization/context/nonblocking | 6806.8 ns | 6792 ns | 1.00 |
| cuda/synchronization/context/blocking | 928.1395348837209 ns | 912.046511627907 ns | 1.02 |

This comment was automatically generated by workflow using github-action-benchmark.
|
LGTM, but CI failures are related. |
|
Now that I look at it, I think these methods are just invalid. You can't supply a |
Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files

| Coverage Diff | master | #2700 | +/- |
|---|---|---|---|
| Coverage | 83.51% | 83.60% | +0.08% |
| Files | 153 | 153 | |
| Lines | 13606 | 13592 | -14 |
| Hits | 11363 | 11363 | |
| Misses | 2243 | 2229 | -14 |

☔ View full report in Codecov by Sentry.
|
|
@kshyatt @maleadt I explained why I did that in a previous PR: I can use these constructors to perform the symbolic analysis for SpMV / SpMM / SpSV and SpSM, as well as to allocate the buffers for a given number of right-hand sides. |
|
Update: It was in these PRs that I explained why I did that: |
|
It probably broke some of our other optimization packages, such as CUSOLVERRF.jl, because we preallocate the buffers for sparse triangular solves. @maleadt Is it fine to add them back in CUDA.jl and do a small new release, 5.7.2? |
|
Sorry about that! Definitely add them back - is there a way we can add tests too? |
|
It would also be good to add an inline comment next to the restored methods explaining the above. Sorry again, 100% my fault. |
|
Thanks @kshyatt !! |
No description provided.