Added Base.similar methods for CuSparseMatrixCOO and BSR #3114

Open
rainerrodrigues wants to merge 4 commits into JuliaGPU:master from rainerrodrigues:add-sparse-similar

Conversation

@rainerrodrigues

This PR adds the missing Base.similar methods for CuSparseMatrixCOO and CuSparseMatrixBSR, allowing them to fall back gracefully without converting to dense CPU arrays.

Fixes #3061
Fixes #3055
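
For readers unfamiliar with why a format-specific `Base.similar` matters, here is a minimal CPU-only sketch. `ToyCOO` is a hypothetical stand-in invented for illustration; it is not CUDA.jl's `CuSparseMatrixCOO`, and none of this code is taken from the PR's diff. It assumes only that the container subtypes `AbstractMatrix`, in which case Julia's generic `similar(A, T, dims)` fallback allocates a dense `Array`.

```julia
# Hypothetical CPU-only COO container (illustration only, not CUDA.jl's type).
struct ToyCOO{Tv,Ti<:Integer} <: AbstractMatrix{Tv}
    rowInd::Vector{Ti}   # row index of each stored entry
    colInd::Vector{Ti}   # column index of each stored entry
    nzVal::Vector{Tv}    # value of each stored entry
    dims::NTuple{2,Int}  # matrix dimensions
end

Base.size(A::ToyCOO) = A.dims

# Scalar lookup (sums duplicate entries); only here to satisfy the AbstractMatrix interface.
function Base.getindex(A::ToyCOO{Tv}, i::Int, j::Int) where {Tv}
    @boundscheck checkbounds(A, i, j)
    s = zero(Tv)
    for k in eachindex(A.nzVal)
        if A.rowInd[k] == i && A.colInd[k] == j
            s += A.nzVal[k]
        end
    end
    return s
end

# Keep the sparsity pattern, change the element type, leave the values uninitialized.
# Base's one-argument `similar(A)` forwards to this method with `eltype(A)`.
Base.similar(A::ToyCOO, ::Type{S}) where {S} =
    ToyCOO(copy(A.rowInd), copy(A.colInd), similar(A.nzVal, S), A.dims)

A = ToyCOO([1, 2], [1, 3], [1.0, 2.0], (2, 3))
B = similar(A, Float32)           # stays a ToyCOO, now with Float32 values
D = similar(A, Float32, size(A))  # no method defined for dims here: Base's fallback returns a dense Matrix
```

The last line shows the behavior the PR description wants to avoid: without a dedicated method, the result silently becomes a dense array.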

Comment thread lib/cusparse/src/array.jl Outdated
@kshyatt
Member

kshyatt commented Apr 21, 2026

Also, can some tests be added?

Contributor

@github-actions github-actions Bot left a comment

CUDA.jl Benchmarks

| Benchmark suite | Current: f08a059 | Previous: e4ac81a | Ratio |
|---|---|---|---|
| array/accumulate/Float32/1d | 101247 ns | 101073 ns | 1.00 |
| array/accumulate/Float32/dims=1 | 76743 ns | 76196 ns | 1.01 |
| array/accumulate/Float32/dims=1L | 1585917 ns | 1585166 ns | 1.00 |
| array/accumulate/Float32/dims=2 | 143733 ns | 143846 ns | 1.00 |
| array/accumulate/Float32/dims=2L | 658278 ns | 657343 ns | 1.00 |
| array/accumulate/Int64/1d | 118799 ns | 118428 ns | 1.00 |
| array/accumulate/Int64/dims=1 | 80961 ns | 79813 ns | 1.01 |
| array/accumulate/Int64/dims=1L | 1707513 ns | 1706332.5 ns | 1.00 |
| array/accumulate/Int64/dims=2 | 156699.5 ns | 155958.5 ns | 1.00 |
| array/accumulate/Int64/dims=2L | 962521 ns | 961689 ns | 1.00 |
| array/broadcast | 20622 ns | 20223 ns | 1.02 |
| array/construct | 1249.6 ns | 1268 ns | 0.99 |
| array/copy | 18153 ns | 18010.5 ns | 1.01 |
| array/copyto!/cpu_to_gpu | 214356 ns | 214386 ns | 1.00 |
| array/copyto!/gpu_to_cpu | 283694 ns | 282599 ns | 1.00 |
| array/copyto!/gpu_to_gpu | 10887 ns | 10725 ns | 1.02 |
| array/iteration/findall/bool | 135104 ns | 133957 ns | 1.01 |
| array/iteration/findall/int | 150112 ns | 148817 ns | 1.01 |
| array/iteration/findfirst/bool | 81558 ns | 80695 ns | 1.01 |
| array/iteration/findfirst/int | 82867 ns | 82681 ns | 1.00 |
| array/iteration/findmin/1d | 83372 ns | 85081 ns | 0.98 |
| array/iteration/findmin/2d | 116997 ns | 116308 ns | 1.01 |
| array/iteration/logical | 200714.5 ns | 196867 ns | 1.02 |
| array/iteration/scalar | 66145 ns | 66869 ns | 0.99 |
| array/permutedims/2d | 52371.5 ns | 51940.5 ns | 1.01 |
| array/permutedims/3d | 52407 ns | 52252 ns | 1.00 |
| array/permutedims/4d | 52082.5 ns | 51176 ns | 1.02 |
| array/random/rand/Float32 | 13014 ns | 13404 ns | 0.97 |
| array/random/rand/Int64 | 25431 ns | 24711 ns | 1.03 |
| array/random/rand!/Float32 | 10177.666666666666 ns | 10187 ns | 1.00 |
| array/random/rand!/Int64 | 21989 ns | 21588 ns | 1.02 |
| array/random/randn/Float32 | 37480 ns | 43197 ns | 0.87 |
| array/random/randn!/Float32 | 30651 ns | 30898 ns | 0.99 |
| array/reductions/mapreduce/Float32/1d | 33980 ns | 34650 ns | 0.98 |
| array/reductions/mapreduce/Float32/dims=1 | 40495.5 ns | 39535.5 ns | 1.02 |
| array/reductions/mapreduce/Float32/dims=1L | 51393.5 ns | 51174 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2 | 56615 ns | 56317 ns | 1.01 |
| array/reductions/mapreduce/Float32/dims=2L | 69345 ns | 69320 ns | 1.00 |
| array/reductions/mapreduce/Int64/1d | 42363 ns | 42198 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=1 | 42214 ns | 41733 ns | 1.01 |
| array/reductions/mapreduce/Int64/dims=1L | 87299 ns | 86905 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2 | 59569 ns | 59221 ns | 1.01 |
| array/reductions/mapreduce/Int64/dims=2L | 84785.5 ns | 84434 ns | 1.00 |
| array/reductions/reduce/Float32/1d | 34384 ns | 34198.5 ns | 1.01 |
| array/reductions/reduce/Float32/dims=1 | 40444 ns | 39152.5 ns | 1.03 |
| array/reductions/reduce/Float32/dims=1L | 51476 ns | 51208.5 ns | 1.01 |
| array/reductions/reduce/Float32/dims=2 | 56764 ns | 56419 ns | 1.01 |
| array/reductions/reduce/Float32/dims=2L | 69652 ns | 69565 ns | 1.00 |
| array/reductions/reduce/Int64/1d | 42501 ns | 42368 ns | 1.00 |
| array/reductions/reduce/Int64/dims=1 | 42020 ns | 50017 ns | 0.84 |
| array/reductions/reduce/Int64/dims=1L | 87188 ns | 86919 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2 | 59606.5 ns | 59687 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2L | 84660.5 ns | 84484 ns | 1.00 |
| array/reverse/1d | 18051 ns | 17716 ns | 1.02 |
| array/reverse/1dL | 68725 ns | 68268 ns | 1.01 |
| array/reverse/1dL_inplace | 65738 ns | 65642 ns | 1.00 |
| array/reverse/1d_inplace | 8549.333333333334 ns | 10197.333333333334 ns | 0.84 |
| array/reverse/2d | 21092 ns | 20523 ns | 1.03 |
| array/reverse/2dL | 73335 ns | 72523 ns | 1.01 |
| array/reverse/2dL_inplace | 65750 ns | 65706 ns | 1.00 |
| array/reverse/2d_inplace | 9950 ns | 9831 ns | 1.01 |
| array/sorting/1d | 2736444 ns | 2735407.5 ns | 1.00 |
| array/sorting/2d | 1069601 ns | 1068528 ns | 1.00 |
| array/sorting/by | 3306636 ns | 3304139 ns | 1.00 |
| cuda/synchronization/context/auto | 1131.6 ns | 1165.7 ns | 0.97 |
| cuda/synchronization/context/blocking | 920.377358490566 ns | 876.8679245283018 ns | 1.05 |
| cuda/synchronization/context/nonblocking | 7005.6 ns | 6853.1 ns | 1.02 |
| cuda/synchronization/stream/auto | 981.1875 ns | 1035.7142857142858 ns | 0.95 |
| cuda/synchronization/stream/blocking | 802.4257425742575 ns | 789.9066666666666 ns | 1.02 |
| cuda/synchronization/stream/nonblocking | 7178.8 ns | 7513.5 ns | 0.96 |
| integration/byval/reference | 143915 ns | 143670 ns | 1.00 |
| integration/byval/slices=1 | 145772 ns | 145632 ns | 1.00 |
| integration/byval/slices=2 | 284572 ns | 284397 ns | 1.00 |
| integration/byval/slices=3 | 422978 ns | 423006 ns | 1.00 |
| integration/cudadevrt | 102395 ns | 102385 ns | 1.00 |
| integration/volumerhs | 23468528 ns | 23431934.5 ns | 1.00 |
| kernel/indexing | 13234 ns | 13020 ns | 1.02 |
| kernel/indexing_checked | 13957 ns | 13828 ns | 1.01 |
| kernel/launch | 2182.3333333333335 ns | 2128.1111111111113 ns | 1.03 |
| kernel/occupancy | 679.6064516129032 ns | 701.7571428571429 ns | 0.97 |
| kernel/rand | 16515 ns | 14086 ns | 1.17 |
| latency/import | 77746111924 ns | 3826333658 ns | 20.32 |
| latency/precompile | 8301636095.5 ns | 4608074622.5 ns | 1.80 |
| latency/ttfp | 78328318500 ns | 4404025901.5 ns | 17.79 |

This comment was automatically generated by workflow using github-action-benchmark.

Comment thread lib/cusparse/src/array.jl
Author

@rainerrodrigues rainerrodrigues left a comment

@kshyatt Hi, could you check whether this is suitable and extensive enough as testing?

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CUSPARSE] Missing appropriate similar methods
Missing sparse array methods for CuSparseMatrixCOO and CuSparseMatrixBSR

2 participants