Release v0.9.2 · NVIDIA/MatX

New operator: interp

Other Additions:

Improvements to sparse support, including a new batched tri-diagonal solver
Automatic vectorization and instruction-level parallelism (ILP) support
DLPack updated to 1.1
Many bug fixes
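
For readers unfamiliar with tri-diagonal solves, here is a minimal sketch of the classic Thomas algorithm in plain C++. It is not MatX code and does not show how the MatX solver is implemented. The function name `solve_tridiagonal` and the storage layout (three equal-length vectors) are assumptions made for illustration only. A batched solve applies the same procedure independently to each system in the batch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of MatX): solve one tri-diagonal system
// with the Thomas algorithm, without pivoting.
//   lower[i] : sub-diagonal entry of row i   (lower[0] is unused)
//   diag[i]  : main-diagonal entry of row i
//   upper[i] : super-diagonal entry of row i (upper[n-1] is unused)
// Assumes the matrix is safe to eliminate without pivoting,
// for example because it is diagonally dominant.
std::vector<double> solve_tridiagonal(const std::vector<double>& lower,
                                      const std::vector<double>& diag,
                                      const std::vector<double>& upper,
                                      const std::vector<double>& rhs) {
  const std::size_t n = diag.size();
  assert(n > 0 && lower.size() == n && upper.size() == n && rhs.size() == n);

  // Forward sweep: eliminate the sub-diagonal.
  std::vector<double> c(n), d(n);
  c[0] = upper[0] / diag[0];
  d[0] = rhs[0] / diag[0];
  for (std::size_t i = 1; i < n; ++i) {
    const double denom = diag[i] - lower[i] * c[i - 1];
    c[i] = upper[i] / denom;
    d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom;
  }

  // Back substitution, from the last row up to the first.
  std::vector<double> x(n);
  x[n - 1] = d[n - 1];
  for (std::size_t i = n - 1; i-- > 0;) {
    x[i] = d[i] - c[i] * x[i + 1];
  }
  return x;
}
```

The sweep costs O(n) per system, which is why tri-diagonal systems are usually handled by a dedicated solver rather than a general dense or sparse factorization.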

What's Changed

Fix partial any/all reduction by @simonbyrne in #959
interp1: add support for higher dimensional sample points and values by @simonbyrne in #963
Introduce DIA and SkewDIA format by @aartbik in #964
Refactor MATX_CUDA_CHECK to prevent multiple evaluation by @tmartin-gh in #957
Introduce DIA format factory method by @aartbik in #965
reformat sparse files with clang-format by @aartbik in #966
Implement DIA SpMV kernel by @aartbik in #967
Generalize SpMV from square to m x n DIA by @aartbik in #969
replace static_assert(false) with host-only THROW by @aartbik in #968
Generalize DIA to DIA-I and DIA-J by @aartbik in #972
Avoid name collision with cpu_set_t from sched.h by @tbensonatl in #971
Add axis argument to interp1. by @simonbyrne in #970
Add operator tests back by @cliffburdick in #977
clang-format on sparse tests by @aartbik in #973
Add SpMV test for DIA-I and DIA-J by @aartbik in #974
(re) enable all sparse tests by @aartbik in #979
Let X = solve(A, B) take X and B along rows by @aartbik in #981
Add tri-diagonal solve support by @aartbik in #982
update doc with latest DIA support by @aartbik in #983
minor sparse documentation refinement by @aartbik in #984
Updating Google Test by @cliffburdick in #985
Minor fix in UST level order for DIA by @aartbik in #986
Vectorization and ILP by @cliffburdick in #980
Fixing compile error with FFT conv by @cliffburdick in #989
Fixing another 12.9 compiler bug by @cliffburdick in #991
Removing unused parameter in lambda causing error on clang by @cliffburdick in #992
proper lvl2dim computation for add/sub by @aartbik in #994
add braces to if-then-else by @aartbik in #997
Avoid fmod becoming ambiguous once CCCL specializes it for extended floating point types by @miscco in #996
clang formatting by @aartbik in #998
implement batched tri-diagonal direct solve by @aartbik in #999
add streams to alloc/free in cusparse sequences by @aartbik in #1001
test for batched tri-diag direct solver by @aartbik in #1000
fix minor typos in comments by @aartbik in #1002
DLPack 1.1 update by @cliffburdick in #1004
Fix host compiler errors when using -Wall -Werror by @tmartin-gh in #1006
Fix ARM relocation truncation build errors by @dylan-eustice in #1008
Allocate pinned host memory instead of managed when managed isn't available by @cliffburdick in #1010
Added executor to cache by @cliffburdick in #1009
Remove template parameters in constructor by @cliffburdick in #1012
fix flipud for 1D tensors by @simonbyrne in #1011
Fix warnings in clang19 by @cliffburdick in #1015
Missing unit test syncs by @dylan-eustice in #1013
add convenience constructor for batched tri diag sparse tensor by @aartbik in #1019
Remove runtime checks on memory spaces by @aartbik in #1018
build each test file as a separate executable by @simonbyrne in #1017
use batched sparse solve for interp by @simonbyrne in #1016
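
Several of the entries above concern the DIA (diagonal) sparse format and SpMV on m x n matrices. As a rough illustration of the idea, here is a host-side sketch in plain C++. It is not the MatX kernel or API: the `DiaMatrix` struct, the `dia_spmv` function, and the choice to index each stored diagonal by row are all assumptions for this example, and the sketch does not claim to match either the DIA-I or the DIA-J layout named in the list.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical DIA container (not a MatX type) for a rows x cols matrix.
// offsets[d] is the column-minus-row offset of stored diagonal d.
// values[d * rows + i] holds A(i, i + offsets[d]); slots whose column
// falls outside the matrix are padding and are never read.
struct DiaMatrix {
  std::size_t rows;
  std::size_t cols;
  std::vector<long> offsets;
  std::vector<double> values;
};

// Compute y = A * x for a general (possibly non-square) DIA matrix.
std::vector<double> dia_spmv(const DiaMatrix& a, const std::vector<double>& x) {
  assert(x.size() == a.cols);
  assert(a.values.size() == a.offsets.size() * a.rows);

  std::vector<double> y(a.rows, 0.0);
  for (std::size_t d = 0; d < a.offsets.size(); ++d) {
    for (std::size_t i = 0; i < a.rows; ++i) {
      // Column touched by diagonal d in row i.
      const long j = static_cast<long>(i) + a.offsets[d];
      if (j >= 0 && j < static_cast<long>(a.cols)) {
        y[i] += a.values[d * a.rows + i] * x[static_cast<std::size_t>(j)];
      }
    }
  }
  return y;
}
```

For example, the 2 x 3 matrix with rows (1, 2, 0) and (0, 3, 4) can be stored with offsets {0, 1} and values {1, 3, 2, 4}. Because column indices are implied by the offsets, no per-element index array is needed, which is the main appeal of DIA for banded matrices.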

New Contributors

@miscco made their first contribution in #996
@dylan-eustice made their first contribution in #1008

Full Changelog: v0.9.1...v0.9.2
