v0.9.2
New operator: interp
Other Additions:
- Improvements to sparse support, including a new batched tri-diagonal solver
- Automatic vectorization and instruction-level parallelism (ILP) support
- DLPack updated to 1.1
- Many bug fixes
What's Changed
- Fix partial any/all reduction by @simonbyrne in #959
- interp1: add support for higher dimensional sample points and values by @simonbyrne in #963
- Introduce DIA and SkewDIA format by @aartbik in #964
- Refactor MATX_CUDA_CHECK to prevent multiple evaluation by @tmartin-gh in #957
- Introduce DIA format factory method by @aartbik in #965
- reformat sparse files with clang-format by @aartbik in #966
- Implement DIA SpMV kernel by @aartbik in #967
- Generalize SpMV from square to m x n DIA by @aartbik in #969
- replace static_assert(false) with host-only THROW by @aartbik in #968
- Generalize DIA to DIA-I and DIA-J by @aartbik in #972
- Avoid name collision with cpu_set_t from sched.h by @tbensonatl in #971
- Add axis argument to interp1 by @simonbyrne in #970
- Add operator tests back by @cliffburdick in #977
- clang-format on sparse tests by @aartbik in #973
- Add SpMV test for DIA-I and DIA-J by @aartbik in #974
- (re) enable all sparse tests by @aartbik in #979
- Let X = solve(A, B) take X and B along rows by @aartbik in #981
- Add tri-diagonal solve support by @aartbik in #982
- update doc with latest DIA support by @aartbik in #983
- minor sparse documentation refinement by @aartbik in #984
- Updating Google Test by @cliffburdick in #985
- Minor fix in UST level order for DIA by @aartbik in #986
- Vectorization and ILP by @cliffburdick in #980
- Fixing compile error with FFT conv by @cliffburdick in #989
- Fixing another 12.9 compiler bug by @cliffburdick in #991
- Removing unused parameter in lambda causing error on clang by @cliffburdick in #992
- proper lvl2dim computation for add/sub by @aartbik in #994
- add braces to if-then-else by @aartbik in #997
- Avoid fmod becoming ambiguous once CCCL specializes it for extended floating point types by @miscco in #996
- clang formatting by @aartbik in #998
- implement batched tri-diagonal direct solve by @aartbik in #999
- add streams to alloc/free in cusparse sequences by @aartbik in #1001
- test for batched tri-diag direct solver by @aartbik in #1000
- fix minor typos in comments by @aartbik in #1002
- DLPack 1.1 update by @cliffburdick in #1004
- Fix host compiler errors when using -Wall -Werror by @tmartin-gh in #1006
- Fix ARM relocation truncation build errors by @dylan-eustice in #1008
- Allocate pinned host memory instead of managed when managed isn't available by @cliffburdick in #1010
- Added executor to cache by @cliffburdick in #1009
- Remove template parameters in constructor by @cliffburdick in #1012
- fix flipud for 1D tensors by @simonbyrne in #1011
- Fix warnings in clang19 by @cliffburdick in #1015
- Missing unit test syncs by @dylan-eustice in #1013
- add convenience constructor for batched tri diag sparse tensor by @aartbik in #1019
- Remove runtime checks on memory spaces by @aartbik in #1018
- build each test file as a separate executable by @simonbyrne in #1017
- use batched sparse solve for interp by @simonbyrne in #1016
New Contributors
- @miscco made their first contribution in #996
- @dylan-eustice made their first contribution in #1008
Full Changelog: v0.9.1...v0.9.2