v0.7.0
Features
- Convert libcudacxx to CCCL by @cliffburdick in #501
- Add PreRun and tests for at/clone/diag operators by @tbensonatl in #502
- Add explicit FFT length to fft_conv example by @tbensonatl in #503
- Add Pre/PostRun support for collapse, concat ops by @tbensonatl in #506
- polyval operator by @cliffburdick in #508
- Optimize resample poly kernels by @tbensonatl in #512
- Allow negative indexing on slices by @cliffburdick in #516
- Automatically publish docs to GH Pages on merge to main by @tmartin-gh in #520
- Add configurable precision support of print(). by @AtomicVar in #521
- Make matxHalf trivially copyable by @tbensonatl in #513
- Added operator for matvec by @cliffburdick in #514
- New rapids and nvbench by @cliffburdick in #529
Fixes
- Add FFT1D tensor size checks by @tbensonatl in #499
- Fix errors that caused some unit tests to fail to compile. by @AtomicVar in #504
- Fix upsample output size by @cliffburdick in #507
- removing print characters accidentally left behind by @tylera-nvidia in #510
- Renamed host executor and prepared for multi-threaded additions by @cliffburdick in #511
- removing old hardcoded limit for repmat rank size by @tylera-nvidia in #515
- Avoid async alloc in some Cholesky decomp cases by @tbensonatl in #517
- Workaround for maybe_unused parse bug in old gcc by @tbensonatl in #522
- Fix matvec output dims to match A rather than B by @tbensonatl in #523
- Remove CUDA system include by @cliffburdick in #525
- Zero-initialize batches field in CUB params by @tbensonatl in #527
- Fixing host include guard on resample poly by @cliffburdick in #528
- Update device.h for host compiler by @cliffburdick in #530
- Made allocator an inline function by @cliffburdick in #532
- Build and publish documentation on merge to main by @tmartin-gh in #533
- Remove doxygen parameter to match tensor_t constructor signature by @tmartin-gh in #534
- Update iterator.h by @cliffburdick in #536
- Update Bug Report Issue Template by @AtomicVar in #539
- Fix CCCL libcudacxx path by @cliffburdick in #537
- Check matmul types and error at compile-time if the backend doesn't support them by @cliffburdick in #540
- Fix batched cov transform by @tbensonatl in #541
- Update caching for transforms to fix all leaks reported by compute-sanitizer by @cliffburdick in #542
- Update docs for v0.7.0 by @cliffburdick in #544
Full Changelog: v0.6.0...v0.7.0