Commit 7b69822

Updating docs and other files for release v0.6.0
1 parent 539c1b7 commit 7b69822

259 files changed, +13584 -4801 lines changed

CITATION.cff

Lines changed: 2 additions & 2 deletions
@@ -11,6 +11,6 @@ authors:
 given-names: "Adam"
 orcid: "https://orcid.org/0000-0001-9690-6357"
 title: "MatX Primitives Library for GPU-Accelerated Numerical Computing in C++"
-version: 0.1.0
-date-released: 2021-10-26
+version: 0.6.0
+date-released: 2023-10-02
 url: "https://github.com/NVIDIA/matx"

CMakeLists.txt

Lines changed: 1 addition & 1 deletion
@@ -55,7 +55,7 @@ endif()
 project(MATX
 LANGUAGES CUDA CXX
 DESCRIPTION "A modern and efficient header-only C++ library for numerical computing on GPU"
-VERSION 0.5.0
+VERSION 0.6.0
 HOMEPAGE_URL "https://github.com/NVIDIA/MatX")

 if (NOT CMAKE_CUDA_ARCHITECTURES)

README.md

Lines changed: 11 additions & 9 deletions
@@ -193,6 +193,17 @@ We provide a variety of training materials and examples to quickly learn the Mat
 - Finally, for new MatX developers, browsing the [example applications](examples) can provide familiarity with the API and best practices.

 ## Release Major Features
+*v0.6.0*:
+- Breaking changes
+  * This is the first release to use "transforms as operators". Transforms can now be used in any operator expression, whereas the previous release required them to be on separate lines. For an example, please see: https://nvidia.github.io/MatX/basics/fusion.html. This is also a breaking change to transform usage. Converting to the new format is as simple as moving the function parameters. For example: `matmul(C, A, B, stream);` becomes `(C = matmul(A,B)).run(stream);`.
+- Features
+  * Polyphase channelizer
+  * Many new operators, including upsample, downsample, pwelch, overlap, at, etc.
+  * Added more lvalue semantics for operators based on view manipulation
+- Bug fixes
+  * Fixed cache issues
+  * Fixed stride = 0 in matmul
+
 *v0.5.0*:
 * Polyphase resampler
 * Documentation overhaul with examples for each function
@@ -205,15 +216,6 @@ We provide a variety of training materials and examples to quickly learn the Mat
 * 16-bit float reductions
 * Output iterator support in CUB

-*v0.3.0*:
-* Many new operators, including `flatten`, `remap`, `lcollapse`. `rcollapse`, `fmod`, `clone`, `slice`
-* Extended N-D tensor support to more functions
-* Allow operators on reduction inputs
-* g++11 support
-* NVTX support
-* Many, many bug fixes
-
-
 ## Discussions
 We have an open discussions board [here](https://github.com/NVIDIA/MatX/discussions). We encourage any questions about the library to be posted here for other users to learn from and read through.
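The v0.6.0 note above says that `matmul(C, A, B, stream);` becomes `(C = matmul(A,B)).run(stream);`. The following is a minimal plain C++ sketch of the general lazy-operator pattern that makes such a call form possible. It is not MatX code: `Mat`, `MatMulOp`, `ScaleOp`, `Assign`, `matmul_sketch`, and `scale_sketch` are hypothetical names invented for this illustration, there is no stream or GPU involved, and the real MatX implementation may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <class Op> struct Assign;  // forward declaration for Mat::operator=

// Toy dense row-major matrix (hypothetical stand-in, not a MatX tensor).
struct Mat {
  std::size_t rows, cols;
  std::vector<double> v;
  Mat(std::size_t r, std::size_t c) : rows(r), cols(c), v(r * c, 0.0) {}
  double& operator()(std::size_t r, std::size_t c) { return v[r * cols + c]; }
  double operator()(std::size_t r, std::size_t c) const { return v[r * cols + c]; }
  // Assigning an operator does no work yet; it only builds a runnable object.
  template <class Op> Assign<Op> operator=(const Op& op);
};

// Lazy "transform": computes one output element on demand.
struct MatMulOp {
  const Mat& a;
  const Mat& b;
  double operator()(std::size_t r, std::size_t c) const {
    double sum = 0.0;
    for (std::size_t k = 0; k < a.cols; ++k) sum += a(r, k) * b(k, c);
    return sum;
  }
};

inline MatMulOp matmul_sketch(const Mat& a, const Mat& b) {
  assert(a.cols == b.rows);
  return MatMulOp{a, b};
}

// Another lazy operator that wraps any operator, so transforms can be nested.
template <class Op>
struct ScaleOp {
  Op op;
  double s;
  double operator()(std::size_t r, std::size_t c) const { return s * op(r, c); }
};

template <class Op>
ScaleOp<Op> scale_sketch(Op op, double s) { return ScaleOp<Op>{op, s}; }

// Deferred assignment: evaluation happens only when run() is called.
// The destination must not alias an input of the expression.
template <class Op>
struct Assign {
  Mat& dst;
  Op op;
  void run() {
    for (std::size_t r = 0; r < dst.rows; ++r)
      for (std::size_t c = 0; c < dst.cols; ++c) dst(r, c) = op(r, c);
  }
};

template <class Op>
Assign<Op> Mat::operator=(const Op& op) { return Assign<Op>{*this, op}; }
```

With these toy types, `(c = matmul_sketch(a, b)).run();` fills `c`, and `(c = scale_sketch(matmul_sketch(a, b), 2.0)).run();` evaluates the product and the scaling in a single pass over the output, which is the kind of fused expression the release note describes.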

docs/_sources/api/creation/tensors/make.rst

Lines changed: 0 additions & 3 deletions
@@ -16,13 +16,11 @@ Return by Value
 .. doxygenfunction:: make_tensor( TensorType &tensor, const index_t (&shape)[TensorType::Rank()], matxMemorySpace_t space = MATX_MANAGED_MEMORY, cudaStream_t stream = 0)
 .. doxygenfunction:: make_tensor( ShapeType &&shape, matxMemorySpace_t space = MATX_MANAGED_MEMORY, cudaStream_t stream = 0)
 .. doxygenfunction:: make_tensor( TensorType &tensor, ShapeType &&shape, matxMemorySpace_t space = MATX_MANAGED_MEMORY, cudaStream_t stream = 0)
-.. doxygenfunction:: make_tensor( matxMemorySpace_t space = MATX_MANAGED_MEMORY, cudaStream_t stream = 0)
 .. doxygenfunction:: make_tensor( TensorType &tensor, matxMemorySpace_t space = MATX_MANAGED_MEMORY, cudaStream_t stream = 0)
 .. doxygenfunction:: make_tensor( T *data, const index_t (&shape)[RANK], bool owning = false)
 .. doxygenfunction:: make_tensor( TensorType &tensor, typename TensorType::scalar_type *data, const index_t (&shape)[TensorType::Rank()], bool owning = false)
 .. doxygenfunction:: make_tensor( T *data, ShapeType &&shape, bool owning = false)
 .. doxygenfunction:: make_tensor( TensorType &tensor, typename TensorType::scalar_type *data, typename TensorType::shape_container &&shape, bool owning = false)
-.. doxygenfunction:: make_tensor( T *ptr, bool owning = false)
 .. doxygenfunction:: make_tensor( TensorType &tensor, typename TensorType::scalar_type *ptr, bool owning = false)
 .. doxygenfunction:: make_tensor( Storage &&s, ShapeType &&shape)
 .. doxygenfunction:: make_tensor( TensorType &tensor, typename TensorType::storage_type &&s, typename TensorType::shape_container &&shape)
@@ -38,5 +36,4 @@ Return by Pointer
 .. doxygenfunction:: make_tensor_p( const index_t (&shape)[RANK], matxMemorySpace_t space = MATX_MANAGED_MEMORY, cudaStream_t stream = 0)
 .. doxygenfunction:: make_tensor_p( ShapeType &&shape, matxMemorySpace_t space = MATX_MANAGED_MEMORY, cudaStream_t stream = 0)
 .. doxygenfunction:: make_tensor_p( TensorType &tensor, typename TensorType::shape_container &&shape, matxMemorySpace_t space = MATX_MANAGED_MEMORY, cudaStream_t stream = 0)
-.. doxygenfunction:: make_tensor_p( matxMemorySpace_t space = MATX_MANAGED_MEMORY, cudaStream_t stream = 0)
 .. doxygenfunction:: make_tensor_p( T *const data, ShapeType &&shape, bool owning = false)

docs/_sources/api/dft/fft/fft.rst

Lines changed: 4 additions & 4 deletions
@@ -9,8 +9,8 @@ Perform a 1D FFT
 These functions are currently not supported with host-based executors (CPU)


-.. doxygenfunction:: fft(OpA &&a, uint64_t fft_size = 0)
-.. doxygenfunction:: fft(OpA &&a, const int32_t (&axis)[1], uint64_t fft_size = 0)
+.. doxygenfunction:: fft(OpA &&a, uint64_t fft_size = 0, FFTNorm norm = FFTNorm::BACKWARD)
+.. doxygenfunction:: fft(OpA &&a, const int32_t (&axis)[1], uint64_t fft_size = 0, FFTNorm norm = FFTNorm::BACKWARD)

 Examples
 ~~~~~~~~
@@ -25,7 +25,7 @@ Examples
    :language: cpp
    :start-after: example-begin fft-2
    :end-before: example-end fft-2
-   :dedent:
+   :dedent:

 .. literalinclude:: ../../../../test/00_transform/FFT.cu
    :language: cpp
@@ -43,4 +43,4 @@ Examples
    :language: cpp
    :start-after: example-begin fft-5
    :end-before: example-end fft-5
-   :dedent:
+   :dedent:

docs/_sources/api/dft/fft/ifft.rst

Lines changed: 2 additions & 2 deletions
@@ -9,8 +9,8 @@ Perform a 1D inverse FFT
 These functions are currently not supported with host-based executors (CPU)


-.. doxygenfunction:: ifft(OpA &&a, uint64_t fft_size = 0)
-.. doxygenfunction:: ifft(OpA &&a, const int32_t (&axis)[1], uint64_t fft_size = 0)
+.. doxygenfunction:: ifft(OpA &&a, uint64_t fft_size = 0, FFTNorm norm = FFTNorm::BACKWARD)
+.. doxygenfunction:: ifft(OpA &&a, const int32_t (&axis)[1], uint64_t fft_size = 0, FFTNorm norm = FFTNorm::BACKWARD)

 Examples
 ~~~~~~~~
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+.. _isclose_func:
+
+isclose
+=======
+
+Determine the closeness of values across two operators using absolute and relative tolerances. The output
+from isclose is an ``int`` value since it's commonly used for reductions and ``bool`` reductions using
+atomics are not available in hardware.
+
+
+.. doxygenfunction:: isclose
+
+Examples
+~~~~~~~~
+
+.. literalinclude:: ../../../../test/00_operators/OperatorTests.cu
+   :language: cpp
+   :start-after: example-begin isclose-test-1
+   :end-before: example-end isclose-test-1
+   :dedent:
+
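As a rough illustration of an element-wise closeness check that uses both tolerances and produces `int` results, here is a plain C++ sketch. It assumes the NumPy-style criterion `|a - b| <= atol + rtol * |b|`, which the text above does not spell out, so treat the formula as an assumption. `isclose_sketch` is a hypothetical helper and not part of MatX.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the MatX API): element-wise closeness as int flags.
// Assumed criterion: |a - b| <= atol + rtol * |b| (NumPy convention).
std::vector<int> isclose_sketch(const std::vector<double>& a,
                                const std::vector<double>& b,
                                double rtol, double atol) {
  assert(a.size() == b.size());
  std::vector<int> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    const double diff = std::fabs(a[i] - b[i]);
    out[i] = (diff <= atol + rtol * std::fabs(b[i])) ? 1 : 0;  // int, not bool
  }
  return out;
}
```

Returning `int` flags means the result can be summed or combined with an ordinary integer reduction, which matches the motivation given above for the `int` output type.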
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+.. _allclose_func:
+
+allclose
+========
+
+Reduce the closeness of two operators to a single scalar (0D) output. The output
+from allclose is an ``int`` value since boolean reductions are not available in hardware.
+
+
+.. doxygenfunction:: allclose(OutType dest, const InType1 &in1, const InType2 &in2, double rtol, double atol, SingleThreadHostExecutor exec)
+.. doxygenfunction:: allclose(OutType dest, const InType1 &in1, const InType2 &in2, double rtol, double atol, cudaExecutor exec = 0)
+
+Examples
+~~~~~~~~
+
+.. literalinclude:: ../../../../test/00_operators/ReductionTests.cu
+   :language: cpp
+   :start-after: example-begin allclose-test-1
+   :end-before: example-end allclose-test-1
+   :dedent:
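The reduction described above can be sketched in plain C++ as an integer AND over per-element closeness flags. This is only an illustration: `allclose_sketch` is a hypothetical name, the criterion `|a - b| <= atol + rtol * |b|` is an assumed NumPy-style convention, and the real function writes into a 0D destination with an executor rather than returning a value.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the MatX API): reduce element-wise closeness
// to a single int, 1 if every pair is close and 0 otherwise.
int allclose_sketch(const std::vector<double>& a,
                    const std::vector<double>& b,
                    double rtol, double atol) {
  assert(a.size() == b.size());
  int all_close = 1;  // identity element for the AND reduction
  for (std::size_t i = 0; i < a.size(); ++i) {
    // Assumed criterion: |a - b| <= atol + rtol * |b|
    const int close =
        (std::fabs(a[i] - b[i]) <= atol + rtol * std::fabs(b[i])) ? 1 : 0;
    all_close &= close;  // integer reduction instead of a bool reduction
  }
  return all_close;
}
```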
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+.. _overlap_func:
+
+overlap
+#######
+
+Create an overlapping view of an input operator, giving a higher-rank view of the input.
+
+For example, the following 1D tensor [1 2 3 4 5] could be cloned into a 2D tensor with a
+window size of 2 and overlap of 1, resulting in::
+
+    [1 2
+     2 3
+     3 4
+     4 5]
+
+Currently this only works on 1D tensors going to 2D, but may be expanded
+for higher dimensions in the future. Note that if the window size does not
+divide evenly into the existing column dimension, the view may chop off the
+end of the data to make the tensor rectangular.
+
+.. note::
+   Only 1D input operators are accepted at this time
+
+.. doxygenfunction:: overlap( const OpType &op, const index_t (&windows)[N], const index_t (&strides)[N])
+.. doxygenfunction:: overlap( const OpType &op, const std::array<index_t, N> &windows, const std::array<index_t, N> &strides)
+
+Examples
+~~~~~~~~
+
+.. literalinclude:: ../../../../test/00_operators/OperatorTests.cu
+   :language: cpp
+   :start-after: example-begin overlap-test-1
+   :end-before: example-end overlap-test-1
+   :dedent:
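The windowing described above can be sketched in plain C++ as follows. This is an illustration of the indexing only, not the MatX implementation: `overlap_sketch` is a hypothetical helper, it copies data instead of creating a view, and it takes a window and a stride, assuming that the "overlap of 1" in the example corresponds to a stride of `window - overlap = 1`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the MatX API): build the rows that an overlapping
// 1D -> 2D view would expose. Row i covers in[i*stride .. i*stride + window).
std::vector<std::vector<int>> overlap_sketch(const std::vector<int>& in,
                                             std::size_t window,
                                             std::size_t stride) {
  assert(window > 0 && stride > 0);
  std::vector<std::vector<int>> rows;
  // Stop as soon as a full window no longer fits; trailing data is dropped
  // so that the result stays rectangular.
  for (std::size_t start = 0; start + window <= in.size(); start += stride) {
    const auto first = in.begin() + static_cast<std::ptrdiff_t>(start);
    rows.emplace_back(first, first + static_cast<std::ptrdiff_t>(window));
  }
  return rows;
}
```

For the input `[1 2 3 4 5]`, a window of 2 with stride 1 gives the four rows shown in the text, while a window of 2 with stride 2 gives `[1 2]` and `[3 4]` and drops the trailing `5`.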
Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+.. _at_func:
+
+at
+==
+
+Selects a single value from an operator. Since `at` is a lazily-evaluated operator, it should be used
+in situations where `operator()` cannot be used. For instance:
+
+.. code-block:: cpp
+
+   (a = b(5)).run();
+
+The code above creates a race condition: `b(5)` is evaluated on the host before launch, when a previous
+operation may not yet have computed its value. Instead, the `at()` operator can be used to defer the load until
+the operation is launched:
+
+.. code-block:: cpp
+
+   (a = at(b, 5)).run();
+
+.. doxygenfunction:: at(const Op op, Is... indices)
+
+Examples
+~~~~~~~~
+
+.. literalinclude:: ../../../../test/00_operators/OperatorTests.cu
+   :language: cpp
+   :start-after: example-begin at-test-1
+   :end-before: example-end at-test-1
+   :dedent:
+
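The difference between reading a value when an expression is built and reading it when the expression runs can be modeled in plain C++ with a queue of deferred operations. This is a deterministic toy model, not MatX code: the queue stands in for an asynchronous stream, and `run_at_demo` is a hypothetical function invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical demo (not the MatX API). Returns the value that ends up in `a`.
int run_at_demo(bool deferred) {
  std::vector<int> b(8, 0);
  int a = -1;
  std::vector<std::function<void()>> queue;  // stands in for an async stream

  // An earlier operation that produces b[5]; queued, but not executed yet.
  queue.push_back([&b] { b[5] = 42; });

  if (deferred) {
    // Like at(b, 5): the load happens when the queued operation runs.
    queue.push_back([&a, &b] { a = b[5]; });
  } else {
    // Like b(5): the load happens now, before the producer has run.
    const int snapshot = b[5];
    queue.push_back([&a, snapshot] { a = snapshot; });
  }

  assert(queue.size() == 2);
  for (auto& op : queue) op();  // "launch" everything in order
  return a;
}
```

In this model the eager read captures the stale value `0`, while the deferred read sees the `42` written by the earlier operation. On a real asynchronous stream the eager result would depend on timing, which is the race the text above warns about.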
