Scans¶

RAJA provides portable parallel scan operations, which are basic parallel algorithm building blocks. They are described in this section.

A few important notes:

Note

• All RAJA scan operations are in the namespace RAJA.
• Each RAJA scan operation is a template on an execution policy parameter. The same policy types used for RAJA::forall methods may be used for RAJA scans.
• RAJA scan operations accept an optional operator argument so users can perform different types of scan operations. If no operator is given, the default is a ‘plus’ operation and the result is a prefix-sum.

Also:

Note

For scans using the CUDA back-end, RAJA uses the NVIDIA cub library internally, which is available in the RAJA source repository as a Git submodule. The CMake variable CUB_DIR will be automatically set to the location of the cub library when CUDA is enabled. Details for using a different version of the cub library are available in the Getting Started With RAJA section.

Please see the Parallel Scan Operations tutorial section for usage examples of RAJA scan operations.

Scan Operations¶

In general, a scan operation takes a sequence of numbers ‘x’ and a binary associative operator ‘op’ as input and produces another sequence of numbers ‘y’ as output. Each element of the output sequence is formed by applying the operator to a subset of the input. Scans come in two flavors: inclusive and exclusive.

An inclusive scan takes the input sequence

x = { x0, x1, x2, … }

and calculates the output sequence:

y = { y0, y1, y2, … }

using the recursive definition

y0= x0

yi= yi-1op xi, for each i > 0

An exclusive scan is similar, but the output of an exclusive scan is different from the output of an inclusive scan in two ways. First, the first element of the output is the identity of the operator used. Second, the rest of the output sequence is the same as inclusive scan, but shifted one position to the right; i.e.,

y0= opidentity

yi= yi-1 op xi-1, for each i > 0

If you would like more information about scan operations, a good overview of what they are and why they are useful can be found in Blelloch Scan Lecture Notes. A nice presentation that describes how parallel scans are implemented is Va Tech Scan Lecture

RAJA Inclusive Scans¶

RAJA inclusive scan operations look like the following:

• RAJA::inclusive_scan< exec_policy >(in, in + N, out)
• RAJA::inclusive_scan< exec_policy >(in, in + N, out, operator)

Here, ‘in’ and ‘out’ are pointers to arrays of some numeric scalar type whose elements are the input and output sequences of the scan, respectively. The scalar type must be the same for both arrays. The first scan operation above will be a prefix-sum since there is no operator argument given; i.e., the output array will contain partial sums of the input array. The second scan will apply the operator that is passed.

RAJA also provides in-place scans:

• RAJA::inclusive_scan_inplace< exec_policy >(in, in + N)
• RAJA::inclusive_scan_inplace< exec_policy >(in, in + N, <operator>)

An in-place scan generates the same output sequence as a non-inplace scan. However, an in-place scan does not take separate input and output arrays and the result of the scan operation will appear in-place in the input array.

RAJA Exclusive Scans¶

Using RAJA exclusive scans is essentially the same as for inclusive scans:

• RAJA::exclusive_scan< exec_policy >(in, in + N, out)
• RAJA::exclusive_scan< exec_policy >(in, in + N, out, operator)

and

• RAJA::exclusive_scan_inplace< exec_policy >(in, in + N)
• RAJA::exclusive_scan_inplace< exec_policy >(in, in + N, <operator>)

RAJA Scan Operators¶

RAJA provides a variety of operators that can be used to perform different types of scans, such as:

• RAJA::operators::plus<T>
• RAJA::operators::minus<T>
• RAJA::operators::multiplies<T>
• RAJA::operators::divides<T>
• RAJA::operators::minimum<T>
• RAJA::operators::maximum<T>

Note

• All RAJA scan operators are in the namespace RAJA::operators.

Scan Policies¶

For information about RAJA execution policies to use with scan operations, please see Policies.