Data Views and Layouts
This section contains an exercise file RAJA/exercises/view-layout.cpp
for you to work through if you wish to get some practice with RAJA. The
file RAJA/exercises/view-layout_solution.cpp contains complete
working code for the examples discussed in this section. You can use the
solution file to check your work and for guidance if you get stuck. To build
the exercises, execute make view-layout and make view-layout_solution
from the build directory.
Key RAJA features shown in this section are:
RAJA::View, RAJA::Layout, and RAJA::OffsetLayout constructs
Layout permutations
The examples in this section illustrate RAJA View and Layout concepts
and usage patterns. The goal is for you to gain an understanding of how
to use RAJA Views and Layouts to simplify and transform array data access
patterns. None of the examples use RAJA kernel execution methods, such
as RAJA::forall. The intent is to focus on RAJA View and Layout mechanics.
Consider a basic C-style implementation of a matrix-matrix multiplication operation, using \(N \times N\) matrices:
for (int row = 0; row < N; ++row) {
  for (int col = 0; col < N; ++col) {
    for (int k = 0; k < N; ++k) {
      Cref[col + N*row] += A[k + N*row] * B[col + N*k];
    }
  }
}
As is commonly done for efficiency in C and C++, we have allocated the data
for the matrices as one-dimensional arrays. Thus, we need to manually compute
the data pointer offsets for the row and column indices in the kernel.
Here, we use the array Cref
to hold a reference solution matrix that
we use to compare with results generated by the examples below.
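The manual offset arithmetic in the C-style kernel can be isolated in a small helper to make the pattern explicit. The following is a minimal sketch, not RAJA code. The function name `row_major_offset` is hypothetical, and the sketch assumes a dense row-major matrix stored in a one-dimensional array.

```cpp
// Hypothetical helper (not part of RAJA): offset of entry (row, col) in a
// dense row-major matrix with 'ncols' columns stored in a 1D array.
constexpr int row_major_offset(int row, int col, int ncols)
{
  return col + ncols * row;
}

// With this helper, the body of the C-style kernel could be written as:
//   Cref[row_major_offset(row, col, N)] +=
//     A[row_major_offset(row, k, N)] * B[row_major_offset(k, col, N)];
```

Every access still requires passing the matrix width by hand, which is the bookkeeping a View object is meant to take over.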
To simplify the multi-dimensional indexing, we can use RAJA::View objects,
which we define as:
RAJA::View< double, RAJA::Layout<2, int> > Aview(A, N, N);
RAJA::View< double, RAJA::Layout<2, int> > Bview(B, N, N);
RAJA::View< double, RAJA::Layout<2, int> > Cview(C, N, N);
Here we define three RAJA::View objects, ‘Aview’, ‘Bview’, and ‘Cview’,
that wrap the array data pointers ‘A’, ‘B’, and ‘C’, respectively. We
pass a data pointer as the first argument to each view constructor and
the extent of each matrix dimension as the second and third arguments. There
are two extent arguments because the RAJA::Layout template parameter list
specifies two dimensions. The matrices are square, so each extent is ‘N’.
The template parameters to RAJA::View are the array data type ‘double’ and
a RAJA::Layout type. Specifically, RAJA::Layout<2, int> means that each View
represents a two-dimensional default data layout, and that we will use values
of type ‘int’ to index into the arrays.
Using the RAJA::View objects, we can access the data entries for the rows
and columns using a more natural, less error-prone syntax:
for (int row = 0; row < N; ++row) {
  for (int col = 0; col < N; ++col) {
    for (int k = 0; k < N; ++k) {
      Cview(row, col) += Aview(row, k) * Bview(k, col);
    }
  }
}
Default Layouts Use Row-major Ordering
The default data layout ordering in RAJA is row-major, which is the convention for multi-dimensional array indexing in C and C++. This means that the rightmost index will be stride-1, the index to the left of the rightmost index will have stride equal to the extent of the rightmost dimension, and so on.
Note
RAJA Layouts and Views support any number of dimensions and the default data access ordering is row-major. Please see View and Layout for more details.
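The row-major striding rule can be written out for three dimensions. The following is a minimal sketch, not RAJA code. The names `Strides3D`, `row_major_strides`, and `row_major_index` are hypothetical and serve only to illustrate the arithmetic.

```cpp
// Hypothetical illustration (not RAJA code) of default row-major striding
// for a three-dimensional array with extents (nx, ny, nz).
struct Strides3D {
  int i;  // stride of the leftmost index
  int j;  // stride of the middle index
  int k;  // stride of the rightmost index (always 1)
};

// The leftmost extent 'nx' does not influence any stride.
constexpr Strides3D row_major_strides(int /*nx*/, int ny, int nz)
{
  return Strides3D{ny * nz, nz, 1};
}

// Linear index of entry (i, j, k): each index times its stride.
constexpr int row_major_index(int i, int j, int k, int nx, int ny, int nz)
{
  return i * row_major_strides(nx, ny, nz).i +
         j * row_major_strides(nx, ny, nz).j +
         k * row_major_strides(nx, ny, nz).k;
}
```

For example, with extents (3, 5, 2) the strides are (10, 2, 1), so the last entry (2, 4, 1) maps to linear index 29, one less than the total number of entries.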
To illustrate the default data layout striding, we next show simple one-, two-, and three-dimensional examples where the for-loop ordering for the different dimensions is such that all data access is stride-1. We begin by defining some dimensions and then allocating and initializing arrays:
constexpr int Nx = 3;
constexpr int Ny = 5;
constexpr int Nz = 2;
constexpr int Ntot = Nx*Ny*Nz;
int* a = new int[ Ntot ];
int* aref = new int[ Ntot ];
for (int i = 0; i < Ntot; ++i)
{
  aref[i] = i;
}
The version of the array initialization kernel using a one-dimensional
RAJA::View is:
RAJA::View< int, RAJA::Layout<1, int> > view_1D(a, Ntot);
for (int i = 0; i < Ntot; ++i) {
  view_1D(i) = i;
}
The version of the array initialization using a two-dimensional
RAJA::View is:
RAJA::View< int, RAJA::Layout<2, int> > view_2D(a, Nx, Ny);
int iter{0};
for (int i = 0; i < Nx; ++i) {
  for (int j = 0; j < Ny; ++j) {
    view_2D(i, j) = iter;
    ++iter;
  }
}
The three-dimensional version is:
RAJA::View< int, RAJA::Layout<3, int> > view_3D(a, Nx, Ny, Nz);
iter = 0;
for (int i = 0; i < Nx; ++i) {
  for (int j = 0; j < Ny; ++j) {
    for (int k = 0; k < Nz; ++k) {
      view_3D(i, j, k) = iter;
      ++iter;
    }
  }
}
It’s worth repeating that the data array access in all three variants shown
here using RAJA::View objects is stride-1 since we order the for-loops
in the loop nests to match the row-major ordering.
RAJA Layout types support other data access patterns with different striding orders, offsets, and permutations. To this point, we have used the default Layout constructor. RAJA provides methods to generate Layouts for different indexing patterns. We describe these in the next several sections. Next, we show how to permute the data striding order using permuted Layouts.
Permuted Layouts Change Data Striding Order
Every RAJA::Layout object has a permutation. When a permutation is not
specified at creation, a Layout will use the identity permutation. Here are
examples where the identity permutation is explicitly provided. First, in
two dimensions:
std::array<RAJA::idx_t, 2> defperm2 {{0, 1}};
RAJA::Layout< 2, int > defperm2_layout =
  RAJA::make_permuted_layout( {{Nx, Ny}}, defperm2);
RAJA::View< int, RAJA::Layout<2, int> > defperm_view_2D(a, defperm2_layout);
iter = 0;
for (int i = 0; i < Nx; ++i) {
  for (int j = 0; j < Ny; ++j) {
    defperm_view_2D(i, j) = iter;
    ++iter;
  }
}
Then, in three dimensions:
std::array<RAJA::idx_t, 3> defperm3 {{0, 1, 2}};
RAJA::Layout< 3, int > defperm3_layout =
  RAJA::make_permuted_layout( {{Nx, Ny, Nz}}, defperm3);
RAJA::View< int, RAJA::Layout<3, int> > defperm_view_3D(a, defperm3_layout);
iter = 0;
for (int i = 0; i < Nx; ++i) {
  for (int j = 0; j < Ny; ++j) {
    for (int k = 0; k < Nz; ++k) {
      defperm_view_3D(i, j, k) = iter;
      ++iter;
    }
  }
}
These two examples access the data with stride-1 ordering, the same as in
the earlier examples, which is shown by the nested loop ordering.
The identity permutation in two dimensions is ‘{0, 1}’ and is ‘{0, 1, 2}’
for three dimensions. The method RAJA::make_permuted_layout is used to
create a RAJA::Layout object with a permutation. The method takes two
arguments: the extents of each dimension and the permutation.
Note
If a permuted Layout is created with the identity permutation (e.g., {0,1,2}), the Layout is the same as if it were created by calling the Layout constructor directly with no permutation.
Next, we permute the striding order for the two-dimensional example:
std::array<RAJA::idx_t, 2> perm2 {{1, 0}};
RAJA::Layout< 2, int > perm2_layout =
  RAJA::make_permuted_layout( {{Nx, Ny}}, perm2);
RAJA::View< int, RAJA::Layout<2, int> > perm_view_2D(a, perm2_layout);
iter = 0;
for (int j = 0; j < Ny; ++j) {
  for (int i = 0; i < Nx; ++i) {
    perm_view_2D(i, j) = iter;
    ++iter;
  }
}
Read from right to left, the permutation ‘{1, 0}’ specifies that the first (zero) index ‘i’ is stride-1 and the second index (one) ‘j’ has stride equal to the extent of the first Layout dimension ‘Nx’. This is evident in the for-loop ordering.
Here is the three-dimensional case, where we have reversed the striding order using the permutation ‘{2, 1, 0}’:
std::array<RAJA::idx_t, 3> perm3a {{2, 1, 0}};
RAJA::Layout< 3, int > perm3a_layout =
  RAJA::make_permuted_layout( {{Nx, Ny, Nz}}, perm3a);
RAJA::View< int, RAJA::Layout<3, int> > perm3a_view_3D(a, perm3a_layout);
iter = 0;
for (int k = 0; k < Nz; ++k) {
  for (int j = 0; j < Ny; ++j) {
    for (int i = 0; i < Nx; ++i) {
      perm3a_view_3D(i, j, k) = iter;
      ++iter;
    }
  }
}
The data access remains stride-1 due to the for-loop reordering. For fun, here is another three-dimensional permutation:
std::array<RAJA::idx_t, 3> perm3b {{1, 2, 0}};
RAJA::Layout< 3, int > perm3b_layout =
  RAJA::make_permuted_layout( {{Nx, Ny, Nz}}, perm3b);
RAJA::View< int, RAJA::Layout<3, int> > perm3b_view_3D(a, perm3b_layout);
iter = 0;
for (int j = 0; j < Ny; ++j) {
  for (int k = 0; k < Nz; ++k) {
    for (int i = 0; i < Nx; ++i) {
      perm3b_view_3D(i, j, k) = iter;
      ++iter;
    }
  }
}
The permutation is ‘{1, 2, 0}’ so to make the data access stride-1, we swap the ‘j’ and ‘k’ loops and leave the ‘i’ loop as the inner loop.
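The relationship between a permutation and the resulting strides can be sketched in a few lines. The code below is not RAJA code and does not claim to reflect how RAJA computes strides internally. The names `Strides3`, `permuted_strides`, and `linear_index` are hypothetical. It assumes the convention described in this section, where the rightmost entry of the permutation names the stride-1 dimension, and it requires C++14 or later.

```cpp
// Hypothetical sketch (not RAJA code): strides of a 3D layout for a given
// permutation. Requires C++14 or later for the loop in a constexpr function.
struct Strides3 {
  int s[3];  // s[d] is the stride of dimension d (0 = i, 1 = j, 2 = k)
};

// (n0, n1, n2) are the extents; (p0, p1, p2) is the permutation, listing
// dimensions from longest stride (left) to stride-1 (right).
constexpr Strides3 permuted_strides(int n0, int n1, int n2,
                                    int p0, int p1, int p2)
{
  const int extent[3] = {n0, n1, n2};
  const int perm[3] = {p0, p1, p2};
  Strides3 out{{0, 0, 0}};
  int stride = 1;
  // Walk the permutation right to left, accumulating extents into strides.
  for (int d = 2; d >= 0; --d) {
    out.s[perm[d]] = stride;
    stride *= extent[perm[d]];
  }
  return out;
}

// Linear index of entry (i, j, k): each index times its stride.
constexpr int linear_index(const Strides3& st, int i, int j, int k)
{
  return i * st.s[0] + j * st.s[1] + k * st.s[2];
}
```

With extents (3, 5, 2) and the permutation ‘{1, 2, 0}’, this gives stride 1 for ‘i’, stride 3 for ‘k’, and stride 6 for ‘j’, which matches the loop ordering with ‘j’ outermost and ‘i’ innermost.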
Multi-dimensional Indices and Linear Indices
RAJA::Layout types provide methods to convert between linear indices and
multi-dimensional indices. Recall the Layout ‘perm3a_layout’ from above that
was created with the permutation ‘{2, 1, 0}’. To get the linear index
corresponding to the index triple ‘(1, 2, 0)’, you can do this:
int lin = perm3a_layout(1, 2, 0);
The value of ‘lin’ is 7 = 1 + 2 * Nx + 0 * Nx * Ny. To get the index triple for linear index 7, you can do:
int i, j, k;
perm3a_layout.toIndices(7, i, j, k);
This sets ‘i’ to 1, ‘j’ to 2, and ‘k’ to 0.
Similarly, for the Layout ‘perm3b_layout’, which was created with the permutation ‘{1, 2, 0}’:
lin = perm3b_layout(1, 2, 0);
sets ‘lin’ to 13 = 1 + 0 * Nx + 2 * Nx * Nz and:
perm3b_layout.toIndices(13, i, j, k);
sets ‘i’ to 1, ‘j’ to 2, and ‘k’ to 0.
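The inverse mapping from a linear index back to an index triple can be sketched with integer division and remainder. The code below is a minimal illustration, not RAJA code. The names `Index3`, `to_indices_perm210`, and `to_indices_perm120` are hypothetical, and the sketch assumes a non-negative linear index that lies within the layout.

```cpp
// Hypothetical sketch (not RAJA code): recover (i, j, k) from a linear index
// by peeling off dimensions from stride-1 outward.
struct Index3 {
  int i, j, k;
};

// Permutation {2, 1, 0}: 'i' is stride-1, then 'j', then 'k'.
constexpr Index3 to_indices_perm210(int lin, int nx, int ny)
{
  return Index3{lin % nx, (lin / nx) % ny, lin / (nx * ny)};
}

// Permutation {1, 2, 0}: 'i' is stride-1, then 'k', then 'j'.
constexpr Index3 to_indices_perm120(int lin, int nx, int nz)
{
  return Index3{lin % nx, lin / (nx * nz), (lin / nx) % nz};
}
```

With Nx = 3, Ny = 5, and Nz = 2, linear index 7 under ‘{2, 1, 0}’ and linear index 13 under ‘{1, 2, 0}’ both map back to the triple (1, 2, 0), in agreement with the values above.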
There are more examples in the exercise file associated with this section. Feel free to experiment with them.
One important item to note is that, by default, there is no bounds checking
on indices passed to a RAJA::View data access method or to RAJA::Layout
index computation methods. Therefore, it is the responsibility of the user
to ensure that indices passed to RAJA::View and RAJA::Layout methods are in
bounds to avoid accessing data outside of the View or computing invalid
indices.
Note
RAJA provides a CMake variable RAJA_ENABLE_BOUNDS_CHECK to
turn run time bounds checking on or off when the code is compiled.
Enabling bounds checking is useful for debugging and to ensure
your code is correct. However, when enabled, bounds checking adds
noticeable run time overhead, so it should not be enabled for
a production build of your code.
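To see why bounds checking costs time, consider what a checked index computation has to do on every access. The following is a minimal sketch and is not RAJA's implementation. The names `in_bounds` and `checked_index` are hypothetical, and throwing std::out_of_range is an assumption made for illustration only.

```cpp
#include <stdexcept>

// Hypothetical sketch (not RAJA's implementation): a 2D row-major index
// computation that rejects out-of-range indices.
constexpr bool in_bounds(int i, int j, int nx, int ny)
{
  return i >= 0 && i < nx && j >= 0 && j < ny;
}

// Returns the linear index for (i, j), or throws if either index is outside
// its extent. The four comparisons run on every call.
constexpr int checked_index(int i, int j, int nx, int ny)
{
  return in_bounds(i, j, nx, ny)
           ? j + ny * i
           : throw std::out_of_range("index outside layout extents");
}
```

The unchecked version is a single multiply and add, so the added comparisons and branch are a meaningful fraction of the work in a tight loop.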
Offset Layouts Apply Offsets to Indices
The last topic we cover in this exercise is the RAJA::OffsetLayout type.
We first illustrate the concept of an offset with a C-style for-loop:
int imin = -5;
int imax = 6;
for (int i = imin; i < imax; ++i) {
  ao_ref[ i-imin ] = i;
}
Here, the for-loop runs from ‘imin’ to ‘imax-1’ (i.e., -5 to 5). To avoid out-of-bounds negative indexing, we subtract ‘imin’ (i.e., -5) from the loop index ‘i’.
To do the same thing with RAJA, we create a RAJA::OffsetLayout object
and use it to index into the array:
RAJA::OffsetLayout<1, int> offlayout_1D =
  RAJA::make_offset_layout<1, int>( {{imin}}, {{imax}} );
RAJA::View< int, RAJA::OffsetLayout<1, int> > aoview_1Doff(ao,
                                                           offlayout_1D);
for (int i = imin; i < imax; ++i) {
  aoview_1Doff(i) = i;
}
RAJA::OffsetLayout is a different type than RAJA::Layout because
it contains offset information. The arguments to the
RAJA::make_offset_layout method are the index bounds.
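The index shift that an offset layout applies in one dimension can be sketched as follows. This is a minimal illustration, not RAJA code. The names `offset_index_1d` and `offset_extent` are hypothetical, and the half-open range follows the loop bounds used in the examples above.

```cpp
// Hypothetical sketch (not RAJA code): map an offset index i in
// [imin, imax) to a zero-based array position.
constexpr int offset_index_1d(int i, int imin)
{
  return i - imin;
}

// Number of entries covered by the half-open range [imin, imax).
constexpr int offset_extent(int imin, int imax)
{
  return imax - imin;
}
```

With ‘imin’ equal to -5 and ‘imax’ equal to 6, the range covers 11 entries, index -5 maps to array position 0, and index 5 maps to position 10.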
As expected, the two-dimensional case is similar. First, a C-style loop:
imin = -1;
imax = 2;
int jmin = -5;
int jmax = 5;
iter = 0;
for (int i = imin; i < imax; ++i) {
  for (int j = jmin; j < jmax; ++j) {
    ao_ref[ (j-jmin) + (i-imin) * (jmax-jmin) ] = iter;
    iter++;
  }
}
and then the same operation using a RAJA::OffsetLayout object:
RAJA::OffsetLayout<2, int> offlayout_2D =
  RAJA::make_offset_layout<2, int>( {{imin, jmin}}, {{imax, jmax}} );
RAJA::View< int, RAJA::OffsetLayout<2, int> > aoview_2Doff(ao,
                                                           offlayout_2D);
iter = 0;
for (int i = imin; i < imax; ++i) {
  for (int j = jmin; j < jmax; ++j) {
    aoview_2Doff(i, j) = iter;
    iter++;
  }
}
Note that the first argument passed to RAJA::make_offset_layout contains
the lower bounds for ‘i’ and ‘j’ and the second argument contains the upper
bounds. Also, the ‘j’ index is stride-1 by default since we did not pass
a permutation to the RAJA::make_offset_layout method, which is the same
as the non-offset Layout usage.
Just like RAJA::Layout has a permutation, so does RAJA::OffsetLayout.
Here is an example where we permute the (i, j) index stride ordering:
std::array<RAJA::idx_t, 2> perm1D {{1, 0}};
RAJA::OffsetLayout<2> permofflayout_2D =
  RAJA::make_permuted_offset_layout<2>( {{imin, jmin}},
                                        {{imax, jmax}},
                                        perm1D );
RAJA::View< int, RAJA::OffsetLayout<2> > aoview_2Dpermoff(ao,
                                                          permofflayout_2D);
iter = 0;
for (int j = jmin; j < jmax; ++j) {
  for (int i = imin; i < imax; ++i) {
    aoview_2Dpermoff(i, j) = iter;
    iter++;
  }
}
The permutation ‘{1, 0}’ is passed as the third argument to
RAJA::make_permuted_offset_layout. From the ordering of the for-loops, we
can see that the ‘i’ index is stride-1 and the ‘j’ index has stride equal to
the extent of the ‘i’ dimension, so the for-loop nest strides through
the data with unit stride.
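The effect of combining offsets with a permutation can be sketched by comparing the two index computations side by side. The code below is a minimal illustration, not RAJA code. The function names `offset_index_2d` and `permuted_offset_index_2d` are hypothetical, and the half-open ranges follow the loop bounds used in the examples above.

```cpp
// Hypothetical sketch (not RAJA code): zero-based position of offset index
// (i, j) with i in [imin, imax) and j in [jmin, jmax).

// Default ordering: 'j' is stride-1, 'i' has stride (jmax - jmin).
constexpr int offset_index_2d(int i, int j, int imin, int jmin, int jmax)
{
  return (j - jmin) + (i - imin) * (jmax - jmin);
}

// Permutation {1, 0}: 'i' is stride-1, 'j' has stride (imax - imin).
constexpr int permuted_offset_index_2d(int i, int j,
                                       int imin, int imax, int jmin)
{
  return (i - imin) + (j - jmin) * (imax - imin);
}
```

With the bounds used above (‘imin’ = -1, ‘imax’ = 2, ‘jmin’ = -5, ‘jmax’ = 5), stepping ‘j’ by one moves one position in the default ordering, while in the permuted ordering it moves three positions, the extent of the ‘i’ dimension.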