Data Views and Layouts
This section contains an exercise file RAJA/exercises/view-layout.cpp
for you to work through if you wish to get some practice with RAJA. The
file RAJA/exercises/view-layout_solution.cpp contains complete
working code for the examples discussed in this section. You can use the
solution file to check your work and for guidance if you get stuck. To build
the exercises execute make view-layout and make view-layout_solution
from the build directory.
Key RAJA features shown in this section are:
RAJA::View
RAJA::Layout and RAJA::OffsetLayout constructs
Layout permutations
The examples in this section illustrate RAJA View and Layout concepts
and usage patterns. The goal is for you to gain an understanding of how
to use RAJA Views and Layouts to simplify and transform array data access
patterns. None of the examples use RAJA kernel execution methods, such
as RAJA::forall. The intent is to focus on RAJA View and Layout mechanics.
Consider a basic C-style implementation of a matrix-matrix multiplication operation, using \(N \times N\) matrices:
for (int row = 0; row < N; ++row) {
  for (int col = 0; col < N; ++col) {
    for (int k = 0; k < N; ++k) {
      Cref[col + N*row] += A[k + N*row] * B[col + N*k];
    }
  }
}
As is commonly done for efficiency in C and C++, we have allocated the data
for the matrices as one-dimensional arrays. Thus, we need to manually compute
the data pointer offsets for the row and column indices in the kernel.
Here, we use the array Cref to hold a reference solution matrix that
we use to compare with results generated by the examples below.
To simplify the multi-dimensional indexing, we can use RAJA::View objects,
which we define as:
RAJA::View< double, RAJA::Layout<2, int, 1> > Aview(A, N, N);
RAJA::View< double, RAJA::Layout<2, int, 1> > Bview(B, N, N);
RAJA::View< double, RAJA::Layout<2, int, 1> > Cview(C, N, N);
Here we define three RAJA::View objects, ‘Aview’, ‘Bview’, and ‘Cview’,
that wrap the array data pointers, ‘A’, ‘B’, and ‘C’, respectively. We
pass a data pointer as the first argument to each view constructor and then
the extent of each matrix dimension as the second and third arguments. There
are two extent arguments because the RAJA::Layout template parameter list
specifies a two-dimensional layout. The matrices are square and each extent is ‘N’. Here, the
template parameters to RAJA::View are the array data type ‘double’ and
a RAJA::Layout type. Specifically:
RAJA::Layout<2, int>
means that each View represents a two-dimensional default data layout, and that we will use values of type ‘int’ to index into the arrays.
Note
A third argument in the Layout type can be used to specify the index with unit stride:
RAJA::Layout<2, int, 1>
In the example above, index 1 is marked as having unit stride, which makes multi-dimensional indexing more efficient by avoiding an unnecessary multiplication by 1.
Using the RAJA::View objects, we can access the data entries for the rows
and columns using a more natural, less error-prone syntax:
for (int row = 0; row < N; ++row) {
  for (int col = 0; col < N; ++col) {
    for (int k = 0; k < N; ++k) {
      Cview(row, col) += Aview(row, k) * Bview(k, col);
    }
  }
}
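To make the index arithmetic that a view hides more concrete, here is a minimal sketch in plain C++ that does not use RAJA. The type RowMajorView2D and the function matmul_with_view are hypothetical names introduced only for this illustration. They mimic the (row, col) call syntax over a flat row-major array and are not how RAJA::View is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2D row-major view. This is NOT RAJA::View;
// it only mimics the (row, col) call syntax over a flat array.
template <typename T>
struct RowMajorView2D {
  T* data;
  int ncols;

  // Row-major offset: the column index (rightmost) is stride-1.
  T& operator()(int row, int col) const { return data[col + ncols * row]; }
};

// Multiply two N x N matrices stored as flat row-major arrays.
std::vector<double> matmul_with_view(const std::vector<double>& A,
                                     const std::vector<double>& B, int N) {
  assert(A.size() == static_cast<std::size_t>(N) * N);
  assert(B.size() == A.size());
  std::vector<double> C(A.size(), 0.0);

  RowMajorView2D<const double> Aview{A.data(), N};
  RowMajorView2D<const double> Bview{B.data(), N};
  RowMajorView2D<double> Cview{C.data(), N};

  for (int row = 0; row < N; ++row) {
    for (int col = 0; col < N; ++col) {
      for (int k = 0; k < N; ++k) {
        Cview(row, col) += Aview(row, k) * Bview(k, col);
      }
    }
  }
  return C;
}
```

The loop body matches the view-based kernel above, while the offset computation lives in one place instead of being repeated at every array access.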
Default Layouts Use Row-major Ordering
The default data layout ordering in RAJA is row-major, which is the convention for multi-dimensional array indexing in C and C++. This means that the rightmost index will be stride-1, the index to the left of the rightmost index will have stride equal to the extent of the rightmost dimension, and so on.
Note
RAJA Layouts and Views support any number of dimensions and the default data access ordering is row-major. Please see View and Layout for more details.
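The row-major striding rule described above can be written as a short loop. The following is a sketch in plain C++, independent of RAJA; the helper name row_major_strides is hypothetical and exists only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Compute row-major strides for the given extents: the rightmost
// dimension gets stride 1, and each dimension to its left gets the
// product of all extents to its right.
template <std::size_t N>
std::array<int, N> row_major_strides(const std::array<int, N>& extents) {
  std::array<int, N> strides{};
  int stride = 1;
  for (std::size_t d = N; d-- > 0;) {
    assert(extents[d] > 0);
    strides[d] = stride;
    stride *= extents[d];
  }
  return strides;
}
```

For extents (3, 5, 2), this gives strides (10, 2, 1): the rightmost index is stride-1, the middle index has stride 2, and the leftmost index has stride 5 * 2 = 10.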
To illustrate the default data layout striding, we next show simple one-, two-, and three-dimensional examples where the for-loop ordering for the different dimensions is such that all data access is stride-1. We begin by defining some dimensions, then allocating and initializing arrays:
constexpr int Nx = 3;
constexpr int Ny = 5;
constexpr int Nz = 2;
constexpr int Ntot = Nx*Ny*Nz;
int* a = new int[ Ntot ];
int* aref = new int[ Ntot ];
for (int i = 0; i < Ntot; ++i)
{
  aref[i] = i;
}
The version of the array initialization kernel using a one-dimensional
RAJA::View is:
RAJA::View< int, RAJA::Layout<1, int, 0> > view_1D(a, Ntot);
for (int i = 0; i < Ntot; ++i) {
  view_1D(i) = i;
}
The version of the array initialization using a two-dimensional
RAJA::View is:
RAJA::View< int, RAJA::Layout<2, int, 1> > view_2D(a, Nx, Ny);
int iter{0};
for (int i = 0; i < Nx; ++i) {
  for (int j = 0; j < Ny; ++j) {
    view_2D(i, j) = iter;
    ++iter;
  }
}
The three-dimensional version is:
RAJA::View< int, RAJA::Layout<3, int, 2> > view_3D(a, Nx, Ny, Nz);
iter = 0;
for (int i = 0; i < Nx; ++i) {
  for (int j = 0; j < Ny; ++j) {
    for (int k = 0; k < Nz; ++k) {
      view_3D(i, j, k) = iter;
      ++iter;
    }
  }
}
It’s worth repeating that the data array access in all three variants shown
here using RAJA::View objects is stride-1 since we order the for-loops
in the loop nests to match the row-major ordering.
RAJA Layout types support other data access patterns with different striding orders, offsets, and permutations. To this point, we have used the default Layout constructor. RAJA provides methods to generate Layouts for different indexing patterns. We describe these in the next several sections. Next, we show how to permute the data striding order using permuted Layouts.
Permuted Layouts Change Data Striding Order
Every RAJA::Layout object has a permutation. When a permutation is not
specified at creation, a Layout will use the identity permutation. Here are
examples where the identity permutation is explicitly provided. First, in
two dimensions:
std::array<RAJA::idx_t, 2> defperm2 {{0, 1}};
RAJA::Layout< 2, int> defperm2_layout =
  RAJA::make_permuted_layout( {{Nx, Ny}}, defperm2);
RAJA::View< int, RAJA::Layout<2, int, 1> > defperm_view_2D(a, defperm2_layout);
iter = 0;
for (int i = 0; i < Nx; ++i) {
  for (int j = 0; j < Ny; ++j) {
    defperm_view_2D(i, j) = iter;
    ++iter;
  }
}
Then, in three dimensions:
std::array<RAJA::idx_t, 3> defperm3 {{0, 1, 2}};
RAJA::Layout< 3, int > defperm3_layout =
  RAJA::make_permuted_layout( {{Nx, Ny, Nz}}, defperm3);
RAJA::View< int, RAJA::Layout<3, int, 2> > defperm_view_3D(a, defperm3_layout);
iter = 0;
for (int i = 0; i < Nx; ++i) {
  for (int j = 0; j < Ny; ++j) {
    for (int k = 0; k < Nz; ++k) {
      defperm_view_3D(i, j, k) = iter;
      ++iter;
    }
  }
}
These two examples access the data with stride-1 ordering, the same as in
the earlier examples, which is shown by the nested loop ordering.
The identity permutation in two dimensions is ‘{0, 1}’ and is ‘{0, 1, 2}’
for three dimensions. The method RAJA::make_permuted_layout is used to
create a RAJA::Layout object with a permutation. The method takes two
arguments, the extents of each dimension and the permutation.
Note
If a permuted Layout is created with the identity permutation (e.g., {0,1,2}), the Layout is the same as if it were created without specifying a permutation.
Next, we permute the striding order for the two-dimensional example:
std::array<RAJA::idx_t, 2> perm2 {{1, 0}};
RAJA::Layout< 2, int> perm2_layout =
  RAJA::make_permuted_layout( {{Nx, Ny}}, perm2);
RAJA::View< int, RAJA::Layout<2, int, 0> > perm_view_2D(a, perm2_layout);
iter = 0;
for (int j = 0; j < Ny; ++j) {
  for (int i = 0; i < Nx; ++i) {
    perm_view_2D(i, j) = iter;
    ++iter;
  }
}
The permutation ‘{1, 0}’ lists the indices from longest stride to shortest.
Read from right to left, it specifies that the first index (zero), ‘i’, is
stride-1, which is also recorded in the third template argument of
RAJA::Layout<2, int, 0>, and that the second index (one), ‘j’, has stride equal
to the extent of the first Layout dimension, ‘Nx’. The for-loop ordering
reflects this: ‘i’ is the inner loop.
Here is the three-dimensional case, where we have reversed the striding order using the permutation ‘{2, 1, 0}’:
std::array<RAJA::idx_t, 3> perm3a {{2, 1, 0}};
RAJA::Layout< 3, int> perm3a_layout =
  RAJA::make_permuted_layout( {{Nx, Ny, Nz}}, perm3a);
RAJA::View< int, RAJA::Layout<3, int, 0> > perm3a_view_3D(a, perm3a_layout);
iter = 0;
for (int k = 0; k < Nz; ++k) {
  for (int j = 0; j < Ny; ++j) {
    for (int i = 0; i < Nx; ++i) {
      perm3a_view_3D(i, j, k) = iter;
      ++iter;
    }
  }
}
Note
Because index 0 now has unit stride, we adjust the Layout template argument accordingly:
RAJA::Layout<3, int, 0>
As before, marking index 0 as having unit stride makes multi-dimensional indexing more efficient by avoiding an unnecessary multiplication by 1.
The data access remains stride-1 due to the for-loop reordering. For fun, here is another three-dimensional permutation:
std::array<RAJA::idx_t, 3> perm3b {{1, 2, 0}};
RAJA::Layout< 3, int > perm3b_layout =
  RAJA::make_permuted_layout( {{Nx, Ny, Nz}}, perm3b);
RAJA::View< int, RAJA::Layout<3, int, 0> > perm3b_view_3D(a, perm3b_layout);
iter = 0;
for (int j = 0; j < Ny; ++j) {
  for (int k = 0; k < Nz; ++k) {
    for (int i = 0; i < Nx; ++i) {
      perm3b_view_3D(i, j, k) = iter;
      ++iter;
    }
  }
}
The permutation is ‘{1, 2, 0}’ so to make the data access stride-1, we swap the ‘j’ and ‘k’ loops and leave the ‘i’ loop as the inner loop.
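The relationship between a permutation and the resulting strides can be sketched in plain C++ without RAJA. The helper permuted_strides below is a hypothetical name for this illustration. It assumes, consistent with the examples in this section, that the permutation lists dimensions from longest stride to shortest, so the last entry names the stride-1 dimension.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Compute the stride of each dimension for a permuted layout.
// 'perm' lists dimensions from longest stride (first entry) to
// stride-1 (last entry), so we walk it from right to left.
template <std::size_t N>
std::array<int, N> permuted_strides(const std::array<int, N>& extents,
                                    const std::array<std::size_t, N>& perm) {
  std::array<int, N> strides{};
  int stride = 1;
  for (std::size_t p = N; p-- > 0;) {
    const std::size_t dim = perm[p];
    assert(dim < N);
    strides[dim] = stride;
    stride *= extents[dim];
  }
  return strides;
}
```

With extents (Nx, Ny, Nz) = (3, 5, 2), the permutation {1, 2, 0} gives strides (1, 6, 3) for (i, j, k): ‘i’ is stride-1, ‘k’ has stride Nx, and ‘j’ has stride Nx * Nz. The identity permutation {0, 1, 2} reproduces the row-major strides (10, 2, 1).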
Multi-dimensional Indices and Linear Indices
RAJA::Layout types provide methods to convert between multi-dimensional
indices and linear indices. Recall the Layout ‘perm3a_layout’
from above that was created with the permutation ‘{2, 1, 0}’. To get the
linear index corresponding to the index triple ‘(1, 2, 0)’, you can do
this:
int lin = perm3a_layout(1, 2, 0);
The value of ‘lin’ is 7 = 1 + 2 * Nx + 0 * Nx * Ny. To get the index triple for linear index 7, you can do:
int i, j, k;
perm3a_layout.toIndices(7, i, j, k);
This sets ‘i’ to 1, ‘j’ to 2, and ‘k’ to 0.
Similarly for the Layout ‘perm3b_layout’, which was created with the permutation ‘{1, 2, 0}’:
lin = perm3b_layout(1, 2, 0);
sets ‘lin’ to 13 = 1 + 0 * Nx + 2 * Nx * Nz and:
perm3b_layout.toIndices(13, i, j, k);
sets ‘i’ to 1, ‘j’ to 2, and ‘k’ to 0.
There are more examples in the exercise file associated with this section. Feel free to experiment with them.
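The two conversions discussed in this section can be checked by hand with a small plain C++ sketch that does not use RAJA. The names Index3, to_linear, and to_indices are hypothetical. The strides are assumed to be those implied by Nx = 3, Ny = 5, Nz = 2: (1, 3, 15) for the permutation {2, 1, 0} and (1, 6, 3) for the permutation {1, 2, 0}.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Index3 = std::array<int, 3>;

// Linear index = sum over dimensions of (index * stride).
int to_linear(const Index3& idx, const Index3& strides) {
  return idx[0] * strides[0] + idx[1] * strides[1] + idx[2] * strides[2];
}

// Invert to_linear by peeling off one dimension at a time, from the
// longest stride to the shortest. 'perm' lists dimensions in that order.
Index3 to_indices(int lin, const Index3& strides,
                  const std::array<std::size_t, 3>& perm) {
  Index3 idx{};
  for (std::size_t p = 0; p < perm.size(); ++p) {
    const std::size_t dim = perm[p];
    assert(dim < 3);
    idx[dim] = lin / strides[dim];
    lin %= strides[dim];
  }
  return idx;
}
```

For the index triple (1, 2, 0), to_linear returns 7 with strides (1, 3, 15) and 13 with strides (1, 6, 3), matching the values above, and to_indices maps each linear index back to (1, 2, 0).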
One important item to note is that, by default, there is no bounds checking
on indices passed to a RAJA::View data access method or RAJA::Layout
index computation methods. Therefore, it is the responsibility of a user
to ensure that indices passed to RAJA::View and RAJA::Layout
methods are in bounds to avoid accessing data outside
of the View or computing invalid indices.
Note
RAJA provides a CMake variable RAJA_ENABLE_BOUNDS_CHECK to
turn run time bounds checking on or off when the code is compiled.
Enabling bounds checking is useful for debugging and to ensure
your code is correct. However, when enabled, bounds checking adds
noticeable run time overhead. So it should not be enabled for
a production build of your code.
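As a rough illustration of what a bounds check involves, here is a sketch in plain C++ of a checked index computation for a two-dimensional row-major layout. The function checked_linear_index is hypothetical and is not RAJA's bounds-checking mechanism. It only shows the kind of per-access comparison that adds run time overhead.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical checked index computation for an Nx x Ny row-major layout.
// Every call pays for four comparisons before the offset is computed.
int checked_linear_index(int i, int j, int Nx, int Ny) {
  assert(Nx > 0 && Ny > 0);
  if (i < 0 || i >= Nx || j < 0 || j >= Ny) {
    throw std::out_of_range("index outside layout extents");
  }
  return j + Ny * i;
}
```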
Offset Layouts Apply Offsets to Indices
The last topic we cover in this exercise is the RAJA::OffsetLayout type.
We first illustrate the concept of an offset with a C-style for-loop:
int imin = -5;
int imax = 6;
for (int ii = imin; ii < imax; ++ii) {
  ao_ref[ ii-imin ] = ii;
}
Here, the for-loop runs from ‘imin’ to ‘imax-1’ (i.e., -5 to 5). To avoid out-of-bounds negative indexing, we subtract ‘imin’ (i.e., -5) from the loop index ‘ii’.
To do the same thing with RAJA, we create a RAJA::OffsetLayout object
and use it to index into the array:
RAJA::OffsetLayout<1, int> offlayout_1D =
  RAJA::make_offset_layout<1, int>( {{imin}}, {{imax}} );
RAJA::View< int, RAJA::OffsetLayout<1, int> > aoview_1Doff(ao,
                                                           offlayout_1D);
for (int ii = imin; ii < imax; ++ii) {
  aoview_1Doff(ii) = ii;
}
RAJA::OffsetLayout is a different type than RAJA::Layout because
it contains offset information. The arguments to the
RAJA::make_offset_layout method are the index bounds.
As expected, the two dimensional case is similar. First, a C-style loop:
imin = -1;
imax = 2;
int jmin = -5;
int jmax = 5;
iter = 0;
for (int ii = imin; ii < imax; ++ii) {
  for (int jj = jmin; jj < jmax; ++jj) {
    ao_ref[ (jj-jmin) + (ii-imin) * (jmax-jmin) ] = iter;
    iter++;
  }
}
and then the same operation using a RAJA::OffsetLayout object:
RAJA::OffsetLayout<2, int> offlayout_2D =
  RAJA::make_offset_layout<2, int>( {{imin, jmin}}, {{imax, jmax}} );
RAJA::View< int, RAJA::OffsetLayout<2, int> > aoview_2Doff(ao,
                                                           offlayout_2D);
iter = 0;
for (int ii = imin; ii < imax; ++ii) {
  for (int jj = jmin; jj < jmax; ++jj) {
    aoview_2Doff(ii, jj) = iter;
    iter++;
  }
}
Note that the first argument passed to RAJA::make_offset_layout contains
the lower bounds for ‘i’ and ‘j’ and the second argument contains the upper
bounds. Also, the ‘j’ index is stride-1 by default since we did not pass
a permutation to the RAJA::make_offset_layout method, which is the same
as the non-offset Layout usage.
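The offset arithmetic in the two-dimensional example can be packaged in a small helper. The following plain C++ sketch does not use RAJA; the type OffsetIndexer2D is a hypothetical name, and it assumes half-open index ranges [imin, imax) and [jmin, jmax) with ‘j’ as the stride-1 index, as in the C-style loop above.

```cpp
#include <cassert>

// Hypothetical 2D offset indexer; this is NOT RAJA::OffsetLayout.
// Index ranges are half-open: i in [imin, imax), j in [jmin, jmax).
struct OffsetIndexer2D {
  int imin, imax;
  int jmin, jmax;

  // Number of entries covered by the index ranges.
  int size() const { return (imax - imin) * (jmax - jmin); }

  // Shift each index by its lower bound, then apply row-major
  // striding so that j is the stride-1 index.
  int operator()(int i, int j) const {
    assert(i >= imin && i < imax);
    assert(j >= jmin && j < jmax);
    return (j - jmin) + (i - imin) * (jmax - jmin);
  }
};
```

With imin = -1, imax = 2, jmin = -5, and jmax = 5, the pair (-1, -5) maps to linear index 0 and the pair (1, 4) maps to 29, the last of the 30 entries.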
Just like RAJA::Layout has a permutation, so does RAJA::OffsetLayout.
Here is an example where we permute the (i, j) index stride ordering:
std::array<RAJA::idx_t, 2> perm1D {{1, 0}};
RAJA::OffsetLayout<2> permofflayout_2D =
  RAJA::make_permuted_offset_layout<2>( {{imin, jmin}},
                                        {{imax, jmax}},
                                        perm1D );
RAJA::View< int, RAJA::OffsetLayout<2> > aoview_2Dpermoff(ao,
                                                          permofflayout_2D);
iter = 0;
for (int jj = jmin; jj < jmax; ++jj) {
  for (int ii = imin; ii < imax; ++ii) {
    aoview_2Dpermoff(ii, jj) = iter;
    iter++;
  }
}
The permutation ‘{1, 0}’ is passed as the third argument to
RAJA::make_permuted_offset_layout. From the ordering of the for-loops, we can see
that the ‘i’ index is stride-1 and the ‘j’ index has stride equal to the
extent of the ‘i’ dimension so the for-loop nest strides through
the data with unit stride.
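To check the claim that this loop nest walks the data with unit stride, here is a plain C++ sketch that does not use RAJA. The functions permuted_offset_index and fill_permuted are hypothetical names for this illustration. They assume the ‘i’ index is stride-1 and the ‘j’ index has stride imax - imin, as described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical permuted offset index: i is stride-1 and j has stride
// equal to the extent of the i dimension.
int permuted_offset_index(int i, int j, int imin, int imax, int jmin) {
  return (i - imin) + (j - jmin) * (imax - imin);
}

// Fill a flat array using the same loop order as the example above
// (j outer, i inner) and return it.
std::vector<int> fill_permuted(int imin, int imax, int jmin, int jmax) {
  assert(imin < imax && jmin < jmax);
  const std::size_t n =
      static_cast<std::size_t>(imax - imin) * (jmax - jmin);
  std::vector<int> data(n, -1);
  int iter = 0;
  for (int jj = jmin; jj < jmax; ++jj) {
    for (int ii = imin; ii < imax; ++ii) {
      data[permuted_offset_index(ii, jj, imin, imax, jmin)] = iter;
      iter++;
    }
  }
  return data;
}
```

If the loop order matches the striding order, entry n of the returned array holds the value n, which means consecutive loop iterations touched consecutive memory locations.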