.. ##
.. ## Copyright (c) Lawrence Livermore National Security, LLC and other
.. ## RAJA Project Developers. See top-level LICENSE and COPYRIGHT
.. ## files for dates and other details. No copyright assignment is required
.. ## to contribute to RAJA.
.. ##
.. ## SPDX-License-Identifier: (BSD-3-Clause)
.. ##

.. _cook-book-reductions-label:

=======================
Cooking with Reductions
=======================

Please see the following section for an overview of RAJA reductions:

* :ref:`feat-reductions-label`.

----------------------------
Reductions with RAJA::forall
----------------------------

Here is the setup for a simple reduction example::

  const int N = 1000;

  int vec[N];

  for (int i = 0; i < N; ++i) {

    vec[i] = 1;

  }

Here a simple sum reduction is performed in a C-style for loop::

  int vsum = 0;

  // Accumulate the sum of the array entries
  for (int i = 0; i < N; ++i) {

    vsum += vec[i];

  }

The results of these operations will yield the following values:

* ``vsum == 1000``

RAJA uses policy types to specify how things are implemented.

The forall *execution policy* specifies how the loop is run by the
``RAJA::forall`` method. The following discussion includes examples of several
other RAJA execution policies that could be applied.
For example, ``RAJA::seq_exec`` runs a C-style for loop sequentially on a CPU.
The ``RAJA::cuda_exec_with_reduce<256>`` policy runs the loop as a CUDA GPU
kernel with 256 threads per block and with other CUDA kernel launch parameters,
such as the number of blocks, optimized for performance with reducers::

  using exec_policy = RAJA::seq_exec;
  // using exec_policy = RAJA::omp_parallel_for_exec;
  // using exec_policy = RAJA::omp_target_parallel_for_exec<256>;
  // using exec_policy = RAJA::cuda_exec_with_reduce<256>;
  // using exec_policy = RAJA::hip_exec_with_reduce<256>;
  // using exec_policy = RAJA::sycl_exec<256>;

The reduction policy specifies how the reduction is done and must match the
execution policy.
For example, ``RAJA::seq_reduce`` does a sequential reduction and can only be
used with sequential execution policies. The ``RAJA::cuda_reduce_atomic``
policy uses atomics, if possible with the given data type, and can only be used
with CUDA execution policies. The same holds for other RAJA execution
back-ends, such as HIP and OpenMP. Here are example RAJA reduction policies
whose names indicate which execution policies they work with::

  using reduce_policy = RAJA::seq_reduce;
  // using reduce_policy = RAJA::omp_reduce;
  // using reduce_policy = RAJA::omp_target_reduce;
  // using reduce_policy = RAJA::cuda_reduce_atomic;
  // using reduce_policy = RAJA::hip_reduce_atomic;
  // using reduce_policy = RAJA::sycl_reduce;

Here a simple sum reduction is performed using RAJA::

  RAJA::ReduceSum<reduce_policy, int> vsum(0);

  RAJA::forall<exec_policy>( RAJA::RangeSegment(0, N),
    [=](RAJA::Index_type i) {

    vsum += vec[i];

  });

The results of these operations will yield the following values:

* ``vsum.get() == 1000``

Another option for the execution policy when using the CUDA or HIP back-ends
is the base policies, which have a boolean parameter to choose between the
general use ``cuda/hip_exec`` policy and the ``cuda/hip_exec_with_reduce``
policy::

  // static constexpr bool with_reduce = ...;
  // using exec_policy = RAJA::cuda_exec_base<with_reduce, 256>;
  // using exec_policy = RAJA::hip_exec_base<with_reduce, 256>;

Another option for the reduction policy when using the CUDA or HIP back-ends
is the base policies, which have a boolean parameter to choose between the
atomic ``cuda/hip_reduce_atomic`` policy and the non-atomic ``cuda/hip_reduce``
policy::

  // static constexpr bool with_atomic = ...;
  // using reduce_policy = RAJA::cuda_reduce_base<with_atomic>;
  // using reduce_policy = RAJA::hip_reduce_base<with_atomic>;
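
The idea behind the boolean-parameter base policies can be illustrated without
RAJA. The following is a minimal sketch, assuming nothing about how RAJA
implements its policies: the tag types ``exec_general``, ``exec_with_reduce``,
``reduce_plain``, and ``reduce_atomic``, the aliases ``exec_base`` and
``reduce_base``, and the helper ``uses_atomics`` are all hypothetical stand-ins
invented for this example. It only shows how a compile-time ``bool`` can select
between two policy types.

```cpp
#include <type_traits>

// Hypothetical stand-in tag types for illustration only.
// They are NOT RAJA types and carry no RAJA behavior.
struct exec_general {};      // plays the role of cuda/hip_exec
struct exec_with_reduce {};  // plays the role of cuda/hip_exec_with_reduce
struct reduce_plain {};      // plays the role of cuda/hip_reduce
struct reduce_atomic {};     // plays the role of cuda/hip_reduce_atomic

// A boolean template parameter picks one of two execution policies
// at compile time.
template <bool with_reduce>
using exec_base =
    std::conditional_t<with_reduce, exec_with_reduce, exec_general>;

// The same pattern applied to the reduction policy.
template <bool with_atomic>
using reduce_base =
    std::conditional_t<with_atomic, reduce_atomic, reduce_plain>;

// Reports whether the selected (stand-in) reduction policy is the
// atomic variant.
template <typename Policy>
constexpr bool uses_atomics()
{
  return std::is_same<Policy, reduce_atomic>::value;
}
```

With aliases of this shape, a single ``static constexpr bool`` in application
code can switch a kernel between the two variants, for example
``reduce_base<true>`` versus ``reduce_base<false>``, without touching the loop
body.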