Continuous Integration (CI) Testing Maintenance Tasks¶
In Continuous Integration (CI) Testing, we described RAJA CI workflows. This section describes how to perform common RAJA CI testing maintenance tasks.
GitLab CI Tasks¶
The tasks in this section apply to GitLab CI testing on Livermore Computing (LC) platforms. LC folks and others maintain Confluence pages with useful information on setting up and maintaining GitLab CI for a project, mirroring a GitHub project to GitLab, etc. Please refer to LC GitLab CI for such information.
Changing build and test configurations¶
The build for each test we run is defined by a Spack spec in one of two places, depending on whether it is shared with other projects or it is specific to RAJA. The details are described in Launching CI pipelines (step 2).
Remove a configuration¶
To remove a RAJA-specific test configuration, simply delete the entry for it in the RAJA/.gitlab/jobs/<MACHINE>.yml file where it is defined. Here, MACHINE is the name of an LC platform where GitLab CI is run.
To remove a shared configuration, it must be removed from the appropriate gitlab/radiuss-jobs/<MACHINE>.yml file in the RADIUSS Spack Configs project. Create a branch there, remove the job entry, and create a pull request.
Important
The RADIUSS Spack Configs project is used by several other projects. When changing a shared configuration file, please make sure the change is OK with those other projects.
Add a configuration¶
To add a RAJA-specific test configuration, add an entry for it to the RAJA/.gitlab/jobs/<MACHINE>.yml file, where MACHINE is the name of the LC platform where it will be run. When adding a test configuration, it is important to note two items that must be specified properly:
A unique job label, which identifies it in the machine configuration file and also on a web page for a GitLab CI pipeline
A build Spack spec, which identifies the compiler and version, compiler flags, build options, etc.
For example, an entry for a build using the clang 12.0.1 compiler with CUDA 11.5.0 on the LC lassen machine would be something like this:
clang_12_0_1_cuda_11_5_0:
  variables:
    SPEC: " ~shared +openmp +tests +cuda cuda_arch=70 %clang@12.0.1 ^cuda@11.5.0"
  extends: .job_on_lassen
Here, we enable OpenMP and CUDA, both of which must be enabled to test those RAJA back-ends, and specify the CUDA target architecture ‘sm_70’.
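Because each job label must be unique within a machine configuration file, a quick check before committing can help. The following is a minimal sketch, not part of the RAJA tooling: the function name duplicate_job_labels is hypothetical, and it assumes that job labels are the unindented YAML keys in the file, as in the example above.

```shell
# Print every top-level job label that appears more than once in a jobs file.
# Assumption: job labels are unindented YAML keys such as "gcc_8_1_0:".
duplicate_job_labels() {
  jobs_file="$1"
  # Keep lines that start with a non-space, non-comment character and
  # contain a colon, strip everything from the first colon onward, then
  # report the names that occur more than once.
  grep -E '^[^[:space:]#][^:]*:' "$jobs_file" \
    | sed -E 's/:.*$//' \
    | sort \
    | uniq -d
}
```

Run it as duplicate_job_labels on the RAJA/.gitlab/jobs/<MACHINE>.yml file that was edited. Empty output means no label is repeated.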
To add a shared configuration, it must be added to the appropriate gitlab/radiuss-jobs/<MACHINE>.yml file in the RADIUSS Spack Configs project. Create a branch there, add the job entry, and create a pull request.
Important
The RADIUSS Spack Configs project is used by several other projects. When changing a shared configuration file, please make sure the change is OK with those other projects.
Modifying a configuration¶
To change an existing configuration, change the relevant information in the configuration in the appropriate RAJA/.gitlab/jobs/<MACHINE>.yml file. Make sure to also modify the job label as needed, so that it describes the configuration and remains unique.
To modify a shared configuration, it must be changed in the appropriate gitlab/radiuss-jobs/<MACHINE>.yml file in the RADIUSS Spack Configs project. Create a branch there, modify the job entry, and create a pull request.
Important
Build spec information used in RAJA GitLab CI pipelines must exist in the compilers.yaml file and/or packages.yaml file for the appropriate system type in the RADIUSS Spack Configs repo.
If the desired entry is not there, but exists in a newer version of the RADIUSS Spack Configs project, update the RAJA submodule to use the newer version. If the information does not exist in any version of the RADIUSS Spack Configs project, create a branch there, add the needed spec info, and create a pull request. Then, when that PR is merged, update the RAJA submodule for the RADIUSS Spack Configs project to the new version.
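The submodule update described above can be sketched with standard git commands. This is an illustration only: the function name update_spack_configs_submodule is hypothetical, and both the submodule path inside the RAJA clone and the tag or commit to move to are passed as arguments because neither is stated here.

```shell
# Move a submodule to a given tag or commit and stage the change.
# Assumptions: run from the top of the RAJA clone; $1 is the path of the
# RADIUSS Spack Configs submodule and $2 is the tag or commit to check out.
update_spack_configs_submodule() {
  submodule_path="$1"
  new_ref="$2"
  if [ -z "$submodule_path" ] || [ -z "$new_ref" ]; then
    echo "usage: update_spack_configs_submodule <submodule-path> <tag-or-commit>" >&2
    return 2
  fi
  # Make sure the submodule is checked out, fetch new history, switch to
  # the requested version, and stage the updated submodule pointer.
  git submodule update --init -- "$submodule_path" &&
    git -C "$submodule_path" fetch origin &&
    git -C "$submodule_path" checkout "$new_ref" &&
    git add "$submodule_path"
}
```

After the function succeeds, commit the staged change on a RAJA branch and open a pull request so that the GitLab CI pipelines run against the new version.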
Changing run parameters¶
The parameters for each system/scheduler on which we run GitLab CI for RAJA, such as job time limits, resource allocations, etc., are defined in the RAJA/.gitlab/custom-jobs-and-variables.yml file. This information can remain as is, for the most part, and should not be changed unless absolutely necessary.
For example, sometimes a particular job will take longer to build and run than the default allotted time for jobs on a machine. In this case, the time for the job can be adjusted in the job entry in the associated RAJA/.gitlab/jobs/<MACHINE>.yml file. For example:
gcc_8_1_0:
  variables:
    SPEC: " ${PROJECT_RUBY_VARIANTS} %gcc@8.1.0 ${PROJECT_RUBY_DEPS}"
    RUBY_BUILD_AND_TEST_JOB_ALLOC: "--time=60 --nodes=1"
  extends: .job_on_ruby
This example sets the build and test allocation time to 60 minutes and the run resource to one node.
Allowing failures¶
Sometimes a shared job configuration is known to fail for RAJA. To allow the job to fail without failing the CI check associated with it, we can annotate the job with allow_failure: true. For example:
ibm_clang_9_0_0:
  variables:
    SPEC: " ${PROJECT_LASSEN_VARIANTS} %clang@ibm.9.0.0 ${PROJECT_LASSEN_DEPS}"
  extends: .job_on_lassen
  allow_failure: true
Important
When a shared job needs to be modified for RAJA specifically, we call that “overriding”: the job label must be kept the same as in the gitlab/radiuss-jobs/<MACHINE>.yml file in the RADIUSS Spack Configs project, and the RAJA-specific job can be adapted. If you override a shared job, please add a comment describing the change in the RAJA/.gitlab/jobs/<MACHINE>.yml file where the job is overridden.
Azure CI Tasks¶
The tasks in this section apply to RAJA Azure Pipelines CI.
Changing Builds/Container Images¶
The builds we run in Azure are defined in the RAJA/azure-pipelines.yml file.
Linux/Docker¶
To update or add a new compiler / job to Azure CI, we need to edit both azure-pipelines.yml and Dockerfile.
If we want to add a new Azure pipeline to build with compilerX, then in azure-pipelines.yml we can add the job like so:
- job: Docker
  ...
  strategy:
    matrix:
      ...
      compilerX:
        docker_target: compilerX
Here, compilerX: defines the name of a job in Azure. docker_target: compilerX defines a variable docker_target, which is used to determine what part of the Dockerfile to run.
In the Dockerfile, we will want to add a section that defines the commands for the compilerX job:
FROM ghcr.io/rse-ops/compilerX-ubuntu-20.04:compilerX-XXX AS compilerX
ENV GTEST_COLOR=1
COPY . /home/raja/workspace
WORKDIR /home/raja/workspace/build
RUN cmake -DCMAKE_CXX_COMPILER=compilerX ... && \
make -j 6 && \
ctest -T test --output-on-failure && \
cd .. && rm -rf build
Each of our Docker builds is built on a base image maintained by RSE-Ops; a table of available base containers can be found here. We can also add a target name to each build with AS ... at the end of its FROM line. This target name corresponds to the docker_target: ... value defined in azure-pipelines.yml.
The base containers are shared across multiple projects and are regularly rebuilt. If bugs are fixed in the base containers, the changes will be automatically propagated to all projects using them in their Docker builds.
Check here for a list of all currently available RSE-Ops containers. Please see the RSE-Ops Containers Project on GitHub to get new containers built that aren’t yet available.
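Since every docker_target value in azure-pipelines.yml has to correspond to an AS name in the Dockerfile, a small consistency check can catch a missing or misspelled target before the pipeline runs. The sketch below is an illustration, not part of the RAJA repository: the function name missing_docker_stages is hypothetical, and it assumes that docker_target values are unquoted and that each stage is declared on a single FROM ... AS <name> line.

```shell
# Print docker_target values from the pipeline file that have no matching
# "FROM ... AS <name>" stage in the Dockerfile.
# Assumptions: unquoted docker_target values, one FROM ... AS line per stage.
missing_docker_stages() {
  pipeline_file="${1:-azure-pipelines.yml}"
  docker_file="${2:-Dockerfile}"
  # Target names requested by the Azure test matrix.
  wanted=$(sed -nE 's/^[[:space:]]*docker_target:[[:space:]]*([^[:space:]#]+).*$/\1/p' "$pipeline_file" | sort -u)
  # Stage names declared in the Dockerfile.
  stages=$(sed -nE 's/^FROM[[:space:]]+.*[[:space:]]+[Aa][Ss][[:space:]]+([^[:space:]]+)[[:space:]]*$/\1/p' "$docker_file" | sort -u)
  # Report each requested target that is not an exact match for a stage name.
  for target in $wanted; do
    printf '%s\n' "$stages" | grep -Fxq -- "$target" || printf '%s\n' "$target"
  done
}
```

Called from the top of the RAJA repository with no arguments, it reads azure-pipelines.yml and Dockerfile and prints nothing when every target has a stage. With Docker installed, a single stage can usually be built locally with docker build --target <name> ., which is a convenient way to try a new section before pushing it.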
Windows / MacOS¶
We run our Windows / MacOS builds directly on the Azure virtual machine instances. To update the Windows / MacOS instance, we can change the pool under - job: Windows or - job: Mac:
- job: Windows
  ...
  pool:
    vmImage: 'windows-2019'
  ...
- job: Mac
  ...
  pool:
    vmImage: 'macOS-latest'
Changing Build/Run Parameters¶
Linux/Docker¶
We can edit the build and run configuration of each Docker build in its RUN command, for example by adding CMake options or by changing the parallel build value N in make -j N to adjust throughput.
Each base image is built using Spack. For the most part, the container environments are set up to run our CMake and build commands out of the box. However, there are a few exceptions where we need to spack load specific modules into the path.
Clang requires us to load LLVM for OpenMP runtime libraries:
. /opt/spack/share/spack/setup-env.sh && spack load llvm
CUDA requires us to load cuda for the CUDA runtime:
. /opt/spack/share/spack/setup-env.sh && spack load cuda
HIP requires us to load hip and llvm-amdgpu for the HIP runtime and llvm-amdgpu runtime libraries:
. /opt/spack/share/spack/setup-env.sh && spack load hip llvm-amdgpu
SYCL requires us to run setvars.sh:
source /opt/view/setvars.sh
Windows / MacOS¶
Windows and MacOS build / run parameters can be configured directly in azure-pipelines.yml. CMake options can be configured with CMAKE_EXTRA_FLAGS for each job. The -j value can also be edited directly in the Azure script definitions for each job.
The commands executed to configure, build, and test RAJA for each pipeline in Azure are located in the RAJA/Dockerfile file. Each pipeline section begins with a line that ends with AS ..., where the ellipsis is the name of a build-test pipeline. The name label matches an entry in the Docker test matrix in the RAJA/azure-pipelines.yml file mentioned above.
RAJA Performance Suite CI Tasks¶
The RAJA Performance Suite project CI testing processes, directory/file structure, and dependencies are nearly identical to those for RAJA, which are described in Continuous Integration (CI) Testing. Specifically,
The RAJA Performance Suite GitLab CI process is driven by the RAJAPerf/.gitlab-ci.yml file.
The custom-jobs-and-variables.yml and subscribed-pipelines.yml files reside in the RAJAPerf/.gitlab directory.
The build_and_test.sh script resides in the RAJAPerf/scripts/gitlab directory.
The RAJAPerf/Dockerfile drives the Azure testing pipelines.
The Performance Suite GitLab CI uses the uberenv and radiuss-spack-configs versions located in the RAJA submodule to make the testing consistent across projects and avoid redundancy. This is reflected in the RAJAPerf/.uberenv_config.json file, which points at the relevant RAJA submodule locations (that is, the paths contain tpl/RAJA/...).
Apart from this minor difference, all CI maintenance and development tasks for the RAJA Performance Suite follow the same pattern that is described in Continuous Integration (CI) Testing Maintenance Tasks.