OpenACC

Introduction

OpenACC (for open accelerators) is a programming standard for parallel computing developed by Cray, CAPS, Nvidia and PGI. The standard is designed to simplify parallel programming of heterogeneous CPU/GPU systems. [1]

As in OpenMP, the programmer annotates C, C++ and Fortran source code with compiler directives and additional functions to identify the regions that should be accelerated (see here for more information). As with OpenMP 4.0 and newer, the code can run on both the CPU and the GPU.
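
To illustrate the directive style described above, here is a minimal sketch of an annotated loop. The function name saxpy and its parameters are hypothetical and not part of the example used later on this page; the parallel loop directive and the copyin/copy data clauses are standard OpenACC. A compiler without OpenACC support ignores the pragma, so the same source also runs serially on the CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the original page): y = a*x + y.
 * With OpenACC enabled, the directive asks the compiler to offload the
 * loop; otherwise the pragma is ignored and the loop runs serially. */
void saxpy(int n, float a, const float *x, float *y)
{
    assert(n >= 0 && x != NULL && y != NULL);

    /* copyin: x is only read on the device.
     * copy:   y is transferred to the device and back to the host. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

With the PGI compilers the directive only takes effect when OpenACC is enabled at compile time (the -acc option); without it the function behaves like ordinary serial C.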

Support for OpenACC is available in commercial compilers from PGI. Tutorials and documentation can be found at http://www.openacc.org/.

Example: Jacobi Iteration

The code

The following example is taken from this link; the code can be downloaded here or with wget:

wget https://raw.githubusercontent.com/parallel-forall/cudacasts/master/ep3-first-openacc-program/laplace2d.c
wget https://raw.githubusercontent.com/parallel-forall/cudacasts/master/ep3-first-openacc-program/timer.h

The code performs a Jacobi iteration on a 4096x4096 grid.
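
For orientation, the core of a Jacobi iteration can be sketched as follows. This is not the code from laplace2d.c: the function names jacobi_sweep and jacobi_solve, the row-major array layout and the argument lists are assumptions made for illustration. Each sweep replaces every interior grid point by the average of its four neighbours and returns the largest change, which is the kind of quantity a convergence check is based on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One Jacobi sweep on an n x m row-major grid (illustrative sketch).
 * Interior points of a_new become the average of their four neighbours
 * in a; boundary points of a_new are not written. Returns the largest
 * absolute change of any interior point. */
double jacobi_sweep(int n, int m, const double *a, double *a_new)
{
    assert(n >= 3 && m >= 3);
    double error = 0.0;
    for (int j = 1; j < n - 1; ++j) {
        for (int i = 1; i < m - 1; ++i) {
            double v = 0.25 * (a[j * m + i - 1] + a[j * m + i + 1] +
                               a[(j - 1) * m + i] + a[(j + 1) * m + i]);
            double diff = v - a[j * m + i];
            if (diff < 0.0) diff = -diff;
            if (diff > error) error = diff;
            a_new[j * m + i] = v;
        }
    }
    return error;
}

/* Repeat sweeps until the largest change is at most tol or iter_max
 * sweeps have been done. a holds the start values including the fixed
 * boundary; a_new is scratch space of the same size. Returns the number
 * of sweeps performed. */
int jacobi_solve(int n, int m, double *a, double *a_new,
                 double tol, int iter_max)
{
    int iter = 0;
    double error = tol + 1.0;
    memcpy(a_new, a, (size_t)n * (size_t)m * sizeof(double));
    while (error > tol && iter < iter_max) {
        error = jacobi_sweep(n, m, a, a_new);
        /* Copy the updated interior back for the next sweep. */
        for (int j = 1; j < n - 1; ++j)
            for (int i = 1; i < m - 1; ++i)
                a[j * m + i] = a_new[j * m + i];
        ++iter;
    }
    return iter;
}
```

Two arrays are used because every update in a sweep must read the values of the previous sweep only. That also makes all updates within one sweep independent of each other, which is what makes the loops a natural candidate for acceleration.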

Modules

Since we need a compiler with OpenACC support and want to use the GPUs, we load the following modules:

module load PGI CUDA-Toolkit

Serial execution of the program

The serial version of the program can be compiled with

pgcc -fast -o laplace2d_ser laplace2d.c 

To run the executable on a compute node we can use the command

srun -p carl.p ./laplace2d_ser

which after some time should print

Jacobi relaxation Calculation: 4096 x 4096 mesh
   0, 0.250000
 100, 0.002397
 200, 0.001204
 300, 0.000804
 400, 0.000603
 500, 0.000483
 600, 0.000403
 700, 0.000345
 800, 0.000302
 900, 0.000269
total: 92.824342 s
srun: error: mpcl001: task 0: Exited with exit code 20

The total runtime may differ depending on the compute node used and on other jobs that may be running on that node (use --exclusive to rule that out). Note: the cause of exit code 20 (or any other number) is not clear at the moment, but it may be ignored.