Difference between revisions of "OpenACC"


Revision as of 14:16, 27 March 2017

Introduction

OpenACC (for open accelerators) is a programming standard for parallel computing developed by Cray, CAPS, Nvidia and PGI. The standard is designed to simplify parallel programming of heterogeneous CPU/GPU systems. [1]

As in OpenMP, the programmer annotates C, C++ and Fortran source code with compiler directives and additional functions to identify the regions that should be accelerated (see here for more information). As with OpenMP 4.0 and newer, code can be started on both the CPU and the GPU.
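To illustrate what such an annotation looks like, here is a minimal sketch in C. The function name saxpy and its parameters are hypothetical and are not part of the example discussed on this page; only the directive syntax is standard OpenACC.

```c
/* Hypothetical helper (not from this page): computes y[i] = a * x[i] + y[i].
 * The directive asks an OpenACC compiler to offload the loop to the
 * accelerator, copying x to the device and y to the device and back.
 * A compiler without OpenACC support ignores the pragma and runs the
 * loop serially on the CPU. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Because the directive is only a hint, the same source file can be built as a serial program or as an accelerated one, depending on the compiler flags.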

OpenACC support is available in the commercial compilers from PGI. Tutorials and documentation can be found at http://www.openacc.org/.

Example: Jacobi Iteration

The code

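The following example is taken from CUDACasts episode 3 (https://devblogs.nvidia.com/parallelforall/cudacasts-episode-3-your-first-openacc-program/). The code can be downloaded from https://github.com/parallel-forall/cudacasts/tree/master/ep3-first-openacc-program or fetched directly using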

wget https://raw.githubusercontent.com/parallel-forall/cudacasts/master/ep3-first-openacc-program/laplace2d.c
wget https://raw.githubusercontent.com/parallel-forall/cudacasts/master/ep3-first-openacc-program/timer.h

The code performs a Jacobi iteration on a 4096x4096 grid.
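For readers unfamiliar with the method, a single Jacobi sweep can be sketched as follows. This is an illustration only and is not copied from laplace2d.c; the function name jacobi_sweep, the row-major array layout, and the use of the largest change as the error measure are assumptions.

```c
/* Hypothetical sketch of one Jacobi sweep (not the downloaded code).
 * The grid has n rows and m columns, stored row-major. Each interior
 * point of `next` becomes the average of its four neighbours in `cur`;
 * boundary points are left untouched. Returns the largest absolute
 * change of any interior point, usable as a convergence measure. */
double jacobi_sweep(int n, int m, const double *cur, double *next)
{
    double err = 0.0;
    for (int j = 1; j < n - 1; ++j) {
        for (int i = 1; i < m - 1; ++i) {
            double v = 0.25 * (cur[j * m + (i + 1)] + cur[j * m + (i - 1)]
                             + cur[(j - 1) * m + i] + cur[(j + 1) * m + i]);
            double d = v - cur[j * m + i];
            if (d < 0.0) {
                d = -d;
            }
            if (d > err) {
                err = d;
            }
            next[j * m + i] = v;
        }
    }
    return err;
}
```

A full solver would call such a sweep repeatedly, swapping or copying the two arrays after each pass, until the returned error drops below a tolerance or an iteration limit is reached.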

Modules

Since we need a compiler with OpenACC support and we want to use the GPUs, we load the following modules:

module load PGI CUDA-Toolkit

Serial execution of the program

The serial version of the program can be compiled with

pgcc -fast -o laplace2d_ser laplace2d.c 

To run the executable on a compute node we can use the command

srun -p carl.p ./laplace2d_ser

which after some time should print

Jacobi relaxation Calculation: 4096 x 4096 mesh
   0, 0.250000
 100, 0.002397
 200, 0.001204
 300, 0.000804
 400, 0.000603
 500, 0.000483
 600, 0.000403
 700, 0.000345
 800, 0.000302
 900, 0.000269
total: 92.824342 s
srun: error: mpcl001: task 0: Exited with exit code 20

The total runtime may differ depending on the compute node used and on other jobs that may be running on that node (use --exclusive to rule that out). Note: the cause of the exit code 20 (or any other number) is not clear at the moment, but the error may be ignored.