== Introduction ==
OpenACC (for open accelerators) is a programming standard for parallel computing developed by Cray, CAPS, Nvidia and PGI. The standard is designed to simplify parallel programming of heterogeneous CPU/GPU systems. [1]
As in OpenMP, the programmer can annotate C, C++ and Fortran source code with compiler directives and additional functions to identify the regions that should be accelerated (see here for more information). As with OpenMP 4.0 and newer, code can be run on both the CPU and the GPU.
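As a minimal sketch of such an annotation (a hypothetical loop, not taken from the example below), a single directive is enough to ask the compiler to offload a loop to the accelerator:

 /* Hypothetical example: the directive asks the compiler to run the
    loop on the accelerator, copying a and b in and c back out. */
 void vec_add(const float *a, const float *b, float *c, int n)
 {
     #pragma acc parallel loop copyin(a[0:n], b[0:n]) copyout(c[0:n])
     for (int i = 0; i < n; i++)
         c[i] = a[i] + b[i];
 }

Without the directive the same code still compiles and runs serially, which is what makes incremental acceleration of an existing program possible.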
Support for OpenACC is available in commercial compilers from [[PGI Compiler|PGI]]. Tutorials and documentation can be found at http://www.openacc.org/.
== Example: Jacobi Iteration ==
'''The code'''
The following example is taken from [https://devblogs.nvidia.com/parallelforall/cudacasts-episode-3-your-first-openacc-program/ this link] and the code can be downloaded [https://github.com/parallel-forall/cudacasts/tree/master/ep3-first-openacc-program here] or using
 wget https://raw.githubusercontent.com/parallel-forall/cudacasts/master/ep3-first-openacc-program/laplace2d.c
 wget https://raw.githubusercontent.com/parallel-forall/cudacasts/master/ep3-first-openacc-program/timer.h
The code performs a Jacobi iteration on a 4096x4096 grid.
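The heart of the program is a stencil update in which every interior grid point is replaced by the average of its four neighbours, repeated until the largest change per sweep falls below a tolerance. A simplified sketch of one serial sweep (not the exact contents of laplace2d.c) looks like this:

 /* Simplified sketch of one Jacobi sweep: Anew is computed from A,
    and the largest pointwise change is returned as the error. */
 #include <math.h>
 double jacobi_sweep(int n, int m, double A[n][m], double Anew[n][m])
 {
     double error = 0.0;
     for (int j = 1; j < n - 1; j++) {
         for (int i = 1; i < m - 1; i++) {
             /* Average of the four direct neighbours. */
             Anew[j][i] = 0.25 * (A[j][i+1] + A[j][i-1]
                                + A[j-1][i] + A[j+1][i]);
             error = fmax(error, fabs(Anew[j][i] - A[j][i]));
         }
     }
     return error;
 }

The number pairs printed every 100 iterations in the output below are presumably the iteration count and this error value.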
'''Modules'''
Since we need a compiler with OpenACC support and we want to use the GPUs, we load the following modules:
 module load PGI CUDA-Toolkit
'''Serial execution of the program'''
The serial version of the program can be compiled with
 pgcc -fast -o laplace2d_ser laplace2d.c
To run the executable on a compute node, we can use the command
 srun -p carl.p ./laplace2d_ser
which after some time should print
 Jacobi relaxation Calculation: 4096 x 4096 mesh
 0, 0.250000
 100, 0.002397
 200, 0.001204
 300, 0.000804
 400, 0.000603
 500, 0.000483
 600, 0.000403
 700, 0.000345
 800, 0.000302
 900, 0.000269
 total: 92.824342 s
 srun: error: mpcl001: task 0: Exited with exit code 20
The total runtime may differ depending on the compute node used and on other jobs that may be running on that node (use --exclusive to rule that out). Note: the cause of exit code 20 (or any other number) is not clear at the moment, but it may be ignored.
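For example, exclusive access to a node could be requested like this (same partition as above; with srun the option is spelled --exclusive):

 srun --exclusive -p carl.p ./laplace2d_ser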