Workshop Nvidia HPC SDK



The new NVIDIA HPC SDK significantly reshapes the diverse GPU metaprogramming approaches previously taken by OpenACC and Thrust/Bolt. With the new GPU-enabled std::par framework, developers get vendor-neutral access to multicore CPUs and GPUs in a more unified fashion. In this webinar, we explain how to start using std::par in existing applications in typical software engineering scenarios. Furthermore, we focus on the new Nsight profiling tools, which lift many of the limitations of the previous-generation NVIDIA Visual Profiler. Finally, the webinar covers the new CUDA Fortran features, as well as the Python bindings essential for any modern compute application.

Date and Time

Tuesday, July 6th, from 10am to 1pm. The workshop will be held online; details will follow shortly before the meeting.


Please register for the course by signing up for it in Stud.IP (if the link does not work for you, please search for the workshop number 05.WR.2002). Please note that the course will probably take place outside of Stud.IP. Details will be provided by e-mail.


Session 1 (10:00–11:30): New NVIDIA HPC SDK for More Productive GPU Computing

  • New focuses of NVIDIA HPC SDK: performance, portability and productivity
  • How the new HPC SDK is different from CUDA Toolkit and PGI Compiler Suite
  • C++17 std::par constructs as an alternative to OpenACC directives and CUDA Thrust
    • Essential elements: containers, iterators and lambda functions
    • Thrust-like counting iterators
    • Explicit and implicit memory transfers
  • NVIDIA HPC SDK in the programming ecosystem
    • Intermixing HPC SDK and CUDA Toolkit
    • CMake support for NVC++ compiler
    • Switching between std::par backends, targeting different GPUs with SyclParallelSTL
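
To make the "containers, iterators and lambda functions" item above concrete, here is a minimal sketch of the C++17 parallel algorithms that std::par builds on. It is not taken from the workshop materials; the function names `sin_into` and `sum_of_squares` are hypothetical and chosen only for illustration. The code uses only standard C++17, so it should also build with any host compiler that ships the parallel algorithms.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <execution>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: writes sin(x) for every element of `in` into `out`.
// The container supplies the iterators, the lambda supplies the per-element
// work, and the execution policy lets the implementation run it in parallel.
void sin_into(const std::vector<double>& in, std::vector<double>& out)
{
    assert(in.size() == out.size());  // caller must size the output first
    std::transform(std::execution::par_unseq,
                   in.begin(), in.end(), out.begin(),
                   [](double x) { return std::sin(x); });
}

// Hypothetical helper: parallel sum of squares of all elements of `v`.
// transform_reduce applies the lambda to each element, then combines the
// results with std::plus, starting from the initial value 0.0.
double sum_of_squares(const std::vector<double>& v)
{
    return std::transform_reduce(std::execution::par_unseq,
                                 v.begin(), v.end(), 0.0,
                                 std::plus<>{},
                                 [](double x) { return x * x; });
}
```

With the HPC SDK, such a file would typically be compiled with `nvc++ -stdpar` to offload the algorithms to a GPU, or with `nvc++ -stdpar=multicore` to target CPU threads. In the GPU case the data held by `std::vector` is, as far as I know, migrated automatically through managed memory, which corresponds to the "implicit memory transfers" item in the outline.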

Session 2 (11:40–13:00): Profiling, language and binding features of NVIDIA HPC SDK

  • Next-generation profiling tools: Nsight Systems and Nsight Compute
    • Connecting Nsight profilers to remote cloud GPUs
  • Fortran 2003/2008 and CUDA Fortran in NVIDIA HPC SDK
  • Python bindings for GPU applications
    • Using NVC++ compiler to build Python API around native GPU code
    • Preparing a pybind11 wrapper for C++17 std::par and CUDA Fortran applications

Course Materials

The online session will be accompanied by an optional offline practical hands-on session, Q&A, and a review of submissions. Attendees will have access to a JupyterLab environment with NVIDIA GPUs for the duration of the sessions. All corresponding presentations and code samples will be available to attendees as a downloadable package.

The slides from the workshop and more are provided here:

  • Nvidia HPC SDK and std::par (the slides that were shown during the workshop) [PDF]
  • Advanced C++ Parallel STL by example of Thrust [PDF]
  • Next generation profiling tools: NSight Systems and NSight Compute [PDF]

The code example is available on GitHub:

Instructions for using the example (not yet complete; I am still trying to get it to work):

$ git clone ...
$ cd arrsin
$ module load hpc-env/9.2 CMake tbb NVHPC/21.3-GCC-9.2.0
$ mkdir build
$ cd build
$ cmake ..
$ make   # this will fail, working on a solution


Information about the workshop is summarized in this flyer.