MDCS with GPUs
Latest revision as of 15:28, 25 January 2018

This is currently experimental.

SLURM Integration

In order to use GPUs within a Matlab worker you need a slightly modified client configuration: replace the files from the SLURM integration with the files from MDCS_SLURM-Integration_R2017b_withGPU.zip. Currently only R2017b is supported. Everything else regarding the configuration of MDCS is unchanged, and everything that worked before will still work.

Running a GPU Job

Here is an example job script to test GPU use:

% little example to test GPU computing with Matlab
gpuinfo = gpuDevice;

% example from https://blogs.mathworks.com/loren/2012/02/06/using-gpus-in-matlab/

% first do a CPU FFT calculation
A1 = rand(3000,3000);
tic;
B1 = fft(A1);
time1 = toc;

% now copy A1 to GPU and repeat FFT
A2 = gpuArray(A1);
tic;
B2 = fft(A2);
wait(gpuinfo);  % GPU operations run asynchronously; wait for completion before stopping the timer
time2 = toc;

% calculate speedup of FFT (ignores overhead from data transfer time)
speedUp = time1/time2;

Copy this to a file named 'test_gpu.m'. Then connect to the cluster with the commands:

sched = parcluster('CARL');
sched.AdditionalProperties.memory='8G';
sched.AdditionalProperties.partition='mpcg.p';
sched.AdditionalProperties.ngpus=1;

The last two commands make sure your job runs on one of the GPU nodes (in the partition mpcg.p) using 1 GPU (per node). Note that your job will never start if you request a partition without GPU nodes (such as the default carl.p) or if you ask for more than 2 GPUs per node. Next you can submit a job with

job = batch(sched, 'test_gpu');

which shows you the requested resources. After some time (you can also monitor the job directly on the cluster) you should see with

job.State

that the job is finished. Load the job data with

jd = load(job)

after which you should see:

jd = 
 struct with fields:
        A1: [3000×3000 double]
        A2: [1×1 gpuArray]
        B1: [3000×3000 double]
        B2: [1×1 gpuArray]
       ans: 'finished'
   gpuinfo: [1×1 parallel.gpu.CUDADevice]
        jd: [1×1 struct]
   speedUp: 0.1740
     time1: 0.1720
     time2: 0.9882
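
Instead of polling job.State by hand, the client can also block until the job completes. A minimal sketch using the standard Parallel Computing Toolbox functions wait and load, assuming the job object from the batch call above:

 % block until the batch job is done, then load its workspace
 wait(job);                % returns once the job reaches the 'finished' state
 jd = load(job);           % load the job's variables into a struct
 fprintf('FFT speed-up on the GPU: %.3f\n', jd.speedUp);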

You will also see a few warnings related to A2 and B2 (Matlab tries to restore these arrays on your local GPU, which is of course not possible). You can ignore these warnings (to get rid of them, remove A2 and B2 from memory before the end of the job script). The timing shows that the code is slower on the GPU, maybe because the array is too small.
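
As suggested, the warnings can be avoided by bringing the results back to the host and clearing the gpuArray variables at the end of test_gpu.m. A minimal sketch:

 % at the end of test_gpu.m: copy the result back to host memory and
 % drop the gpuArray variables so the client never tries to restore
 % them on a (possibly absent) local GPU
 B2 = gather(B2);   % B2 becomes an ordinary double array on the host
 clear A2;          % remove the remaining gpuArray from the workspace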

Notes

  • The example uses a serial job, and this is recommended for GPU computing: a single Matlab process then has access to a single GPU. It is not known whether Matlab can also handle multiple GPUs per process.
  • In principle you could also use a pool of workers, each accessing a GPU. However, with e.g. 7 workers you would need to make sure that they all run on separate nodes so that each has access to its own GPU. It is not known whether Matlab can share GPUs between different workers.
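
For the multi-worker case in the second note, a hedged sketch of how each worker in an open pool could select a device (gpuDevice, gpuDeviceCount and labindex are standard Parallel Computing Toolbox functions; the round-robin device numbering is an assumption, and whether sharing a GPU between workers behaves well is untested, as noted above):

 % sketch: let every worker of an open pool pick a GPU on its node
 spmd
     ngpu = gpuDeviceCount;                        % GPUs visible on this node
     dev  = gpuDevice(mod(labindex-1, ngpu) + 1);  % round-robin device selection
     fprintf('worker %d uses GPU %s\n', labindex, dev.Name);
 end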