MDCS with GPUs
To use GPUs within a Matlab worker, you need a slightly modified client configuration: replace the files from the SLURM integration with the files from this file. Currently only R2017b is supported. The rest of the MDCS configuration is unchanged, and everything that worked before will still work.
Running a GPU Job
Here is an example job script to test GPU use:
% little example to test GPU computing with Matlab
gpuinfo = gpuDevice;
% example from https://blogs.mathworks.com/loren/2012/02/06/using-gpus-in-matlab/
% first do a CPU FFT calculation
A1 = rand(3000,3000);
tic;
B1 = fft(A1);
time1 = toc;
% now copy A1 to GPU and repeat FFT
A2 = gpuArray(A1);
tic;
B2 = fft(A2);
time2 = toc;
% calculate speedup of FFT (ignores overhead from data transfer time)
speedUp = time1/time2;
Copy this to a file named 'test_gpu.m'. Then connect to the cluster with the commands:
sched = parcluster('CARL');
sched.AdditionalProperties.memory='8G';
sched.AdditionalProperties.partition='mpcg.p';
sched.AdditionalProperties.ngpus=1;
The last two commands make sure that your job runs on one of the GPU nodes (in the partition mpcg.p) and uses 1 GPU (per node). Note that your job will never start if you request a partition without GPU nodes (such as the default carl.p) or if you ask for more than 2 GPUs per node. Next you can submit a job with
job = batch(sched, 'test_gpu');
which shows you the requested resources. After some time (you can also monitor the job on the cluster) you should see, for example by checking job.State, that the job is finished. Load the job data with
jd = load(job)
after which you should see:
struct with fields:
         A1: [3000×3000 double]
         A2: [1×1 gpuArray]
         B1: [3000×3000 double]
         B2: [1×1 gpuArray]
        ans: 'finished'
    gpuinfo: [1×1 parallel.gpu.CUDADevice]
         jd: [1×1 struct]
    speedUp: 0.1740
      time1: 0.1720
      time2: 0.9882
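The submit, wait and load steps above can also be chained in a single script. The following is a minimal sketch, assuming that `sched` has been configured as shown earlier and that `test_gpu.m` is in the current folder; `wait`, `diary` and `load` are the standard job functions of the Parallel Computing Toolbox, and the printed speedup will differ from run to run.

```matlab
% Sketch: assumes 'sched' was set up with parcluster('CARL') as above
job = batch(sched, 'test_gpu');   % submit the script as a batch job
wait(job);                        % block until the job has finished
disp(job.State)                   % state of the job after waiting
diary(job)                        % show the command window output of the job
jd = load(job);                   % workspace variables of the script as a struct
fprintf('speedup: %.4f\n', jd.speedUp);
```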
You will also see a few warnings related to A2 and B2 (Matlab tries to place these arrays on your local GPU, which is of course not possible). You can ignore these warnings (to get rid of them, remove A2 and B2 from memory before the end of the job script). The timing shows that the code is slower on the GPU, maybe because the array is too small.
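A variant of the test script that avoids the warnings and times the FFT more robustly could look as follows. This is only a sketch and not the script that produced the numbers above: it swaps `tic`/`toc` for `timeit` and `gputimeit`, which call the function several times and should therefore be less affected by one-time start-up costs of the first GPU call, and it copies the result back to host memory with `gather` before clearing the GPU arrays. The variable name `B2host` is made up for this example.

```matlab
% Sketch of a modified test_gpu.m (not the original script)
gpuinfo = gpuDevice;                 % select and initialize the default GPU

A1 = rand(3000,3000);
time1 = timeit(@() fft(A1));         % CPU timing from repeated calls

A2 = gpuArray(A1);                   % copy A1 to the GPU
time2 = gputimeit(@() fft(A2));      % GPU timing, waits for the GPU to finish
speedUp = time1/time2;               % still ignores the data transfer time

B2 = fft(A2);
B2host = gather(B2);                 % copy the result back to host memory
clear A2 B2                          % no gpuArray is left in the job workspace
```

With A2 and B2 cleared, `load(job)` on the client only has to restore ordinary arrays, so the warnings about these two variables should no longer appear.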
- The example uses a serial job, which is the recommended setup for GPU computing: a single Matlab process has access to a single GPU. Whether Matlab can also handle multiple GPUs in this setup has not been tested.
- In principle you could also use a pool of workers, each accessing a GPU. However, with e.g. 7 workers you would need to make sure that they all run on separate nodes so that each has access to its own GPU. Whether Matlab can share GPUs between different workers has not been tested.
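For the pool case, each worker can pick its GPU explicitly with `gpuDevice(idx)`, and `gpuDeviceCount` reports how many GPUs a worker can see. The sketch below is untested on this cluster and assumes that all workers of the pool run on the same node and see the same GPUs. How the SLURM integration distributes workers over nodes is not covered here. It uses `labindex` and `numlabs`, which are available in R2017b. If there are more workers than GPUs, several workers select the same device and have to share its memory and compute time.

```matlab
% Sketch: run inside a job that has a pool of workers.
% Assumption: all workers are on one node and see the same GPUs.
spmd
    nGpus = gpuDeviceCount;                  % GPUs visible to this worker
    assert(nGpus > 0, 'No GPU visible on this worker');
    idx = mod(labindex - 1, nGpus) + 1;      % round-robin assignment of GPUs to workers
    dev = gpuDevice(idx);                    % select that GPU on this worker
    fprintf('Worker %d of %d uses GPU %d (%s)\n', labindex, numlabs, idx, dev.Name);
end
```

A script containing this block could be submitted with a pool, for example `batch(sched, 'test_gpu_pool', 'Pool', 2)`, where the script name `test_gpu_pool` is hypothetical.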