MATLAB Distributed Computing Server


Benefits of MDCS

With MDCS, you can submit both serial and parallel jobs to one of the central HPC clusters from within your local MATLAB session. There is no need to deal with SGE, or even to log on to the HPC systems. The internal structure of the clusters is hidden from the end user; they merely act as a "black box" that is connected to your local machine and provides you with powerful computational resources. At the same time, jobs submitted via MDCS are fully integrated into SGE and have the same rights and privileges as any other SGE batch job. Thus there is no conflict between MATLAB jobs submitted via MDCS and standard SGE batch jobs.

Using MDCS for MATLAB computations on the central HPC facilities has a number of advantages:

  • Simplified workflow for those who exclusively do their numerics with MATLAB: they can do development, production of results, and post-processing within a unified environment (the MATLAB desktop)
  • A "worker" (= MATLAB session without a user interface) does not check out a "regular" MATLAB license or, more importantly, any Toolbox license, even if the worker uses functions or utilities of that Toolbox. The workers can use all Toolboxes that the client from which the job was submitted has access to, regardless of whether the client has actually checked them out.
    Considering that the University has only 200 MATLAB licenses but 224 MDCS worker licenses, the total number of MATLAB licenses available to all users in the University is effectively more than doubled by MDCS. The effect is even more pronounced for the Toolboxes: e.g., there are only 50 licenses for the Statistics Toolbox, but MDCS makes an additional 224 "effective" licenses for this Toolbox available (and analogously for all other Toolboxes).

    To allow for a fair sharing of resources, the number of worker licenses a single user can check out at any one time has been limited to 36. (Compare this with the situation before MDCS was introduced: how often could one user get access to, say, 36 Statistics Toolbox or Signal Processing Toolbox licenses at a time?)
  • Calculations can use more than 12 workers, giving a significant speed-up with parfor loops and similar tools

Therefore, all MATLAB computations on the clusters should by default be done via MDCS. Of course, it is still possible to submit MATLAB jobs as regular SGE jobs (by writing the command that starts MATLAB in non-interactive mode into the submission script and supplying the necessary input arguments and files), but this strategy is deprecated. Because the number of available MATLAB and Toolbox licenses is limited, it leads to strong competition between HPC users and other MATLAB users across the University. Moreover, such cluster jobs often fail immediately after they start to run because no free licenses are available (a reliable, full license integration with SGE is non-trivial to implement). With MDCS, this license availability problem does not exist: SGE keeps track of the total number of workers currently in use, and if the required resources are not available, the job simply stays in the queue like any other batch job.

Prerequisites

  1. On your local machine (PC, workstation, notebook, ...), you must have one of the following MATLAB releases installed:
    R2010b
    R2011a
    R2011b
    Unfortunately, more recent releases of MDCS are currently not available. It is important that your local installation includes all toolboxes, in particular the Parallel Computing Toolbox (PCT). It is recommended to install MATLAB from the media provided by the IT Services on their website. In that case, all required components (including the PCT) are automatically installed.
  2. You must install a couple of files into the directory
    matlabroot/toolbox/local
    (on a Unix/Linux system, analogously for Windows clients)
  3. After these preparations, start a new MATLAB session and bring up the Parallel Configurations Manager by selecting Parallel -> Manage Configurations. Create a new "Generic Scheduler" Configuration (File -> New -> Generic):

    [Screenshot: GenericSchedulerConfig.jpg]

    The following fields and boxes must be filled out:

  • Name



After these preliminaries, you are prepared for exploring the rich possibilities of distributed and parallel computing with MATLAB!

Basic MDCS usage

A typical example of an "embarrassingly parallel" problem (a "task-parallel" job in MATLAB terminology) is a parameter sweep of a 2nd-order ODE (damped harmonic oscillator).

The ODE is defined in odesystem.m.

The parameter sweep is implemented in paramSweep_batch.m. The independent loop iterations are automatically distributed across a pool of 15 workers; together with the worker that runs the script itself, the job occupies 16 workers:

 job = batch('paramSweep_batch', 'matlabpool', 15, 'FileDependencies', {'odesystem.m'});
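The contents of paramSweep_batch.m and odesystem.m are not reproduced on this page. As a rough, hypothetical sketch of what such a sweep could look like: the mass, the parameter ranges, the time span, and the initial conditions below are assumptions chosen for illustration, and an anonymous function stands in for odesystem.m so that the sketch is self-contained. Only the field names bVals, kVals, and peakVals of jobData are taken from the visualization commands on this page.

```matlab
% Hypothetical sketch of a parameter-sweep script (NOT the original file).
% Model: damped harmonic oscillator  m*x'' + b*x' + k*x = 0.
m     = 5;                        % mass (assumed value)
bVals = linspace(0.1, 5, 15);     % damping values (assumed range)
kVals = linspace(1.5, 5, 20);     % stiffness values (assumed range)

% Rows follow kVals, columns follow bVals, as expected by
% surf(bVals, kVals, peakVals).
[bGrid, kGrid] = meshgrid(bVals, kVals);
peakVals = nan(size(bGrid));

t0 = tic;
parfor idx = 1:numel(bGrid)
    % Each iteration is independent of all others, so the iterations
    % can be handed out to different workers.
    b = bGrid(idx);
    k = kGrid(idx);
    % First-order form: y(1) = x, y(2) = x'
    rhs = @(t, y) [y(2); -(b*y(2) + k*y(1))/m];
    [~, y] = ode45(rhs, [0 25], [0; 1]);   % x(0) = 0, x'(0) = 1 (assumed)
    peakVals(idx) = max(y(:, 1));          % peak displacement
end
t1 = toc(t0);                     % elapsed time of the sweep in seconds

% Collect everything the client needs for post-processing in one struct.
jobData.bVals    = bVals;
jobData.kVals    = kVals;
jobData.peakVals = peakVals;
```

Without an open pool of workers, the parfor loop in this sketch simply runs on the local machine, which makes it easy to try out before submitting it with batch.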

Check the state in the Job Monitor or in the command window:

 job.State

Analyze the results once the job has finished:

 wait(job);
 load(job);

Runtime:

 job.t1

Visualization:

 figure;
 f=surf(jobData.bVals, jobData.kVals, jobData.peakVals);
 set(f,'LineStyle','none');
 set(f,'FaceAlpha',0.5);
 xlabel('Damping'); ylabel('Stiffness'); zlabel('Peak Displacement');
 view(50, 30);

Clean up:

 delete(job);


All files can be downloaded from ...

Advanced usage: Specifying resources

Old (non-MDCS) MATLAB usage

To submit a MATLAB job, you must first load the environment module in your submission script:

module load matlab

This automatically loads the newest version, if several versions are installed. After that, invoke MATLAB in batch mode:

matlab -nosplash -nodesktop -r mymatlab_input

where mymatlab_input.m (a so-called ".m-file") is an ASCII text file containing the sequence of MATLAB commands that you would normally enter in an interactive session.
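As an illustration only, an input file for such a non-interactive run might look like the following sketch. The computation and the output file name result.mat are placeholders and are not taken from this page; the point is merely that results have to be written to disk and that errors should be caught, since nobody is watching the session.

```matlab
% mymatlab_input.m -- hypothetical example of a non-interactive input script.
% The workload and the output file name are placeholders.
status = 0;
try
    A = rand(200);                 % placeholder workload
    s = svd(A);                    % singular values of A
    save('result.mat', 's');       % write results to the working directory
catch err
    % Report the problem on stderr instead of leaving the session
    % stuck in an error state.
    fprintf(2, 'mymatlab_input failed: %s\n', err.message);
    status = 1;
end
```

Because errors are caught inside the script, an invocation such as matlab -nosplash -nodesktop -r "mymatlab_input; exit" reaches the trailing exit in either case. Without an exit, MATLAB stays at its prompt after the last command instead of terminating.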

Slides and links from the last MATLAB workshop at the University of Oldenburg (19.02.2013)

Slides

Links


Documentation