R 2016

Introduction

R is a free software environment for statistical computing and graphics.

Using R on the HPC cluster

To use R on the HPC cluster, you first have to load its module with the command

module load R

Since only one version of R is installed, you don't need to specify a version. If you use the command

module spider R

you will find more information about the module.

Usage of R and MPI

For parallelization, the packages doMPI and Rmpi are installed. To launch a parallel R script inside an SGE job script, please use the command line

  mpirun -np $NSLOTS R --slave -f SCRIPTNAME SCRIPT_CMDLINE_OPTIONS

to enable SGE to control all processes of your script. Please do not use the batch starting sequence R CMD BATCH.

The corresponding parallel environment in the SGE submission script is specified by

 #$ -pe impi NUMBER_OF_CORES
 #$ -R y
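
Put together, the pieces above could be combined as in the following minimal sketch, which writes a complete submission script to a file. The file names r_mpi_job.sh and my_script.R, the job name, the core count of 8, and the -N and -cwd options are assumptions chosen for illustration and are not prescribed by the cluster setup.

```shell
# Sketch only: write a hypothetical SGE submission script to r_mpi_job.sh.
# The quoted 'EOF' keeps $NSLOTS literal so that SGE fills it in at run time.
cat > r_mpi_job.sh <<'EOF'
#!/bin/bash
# Job name and working directory (assumed, adjust as needed)
#$ -N r_mpi_example
#$ -cwd
# Parallel environment and resource reservation as described above
#$ -pe impi 8
#$ -R y

# Make R (with doMPI and Rmpi) available inside the job
module load R

# Start one R process per granted slot; my_script.R is a placeholder name
mpirun -np $NSLOTS R --slave -f my_script.R
EOF
```

The resulting file could then be submitted with qsub r_mpi_job.sh.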

Usage of NetCDF and R

A package for NetCDF has been installed together with R. In order to use it, please add the command

module load netCDF

to your job script before starting R. Your R script should include the line

library(ncdf)

to load the NetCDF library. Please refer to the documentation of NetCDF and R for more information.
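
As an illustration of the required order of the commands, a serial job script might look like the sketch below. The file names r_netcdf_job.sh and read_netcdf.R, the job name, and the -N and -cwd options are assumptions made for this example; read_netcdf.R stands for your own R script containing the library call shown above.

```shell
# Sketch only: write a hypothetical serial job script to r_netcdf_job.sh.
cat > r_netcdf_job.sh <<'EOF'
#!/bin/bash
# Job name and working directory (assumed, adjust as needed)
#$ -N r_netcdf_example
#$ -cwd

# Load the NetCDF module before R is started
module load netCDF
module load R

# Run the R script; read_netcdf.R is a placeholder name
R --slave -f read_netcdf.R
EOF
```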

Installed version

The currently installed version of R is 3.3.1.

Additional installed packages

The R installation contains many additional packages. After loading the module and starting R ("module load R" and then simply "R" on the command line), you can generate a list of all of them with the following commands:

ip <- as.data.frame(installed.packages()[,c(1,3:4)])
rownames(ip) <- NULL
ip <- ip[is.na(ip$Priority),1:2,drop=FALSE]
print(ip, row.names=FALSE)

You will receive a list of every package and its related version. It should look like this:

       Package     Version
           abc         2.1
      abc.data         1.0
         abind       1.4-3
       acepack     1.3-3.3
        adabag         4.1
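
If you prefer to save the list to a file without starting an interactive session, the same commands can be run from the shell. The following is a minimal sketch under the assumption that the R module has already been loaded; the file names list_packages.R and installed_packages.txt are chosen for illustration only.

```shell
# Sketch only: store the R commands from above in a small script file.
# The quoted 'EOF' prevents the shell from expanding $Priority.
cat > list_packages.R <<'EOF'
ip <- as.data.frame(installed.packages()[,c(1,3:4)])
rownames(ip) <- NULL
ip <- ip[is.na(ip$Priority),1:2,drop=FALSE]
print(ip, row.names=FALSE)
EOF

# Run the script only if R is on the PATH (i.e. the module is loaded)
if command -v R >/dev/null 2>&1; then
    R --slave -f list_packages.R > installed_packages.txt
fi
```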

Documentation

You can look up anything about R on their