HPC Tutorial Part3: Examples
Distributed memory example: Intel MPI
As an example that builds upon the distributed memory paradigm, consider the example code in the directory estimPi_intelMpi. To compile the program and to submit the computing request we can, e.g., use the Intel MPI library. To this end, we need to customize our user environment by loading the proper module via
module load intel/impi
Note that this call of the module command loads the default version of the corresponding library, which need not be the most recent one installed. To be more precise, given the current configuration of the system, which hosts the following list of intel/impi modules
---------------------- /cm/shared/modulefiles -----------------------
intel/impi/32/4.0.1.007  intel/impi/4.1.0.024/64
intel/impi/32/4.1.0.024  intel/impi/4.1.1.036/32
intel/impi/4.0.1.007/32  intel/impi/4.1.1.036/64
intel/impi/4.0.1.007/64  intel/impi/64/4.0.1.007
intel/impi/4.1.0.024/32  intel/impi/64/4.1.0.024
the above call will be equivalent to
module load intel/impi/64/4.1.0.024
Once the proper module is loaded, the program main_pi_openMpi.c can be compiled by means of
mpicc main_pi_openMpi.c -o main_pi_openMpi -lm
Further, the computing request can be submitted using the submission script submissionScript_pi_impi.sge:
qsub submissionScript_pi_impi.sge
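The submission script itself is not reproduced in this tutorial. As a rough orientation, a minimal sketch of what such a script might contain is given below. The job name (estimPi_mpi), the parallel environment (impi41) and the number of slots (3) are taken from the accounting record shown further down; everything else is an assumption, in particular the file name sketch_pi_impi.sge, the choice of directives, and the way mpirun is invoked. How mpirun learns about the allocated hosts depends on how the parallel environment is configured on the system.

```shell
#!/bin/sh
# Write a hypothetical submission script to a new file. The name
# sketch_pi_impi.sge is an assumption, chosen so that the tutorial's own
# submissionScript_pi_impi.sge is not overwritten.
cat > sketch_pi_impi.sge <<'EOF'
#!/bin/bash
# Interpret the job script with bash and run it in the submission directory.
#$ -S /bin/bash
#$ -cwd
# Job name; it determines the name of the log-file (estimPi_mpi.o<job-ID>).
#$ -N estimPi_mpi
# Parallel environment and number of slots (values from the qacct record).
#$ -pe impi41 3

# Load the same module that was used for compiling the program.
module load intel/impi/64/4.1.0.024

# NSLOTS is set by the scheduler to the number of granted slots.
mpirun -np "$NSLOTS" ./main_pi_openMpi
EOF
```

The quoted heredoc delimiter ('EOF') keeps $NSLOTS unexpanded, so the variable is evaluated only when the job runs. The resulting file could then be submitted with qsub sketch_pi_impi.sge.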
(NOTE: Be aware that the submission script needs to load the same intel/impi module that was used for compiling the program.)
After the job starts to run, its status can be monitored via the qstat command. In this regard, the command-line query
qstat -g t
yields a precise list of the execution hosts on which slots for the job were allocated:
job-ID  prior   name       user         state submit/start at     queue                  master ja-task-ID
----------------------------------------------------------------------------------------------------------
1104533 0.50549 estimPi_mp alxo9476     r     11/25/2013 15:01:48 mpc_std_shrt.q@mpcs004 MASTER
                                                                  mpc_std_shrt.q@mpcs004 SLAVE
1104533 0.50549 estimPi_mp alxo9476     r     11/25/2013 15:01:48 mpc_std_shrt.q@mpcs006 SLAVE
1104533 0.50549 estimPi_mp alxo9476     r     11/25/2013 15:01:48 mpc_std_shrt.q@mpcs010 SLAVE
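For jobs that span many hosts, such a listing can be condensed into a count of slots per execution host. The following is only a sketch: slots_per_host is a hypothetical helper name (not part of the scheduler), and it assumes that every queue instance appears in the form queue@host, as in the listing above.

```shell
#!/bin/sh
# Count the slots per execution host in "qstat -g t" output read from stdin.
# slots_per_host is a hypothetical helper, not an SGE command.
slots_per_host() {
    awk '
        {
            # Every queue instance has the form <queue>@<host>.
            for (i = 1; i <= NF; i++) {
                if ($i ~ /@/) {
                    split($i, part, "@")
                    count[part[2]]++
                }
            }
        }
        END { for (host in count) print host, count[host] }
    ' | sort
}
```

A possible call is qstat -g t | slots_per_host. Note that the counts are summed over all jobs that appear in the qstat output, so the result refers to a single job only if no other job of the user is listed.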
After successful termination of the job, the log-file estimPi_mpi.o1104533 contains the result of the simulation:
> cat estimPi_mpi.o1104533
number of processes = 3
number of trials = 100000
pi_estim = 3.151800
(NOTE: Clarification of the content of the remaining logging files can be found here)
After the job has been completed, its history can be reconstructed by filtering the accounting file by means of the query
> qacct -j 1104533
==============================================================
qname        mpc_std_shrt.q
hostname     mpcs004.mpinet.cluster
group        ifp
owner        alxo9476
project      NONE
department   defaultdepartment
jobname      estimPi_mpi
jobnumber    1104533
taskid       undefined
account      sge
priority     0
qsub_time    Mon Nov 25 15:01:10 2013
start_time   Mon Nov 25 15:01:49 2013
end_time     Mon Nov 25 15:01:52 2013
granted_pe   impi41
slots        3
failed       0
exit_status  0
ru_wallclock 3
ru_utime     0.137
ru_stime     0.189
ru_maxrss    13400
ru_ixrss     0
ru_ismrss    0
ru_idrss     0
ru_isrss     0
ru_minflt    32183
ru_majflt    83
ru_nswap     0
ru_inblock   0
ru_oublock   0
ru_msgsnd    0
ru_msgrcv    0
ru_nsignals  0
ru_nvcsw     4711
ru_nivcsw    308
cpu          0.326
mem          0.001
io           0.001
iow          0.000
maxvmem      404.328M
arid         undefined
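Often only one or two entries of such an accounting record are of interest. A single field can be extracted with a small filter like the one sketched below. The helper name qacct_field is hypothetical; the sketch assumes the "name value" layout of the record above and returns only the first match, so for jobs with several records (e.g. array jobs) it reports the first record only.

```shell
#!/bin/sh
# Print the value of one field of a qacct record read from stdin.
# qacct_field is a hypothetical helper, not an SGE command.
qacct_field() {
    awk -v key="$1" '
        $1 == key {
            sub(/^[ \t]*[^ \t]+[ \t]+/, "")   # drop the field name
            sub(/[ \t]+$/, "")                # drop trailing padding
            print
            exit
        }
    '
}
```

A possible call is qacct -j 1104533 | qacct_field ru_wallclock, which for the record above prints 3. Values that contain blanks, such as qsub_time, are printed in full.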
Shared memory example: OpenMP
For the shared memory example an ordinary compiler such as gcc will do. To make sure that an up-to-date version of the compiler is loaded, you might issue the command
module load gcc
Currently this is equivalent to calling
module load gcc/4.7.1
After this, the program can be compiled via
gcc -fopenmp main_pi_openMp.c -o main_pi_openMp
and submitted using
qsub submissionScript_estimPi_openMp.sge
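The content of this submission script is not shown here either. For a shared memory job it might look roughly like the sketch below. The job name (estimPi_openmp), the parallel environment (smp) and the number of slots (3) are taken from the accounting record of the job; the file name sketch_estimPi_openMp.sge, the remaining directives, the loading of the gcc module inside the job, and the use of OMP_NUM_THREADS are assumptions. In particular, it is not known from this tutorial whether the program reads OMP_NUM_THREADS or sets its number of threads in another way.

```shell
#!/bin/sh
# Write a hypothetical submission script for the OpenMP job to a new file.
# The name sketch_estimPi_openMp.sge is an assumption.
cat > sketch_estimPi_openMp.sge <<'EOF'
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -N estimPi_openmp
# All slots of the smp parallel environment reside on a single host.
#$ -pe smp 3

# Assumption: load the compiler module so that the matching OpenMP
# runtime library is found when the program starts.
module load gcc/4.7.1

# Use as many threads as slots were granted by the scheduler.
export OMP_NUM_THREADS="$NSLOTS"
./main_pi_openMp
EOF
```

Tying the number of threads to NSLOTS avoids starting more threads than slots were requested from the scheduler.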
After successful termination of the job, the log-file contains the result of the simulation:
> cat estimPi_openmp.o1104535
number of threads = 3
number of trials = 100000
pi_estim = 3.14396
Again, using the command qacct the history of the job can be reconstructed:
> qacct -j 1104535
==============================================================
qname        mpc_std_shrt.q
hostname     mpcs007.mpinet.cluster
group        ifp
owner        alxo9476
project      NONE
department   defaultdepartment
jobname      estimPi_openmp
jobnumber    1104535
taskid       undefined
account      sge
priority     0
qsub_time    Mon Nov 25 15:23:44 2013
start_time   Mon Nov 25 15:45:22 2013
end_time     Mon Nov 25 15:45:23 2013
granted_pe   smp
slots        3
failed       0
exit_status  0
ru_wallclock 1
ru_utime     0.052
ru_stime     0.031
ru_maxrss    4156
ru_ixrss     0
ru_ismrss    0
ru_idrss     0
ru_isrss     0
ru_minflt    8651
ru_majflt    13
ru_nswap     0
ru_inblock   0
ru_oublock   0
ru_msgsnd    0
ru_msgrcv    0
ru_nsignals  0
ru_nvcsw     564
ru_nivcsw    50
cpu          0.083
mem          0.000
io           0.000
iow          0.000
maxvmem      180.750M