HPC Tutorial Part3: Examples


Distributed memory example: Intel MPI

As an example that builds on the distributed memory paradigm, consider the example code in the directory estimPi_intelMpi. To compile the program and to submit the computing request we might, e.g., use the Intel MPI library. For this we need to customize our user environment by loading the proper module via

 module load intel/impi

Note that this call of the module command will, of course, load the most recent version of the corresponding library. To be more precise, given the current configuration of the system, which hosts the following list of intel/impi modules,

 
---------------------- /cm/shared/modulefiles -----------------------
intel/impi/32/4.0.1.007 intel/impi/4.1.0.024/64
intel/impi/32/4.1.0.024 intel/impi/4.1.1.036/32
intel/impi/4.0.1.007/32 intel/impi/4.1.1.036/64
intel/impi/4.0.1.007/64 intel/impi/64/4.0.1.007
intel/impi/4.1.0.024/32 intel/impi/64/4.1.0.024
  

the above call will be equivalent to

 module load intel/impi/64/4.1.0.024

Once the proper module is loaded, the program main_pi_openMpi.c can be compiled by means of

  mpicc main_pi_openMpi.c -o main_pi_openMpi -lm
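
The source file main_pi_openMpi.c ships with the example directory and is not reproduced here. Purely as an illustration of the distributed memory paradigm, a minimal sketch of what a Monte-Carlo estimate of pi with MPI might look like is given below; the number of trials is taken from the example log file shown further down, whereas the seeding, the variable names (nTrials, localHits, globalHits) and the combination of the per-process counts via MPI_Reduce are assumptions of this sketch, not necessarily the original implementation.

 /* Minimal sketch of a Monte-Carlo estimate of pi using MPI.
  * NOT the original main_pi_openMpi.c, only an illustration of the
  * distributed memory paradigm: each process counts hits inside the
  * quarter circle and the counts are combined on rank 0 via MPI_Reduce. */
 #include <stdio.h>
 #include <stdlib.h>
 #include <mpi.h>
 
 int main(int argc, char *argv[])
 {
     const long nTrials = 100000;     /* total number of trials, as in the example log output */
     long i, localHits = 0, globalHits = 0;
     int rank, nProcs;
 
     MPI_Init(&argc, &argv);
     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
     MPI_Comm_size(MPI_COMM_WORLD, &nProcs);
 
     srand(1234 + rank);              /* independent seed per process (assumed) */
 
     /* each process works on its own share of the trials */
     for (i = rank; i < nTrials; i += nProcs) {
         double x = (double)rand() / RAND_MAX;
         double y = (double)rand() / RAND_MAX;
         if (x * x + y * y <= 1.0)
             localHits++;
     }
 
     /* combine the per-process counts on the master process */
     MPI_Reduce(&localHits, &globalHits, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
 
     if (rank == 0) {
         printf("number of processes = %d\n", nProcs);
         printf("number of trials = %ld\n", nTrials);
         printf("pi_estim = %f\n", 4.0 * (double)globalHits / (double)nTrials);
     }
 
     MPI_Finalize();
     return 0;
 }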

Further, the computing request can be submitted using the submission script submissionScript_pi_impi.sge:

  qsub submissionScript_pi_impi.sge 

(NOTE: Be aware that the submission script needs to load the same intel/impi module that was used for compiling the program.) Once the job starts to run, its status can be monitored via the qstat command. In this case, the command-line query

 qstat -g t

yields a precise list of the execution hosts on which slots for the job were allocated:

 
job-ID  prior   name       user         state submit/start at     queue                  master ja-task-ID 
----------------------------------------------------------------------------------------------------------
1104533 0.50549 estimPi_mp alxo9476     r     11/25/2013 15:01:48 mpc_std_shrt.q@mpcs004 MASTER        
                                                                  mpc_std_shrt.q@mpcs004 SLAVE         
1104533 0.50549 estimPi_mp alxo9476     r     11/25/2013 15:01:48 mpc_std_shrt.q@mpcs006 SLAVE         
1104533 0.50549 estimPi_mp alxo9476     r     11/25/2013 15:01:48 mpc_std_shrt.q@mpcs010 SLAVE         
  

After successful termination of the job, the log-file estimPi_mpi.o1104533 contains the result of the simulation:

 
> cat estimPi_mpi.o1104533
number of processes = 3
number of trials = 100000
pi_estim = 3.151800
  

(NOTE: A clarification of the content of the remaining log files can be found here)

After the job has completed, its history can be reconstructed by filtering the accounting file via the query

 
> qacct -j 1104533
==============================================================
qname        mpc_std_shrt.q      
hostname     mpcs004.mpinet.cluster
group        ifp                 
owner        alxo9476            
project      NONE                
department   defaultdepartment   
jobname      estimPi_mpi         
jobnumber    1104533             
taskid       undefined
account      sge                 
priority     0                   
qsub_time    Mon Nov 25 15:01:10 2013
start_time   Mon Nov 25 15:01:49 2013
end_time     Mon Nov 25 15:01:52 2013
granted_pe   impi41              
slots        3                   
failed       0    
exit_status  0                   
ru_wallclock 3            
ru_utime     0.137        
ru_stime     0.189        
ru_maxrss    13400               
ru_ixrss     0                   
ru_ismrss    0                   
ru_idrss     0                   
ru_isrss     0                   
ru_minflt    32183               
ru_majflt    83                  
ru_nswap     0                   
ru_inblock   0                   
ru_oublock   0                   
ru_msgsnd    0                   
ru_msgrcv    0                   
ru_nsignals  0                   
ru_nvcsw     4711                
ru_nivcsw    308                 
cpu          0.326        
mem          0.001             
io           0.001             
iow          0.000             
maxvmem      404.328M
arid         undefined
  

Shared memory example: OpenMP

For the shared memory example an ordinary compiler such as gcc will do. To make sure an up-to-date version of the compiler is loaded, you might issue the command

 module load gcc

Currently this is equivalent to calling

 module load gcc/4.7.1

After this, the program can be compiled via

 gcc -fopenmp main_pi_openMp.c -o main_pi_openMp

and submitted using

 qsub submissionScript_estimPi_openMp.sge
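
As before, the source file main_pi_openMp.c ships with the example directory and is not listed here. The following minimal sketch only illustrates how such a shared memory version might look with OpenMP; the trial count is again taken from the example log file below, while the per-thread seeding via rand_r, the reduction clause and the variable names (nTrials, hits) are assumptions of this sketch rather than the original code.

 /* Minimal sketch of a Monte-Carlo estimate of pi using OpenMP.
  * NOT the original main_pi_openMp.c, only an illustration of the
  * shared memory paradigm: the trial loop is distributed over the
  * threads and the hit counter is combined via a reduction clause. */
 #include <stdio.h>
 #include <stdlib.h>
 #include <omp.h>
 
 int main(void)
 {
     const long nTrials = 100000;   /* total number of trials, as in the example log output */
     long i, hits = 0;
 
     #pragma omp parallel
     {
         /* per-thread seed for the reentrant generator (assumed) */
         unsigned int seed = 1234u + omp_get_thread_num();
 
         #pragma omp single
         printf("number of threads = %d\n", omp_get_num_threads());
 
         /* distribute the trials over the threads; each thread accumulates
          * a private copy of hits which is summed up at the end of the loop */
         #pragma omp for reduction(+:hits)
         for (i = 0; i < nTrials; i++) {
             double x = (double)rand_r(&seed) / RAND_MAX;
             double y = (double)rand_r(&seed) / RAND_MAX;
             if (x * x + y * y <= 1.0)
                 hits++;
         }
     }
 
     printf("number of trials = %ld\n", nTrials);
     printf("pi_estim = %g\n", 4.0 * (double)hits / (double)nTrials);
 
     return 0;
 }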

After successful termination of the job, the log-file contains the result of the simulation:

 
> cat estimPi_openmp.o1104535 
number of threads = 3
number of trials = 100000
pi_estim = 3.14396
  

Again, the history of the job can be reconstructed using the qacct command:

 
> qacct -j 1104535
==============================================================
qname        mpc_std_shrt.q      
hostname     mpcs007.mpinet.cluster
group        ifp                 
owner        alxo9476            
project      NONE                
department   defaultdepartment   
jobname      estimPi_openmp      
jobnumber    1104535             
taskid       undefined
account      sge                 
priority     0                   
qsub_time    Mon Nov 25 15:23:44 2013
start_time   Mon Nov 25 15:45:22 2013
end_time     Mon Nov 25 15:45:23 2013
granted_pe   smp                 
slots        3                   
failed       0    
exit_status  0                   
ru_wallclock 1            
ru_utime     0.052        
ru_stime     0.031        
ru_maxrss    4156                
ru_ixrss     0                   
ru_ismrss    0                   
ru_idrss     0                   
ru_isrss     0                   
ru_minflt    8651                
ru_majflt    13                  
ru_nswap     0                   
ru_inblock   0                   
ru_oublock   0                   
ru_msgsnd    0                   
ru_msgrcv    0                   
ru_nsignals  0                   
ru_nvcsw     564                 
ru_nivcsw    50                  
cpu          0.083        
mem          0.000             
io           0.000             
iow          0.000             
maxvmem      180.750M