HPC Tutorial Part3: Examples

Distributed memory example: Intel MPI

As an example that builds upon the distributed memory paradigm, consider the example code in the directory estimPi_intelMpi. To compile the program and to submit the computing request we might, e.g., use the Intel MPI library. For this we need to customize our user environment by loading the proper module via

 module load intel/impi

Note that this call of the module command will of course load the most recent version of the corresponding library. To be more precise, given that the current configuration of the system hosts the following list of intel/impi modules

 
---------------------- /cm/shared/modulefiles -----------------------
intel/impi/32/4.0.1.007 intel/impi/4.1.0.024/64
intel/impi/32/4.1.0.024 intel/impi/4.1.1.036/32
intel/impi/4.0.1.007/32 intel/impi/4.1.1.036/64
intel/impi/4.0.1.007/64 intel/impi/64/4.0.1.007
intel/impi/4.1.0.024/32 intel/impi/64/4.1.0.024
  

the above call is equivalent to

 module load intel/impi/64/4.1.0.024
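
Which version has actually been loaded can be verified by inspecting the list of currently loaded modules:

 module list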

Once the proper module is loaded, the program main_pi_openMpi.c can be compiled by means of

  mpicc main_pi_openMpi.c -o main_pi_openMpi -lm
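
The source of main_pi_openMpi.c is not reproduced in this tutorial. For orientation, the following is a rough, hypothetical sketch of a dart-throwing Monte Carlo estimator of pi that would produce output of the form shown further below; all variable names, the seeding scheme, and the handling of the remainder trials are assumptions for illustration, not the actual implementation:

 /*
  * Hypothetical sketch of main_pi_openMpi.c: a dart-throwing Monte
  * Carlo estimator of pi, parallelized with MPI. All names and the
  * seeding scheme are assumptions for illustration purposes.
  */
 #include <stdio.h>
 #include <stdlib.h>
 #include <mpi.h>
 
 int main(int argc, char *argv[])
 {
     const long n_trials = 100000;   /* total number of darts, cf. the log file */
     long i, n_local, n_hit_local = 0, n_hit_total = 0;
     int rank, n_procs;
 
     MPI_Init(&argc, &argv);
     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
     MPI_Comm_size(MPI_COMM_WORLD, &n_procs);
 
     /* distribute the trials among the processes; the last rank
        picks up the remainder */
     n_local = n_trials / n_procs;
     if (rank == n_procs - 1)
         n_local += n_trials % n_procs;
 
     /* each process draws its random points with an individual seed */
     srand(1234 + rank);
     for (i = 0; i < n_local; i++) {
         double x = (double)rand() / RAND_MAX;
         double y = (double)rand() / RAND_MAX;
         if (x * x + y * y <= 1.0)
             n_hit_local++;
     }
 
     /* accumulate the per-process hit counts on the master process */
     MPI_Reduce(&n_hit_local, &n_hit_total, 1, MPI_LONG, MPI_SUM,
                0, MPI_COMM_WORLD);
 
     if (rank == 0) {
         double pi_estim = 4.0 * (double)n_hit_total / (double)n_trials;
         printf("number of processes = %d\n", n_procs);
         printf("number of trials = %ld\n", n_trials);
         printf("pi_estim = %f\n", pi_estim);
     }
 
     MPI_Finalize();
     return 0;
 }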

Further, the computing request might be submitted using the submission script submissionScript_pi_impi.sge:

  qsub submissionScript_pi_impi.sge 
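
The content of submissionScript_pi_impi.sge is likewise not reproduced in this tutorial. As a minimal sketch, such an SGE submission script might read as follows; the job name, the parallel environment impi41, and the slot count 3 are taken from the qstat and qacct output below, while all remaining details are assumptions:

 #!/bin/bash
 #$ -N estimPi_mpi     # job name, as it appears in the qstat/qacct output
 #$ -cwd               # execute the job in the current working directory
 #$ -pe impi41 3       # parallel environment and slot count, cf. granted_pe and slots below
 
 # load the same intel/impi module that was used for compiling the program
 module load intel/impi/64/4.1.0.024
 
 mpirun -np $NSLOTS ./main_pi_openMpi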

(NOTE: Be aware that the submission script needs to load the same intel/impi module that was used for compiling the program.) After the job starts to run, its status can be monitored via qstat. In this regard, the command-line query

 qstat -g t

yields a precise list of the execution hosts on which slots for the job were allocated:

 
job-ID  prior   name       user         state submit/start at     queue                  master ja-task-ID 
----------------------------------------------------------------------------------------------------------
1104533 0.50549 estimPi_mp alxo9476     r     11/25/2013 15:01:48 mpc_std_shrt.q@mpcs004 MASTER        
                                                                  mpc_std_shrt.q@mpcs004 SLAVE         
1104533 0.50549 estimPi_mp alxo9476     r     11/25/2013 15:01:48 mpc_std_shrt.q@mpcs006 SLAVE         
1104533 0.50549 estimPi_mp alxo9476     r     11/25/2013 15:01:48 mpc_std_shrt.q@mpcs010 SLAVE         
  

After successful termination of the job, the log-file estimPi_mpi.o1104533 contains the result of the simulation:

 
> cat estimPi_mpi.o1104533
number of processes = 3
number of trials = 100000
pi_estim = 3.151800
  

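Assuming the standard dart-board estimator pi_estim = 4 * N_hit / N_trials, the reported value corresponds to N_hit = 78795 of the N_trials = 100000 random points having landed inside the quarter circle, since 4 * 78795 / 100000 = 3.15180.
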
(NOTE: A clarification of the content of the remaining log files can be found [[ |here]].)

After the job has been completed, its history can be reconstructed by filtering the accounting file by means of the query

 
> qacct -j 1104533
==============================================================
qname        mpc_std_shrt.q      
hostname     mpcs004.mpinet.cluster
group        ifp                 
owner        alxo9476            
project      NONE                
department   defaultdepartment   
jobname      estimPi_mpi         
jobnumber    1104533             
taskid       undefined
account      sge                 
priority     0                   
qsub_time    Mon Nov 25 15:01:10 2013
start_time   Mon Nov 25 15:01:49 2013
end_time     Mon Nov 25 15:01:52 2013
granted_pe   impi41              
slots        3                   
failed       0    
exit_status  0                   
ru_wallclock 3            
ru_utime     0.137        
ru_stime     0.189        
ru_maxrss    13400               
ru_ixrss     0                   
ru_ismrss    0                   
ru_idrss     0                   
ru_isrss     0                   
ru_minflt    32183               
ru_majflt    83                  
ru_nswap     0                   
ru_inblock   0                   
ru_oublock   0                   
ru_msgsnd    0                   
ru_msgrcv    0                   
ru_nsignals  0                   
ru_nvcsw     4711                
ru_nivcsw    308                 
cpu          0.326        
mem          0.001             
io           0.001             
iow          0.000             
maxvmem      404.328M
arid         undefined
  

Shared memory example: OpenMP