Difference between revisions of "HPC Tutorial Part3: Examples"

From HPC users
 
Revision as of 16:15, 25 November 2013

Distributed memory example: Intel MPI

As an example that builds upon the distributed memory paradigm, consider the example code in directory estimPi_intelMpi. To compile the program and to submit the computing request we might, e.g., use the Intel MPI library. To do so, we need to customize our user environment by loading the proper module via

 module load intel/impi

Note that calling the module command without a version loads the default version of the corresponding library, which need not be the release with the highest version number. To be more precise, given the current configuration of the system, which hosts the following list of intel/impi modules

 
---------------------- /cm/shared/modulefiles -----------------------
intel/impi/32/4.0.1.007 intel/impi/4.1.0.024/64
intel/impi/32/4.1.0.024 intel/impi/4.1.1.036/32
intel/impi/4.0.1.007/32 intel/impi/4.1.1.036/64
intel/impi/4.0.1.007/64 intel/impi/64/4.0.1.007
intel/impi/4.1.0.024/32 intel/impi/64/4.1.0.024
  

the above call will be equivalent to

 module load intel/impi/64/4.1.0.024

Once the proper module is loaded, the program main_pi_openMpi.c can be compiled by means of

  mpicc main_pi_openMpi.c -o main_pi_openMpi -lm

Further, the computing request can be submitted using the submission script submissionScript_pi_impi.sge:

  qsub submissionScript_pi_impi.sge 
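The submission script itself is not reproduced on this page. As a rough orientation, a script of this kind might look like the following sketch. Everything in it is an assumption made for illustration: the job name, the parallel environment impi41 and the slot count are chosen to match the accounting record of the example job, and the actual script in estimPi_intelMpi may differ.

```shell
#!/bin/bash
# Hypothetical sketch of submissionScript_pi_impi.sge (not the original file).

# Interpret the job script with bash.
#$ -S /bin/bash
# Run the job in the directory from which it was submitted.
#$ -cwd
# Job name, as shown later by qstat and qacct (assumed value).
#$ -N estimPi_mpi
# Merge stderr into the stdout log file <jobname>.o<jobid>.
#$ -j y
# Parallel environment and number of slots (assumed values).
#$ -pe impi41 3

# Load the same Intel MPI module that was used for compiling.
module load intel/impi

# NSLOTS is set by the scheduler to the number of granted slots.
mpirun -np "$NSLOTS" ./main_pi_openMpi
```

The comments are kept on separate lines from the #$ directives so that they cannot be mistaken for additional scheduler options.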

(NOTE: Be aware that the submission script needs to load the same intel/impi module that was used for compiling the program.)

After the job starts to run, its status can be monitored via the qstat command. In this regard, the command-line query

 qstat -g t

yields a precise list of the execution hosts on which slots for the job were allocated:

 
job-ID  prior   name       user         state submit/start at     queue                  master ja-task-ID 
----------------------------------------------------------------------------------------------------------
1104533 0.50549 estimPi_mp alxo9476     r     11/25/2013 15:01:48 mpc_std_shrt.q@mpcs004 MASTER        
                                                                  mpc_std_shrt.q@mpcs004 SLAVE         
1104533 0.50549 estimPi_mp alxo9476     r     11/25/2013 15:01:48 mpc_std_shrt.q@mpcs006 SLAVE         
1104533 0.50549 estimPi_mp alxo9476     r     11/25/2013 15:01:48 mpc_std_shrt.q@mpcs010 SLAVE         
  

After successful termination of the job, the log-file estimPi_mpi.o1104533 contains the result of the simulation:

 
> cat estimPi_mpi.o1104533
number of processes = 3
number of trials = 100000
pi_estim = 3.151800
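
The source of the program is not shown on this page. The output format (a number of trials and an estimate of pi) is consistent with a hit-or-miss Monte Carlo estimate, so the following serial sketch assumes that method: it draws points uniformly in the unit square and counts the fraction that falls inside the quarter circle. The helper name estim_pi and the seed argument are hypothetical, and the numbers it prints depend on the awk implementation and will not reproduce the value above.

```shell
#!/bin/bash
# Serial Monte Carlo sketch (assumption: the tutorial program uses the
# hit-or-miss method; the MPI version would split the trials among processes).
# estim_pi is a hypothetical helper: estim_pi <trials> <seed>
estim_pi() {
    local trials="$1" seed="$2"
    awk -v n="$trials" -v seed="$seed" 'BEGIN {
        srand(seed)                      # seed the random number generator
        hits = 0
        for (i = 0; i < n; i++) {
            x = rand(); y = rand()       # uniform point in the unit square
            if (x * x + y * y <= 1.0)    # inside the quarter circle?
                hits++
        }
        # area ratio quarter circle / square = pi / 4
        printf "number of trials = %d\npi_estim = %f\n", n, 4.0 * hits / n
    }'
}

estim_pi 100000 1
```

Under this assumption the standard deviation of the estimate for N trials is sqrt(pi (4 - pi) / N), i.e. about 0.005 for N = 100000, so the reported value 3.151800 lies roughly two standard deviations from pi.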
  

After the job has been completed, its history can be reconstructed by filtering the accounting file by means of the query

 
> qacct -j 1104533
==============================================================
qname        mpc_std_shrt.q      
hostname     mpcs004.mpinet.cluster
group        ifp                 
owner        alxo9476            
project      NONE                
department   defaultdepartment   
jobname      estimPi_mpi         
jobnumber    1104533             
taskid       undefined
account      sge                 
priority     0                   
qsub_time    Mon Nov 25 15:01:10 2013
start_time   Mon Nov 25 15:01:49 2013
end_time     Mon Nov 25 15:01:52 2013
granted_pe   impi41              
slots        3                   
failed       0    
exit_status  0                   
ru_wallclock 3            
ru_utime     0.137        
ru_stime     0.189        
ru_maxrss    13400               
ru_ixrss     0                   
ru_ismrss    0                   
ru_idrss     0                   
ru_isrss     0                   
ru_minflt    32183               
ru_majflt    83                  
ru_nswap     0                   
ru_inblock   0                   
ru_oublock   0                   
ru_msgsnd    0                   
ru_msgrcv    0                   
ru_nsignals  0                   
ru_nvcsw     4711                
ru_nivcsw    308                 
cpu          0.326        
mem          0.001             
io           0.001             
iow          0.000             
maxvmem      404.328M
arid         undefined
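
The time stamps in such a record can be turned into a waiting time and a run time with a few lines of shell. The following is a minimal sketch, not part of the tutorial code: the helper names acct_field and job_times are hypothetical, and the date parsing assumes GNU date (option -d).

```shell
#!/bin/bash
# acct_field (hypothetical helper): print the value of one field from
# qacct output stored in the variable acct.
acct_field() {
    sed -n "s/^$1[[:space:]]*//p" <<<"$acct" | head -n 1
}

# job_times (hypothetical helper): read `qacct -j <jobid>` output on stdin
# and print queue waiting time and run time in seconds.
job_times() {
    local acct qsub start end
    acct=$(cat)
    # Assumes GNU date; TZ=UTC keeps the differences free of DST effects.
    qsub=$(TZ=UTC date -d "$(acct_field qsub_time)" +%s)
    start=$(TZ=UTC date -d "$(acct_field start_time)" +%s)
    end=$(TZ=UTC date -d "$(acct_field end_time)" +%s)
    echo "wait_seconds=$((start - qsub))"
    echo "run_seconds=$((end - start))"
}

# Usage on the cluster:
#   qacct -j 1104533 | job_times
```

For the record above this gives a waiting time of 39 seconds and a run time of 3 seconds, in agreement with ru_wallclock. Note also that the cpu entry (0.326) equals the sum of ru_utime (0.137) and ru_stime (0.189).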
  

Shared memory example: OpenMP