Difference between revisions of "PALM"

From HPC users

Revision as of 11:42, 12 February 2015

The software PALM is a large-eddy simulation (LES) model for atmospheric and oceanic flows developed at the Institute of Meteorology and Climatology of the Leibniz Universität Hannover.

Installation

Please follow the detailed instructions given in the following pdf-document:

SGE scripts

With recent PALM versions (revision 1100 or newer) PALM jobs are submitted from the local computer. SGE scripts will be generated automatically, so you don't need to create an SGE script by yourself.

If you use a PALM version older than revision 1100, a sample SGE script for submitting PALM jobs can be found here:

Please copy the sample script to your working directory (as palm.sge or <different-name>.sge). For carrying out the test run (to verify the installation), the script does not need to be modified. Please see the old installation guide for instructions on how to modify the script for different runs.

Submitting PALM jobs

PALM jobs are submitted from your local computer with the script mrun. A typical mrun call looks like this:

 mrun -z -d <job name> -h lcflow -K parallel -X <number of slots> -t <CPU time in s> -r "d3# <output file list>"

<output file list> can be one or several of the following strings (separated by blanks): "3d#" (3d data), "xy#", "xz#", "yz#" (cross sections), "ma#" (masked data), "pr#" (profiles), "ts#" (time series), "sp#" (spectra). If you want to restart jobs or use turbulent inflow, the output of binary data for restarts can be switched on by adding "restart" to the output file list. For a restart run, every "#" has to be replaced by "f". A run with turbulent inflow (which uses data of a precursor run for initialization) additionally requires "rec". Example: the mrun call for a run with turbulent inflow and output of 3d data, profiles and time series as well as binary data for possible restarts looks like this:

 mrun -z -d example2 -h lcflow -K parallel -X 144 -t 86400 -r "d3# rec 3d# pr# ts# restart"

In this case, the job "example2" will run on 144 slots (= 12 cores) for 24 hours.
By default, PALM jobs are submitted to the low-memory nodes of FLOW. If the simulation is very memory demanding (>1800 MB per slot), you can submit it to the high-memory nodes of FLOW by adding the mrun option:

 -m <memory in MB (>1800)>
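The rule for restart runs described above (every "#" in the output file list becomes "f") can be sketched as a small shell helper. The function name restart_file_list is hypothetical and not part of PALM; the script only prints the resulting mrun call and does not execute it.

```shell
#!/bin/sh
# Hypothetical helper (not part of PALM): turn the output file list of an
# initial run into the list for a restart run by replacing every "#" with "f".
restart_file_list() {
    printf '%s\n' "$1" | sed 's/#/f/g'
}

initial_list="d3# 3d# pr# ts# restart"
restart_list=$(restart_file_list "$initial_list")

# Print the resulting mrun call instead of executing it.
echo "mrun -z -d example2 -h lcflow -K parallel -X 144 -t 86400 -r \"$restart_list\""
```

Entries without "#", such as "restart", are left unchanged by the substitution.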

Runtime estimation

The runtime of PALM (which is needed for the SGE script and for mrun) can be estimated by

where the constant is approximately

This value is a first guess from a sample of simulation data and might have to be corrected in the future. It also depends on additional parameters such as the amount of output data and the complexity of user-defined code.

The number of points is defined by the product of the grid points in x-, y- and z-direction

The number of iterations can be calculated by

with the physical simulation time and the timestep size. The timestep size can (in most cases) be estimated by a Courant-Friedrichs-Lewy-like criterion

where L and N are the length of the simulated domain and the resolution in x-, y- and z-direction, respectively. The velocity is the maximal wind speed of the simulation.

Note: The time estimate assumes linear scaling, which does not hold for a large number of CPU cores combined with small resolutions ( points/core). In this case the constant could be larger.
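The estimate described in this section can be sketched as a few shell functions. This is only an illustration under stated assumptions: all function names are hypothetical, the runtime is assumed to have the form constant × points × iterations / cores (consistent with the linear scaling mentioned in the note), the timestep is assumed to use the smallest grid spacing of the three directions, and the constant in the usage lines is a placeholder, not the value from this page.

```shell
#!/bin/sh
# All function names below are hypothetical; they are not part of PALM.

# Number of grid points: product of the grid points in x, y and z.
grid_points() {
    awk -v nx="$1" -v ny="$2" -v nz="$3" 'BEGIN { printf "%.0f\n", nx * ny * nz }'
}

# Timestep from a CFL-like criterion: smallest grid spacing L/N divided by
# the maximal wind speed. Taking the minimum over directions is an assumption.
# Arguments: Lx Ly Lz Nx Ny Nz u_max
timestep() {
    awk -v lx="$1" -v ly="$2" -v lz="$3" -v nx="$4" -v ny="$5" -v nz="$6" -v u="$7" 'BEGIN {
        d = lx / nx
        if (ly / ny < d) d = ly / ny
        if (lz / nz < d) d = lz / nz
        printf "%g\n", d / u
    }'
}

# Number of iterations: simulated time divided by the timestep, rounded up.
iterations() {
    awk -v t="$1" -v dt="$2" 'BEGIN {
        n = int(t / dt)
        if (n * dt < t) n++
        printf "%.0f\n", n
    }'
}

# Runtime in seconds, ASSUMING the form c * points * iterations / cores.
runtime_estimate() {
    awk -v c="$1" -v p="$2" -v it="$3" -v cores="$4" 'BEGIN { printf "%.0f\n", c * p * it / cores }'
}

# Usage with made-up numbers; c is a placeholder constant.
c=1e-6
points=$(grid_points 100 100 100)
dt=$(timestep 1000 1000 500 100 100 100 10)
iters=$(iterations 3600 "$dt")
echo "points=$points dt=$dt iterations=$iters runtime=$(runtime_estimate "$c" "$points" "$iters" 100) s"
```

The result can be passed to mrun as the CPU time (option -t), preferably with some safety margin, since the constant is only a first guess.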

Known issues

  • When you've made changes in the .mrun_config, don't forget to update the scripts and run mbuild once again:
 mbuild -u -h lcflow
 mbuild -h lcflow
Usually you also have to delete the file MAKE_DEPOSITORY on the target system (e.g. FLOW).
  • With the Intel Compiler 12.0.0 the compiler flags -no-prec-div and -no-prec-sqrt can lead to different results for identical runs. Please don't use these flags. Note that they are set automatically when using the compiler option -fast. In this case you should explicitly set -prec-div and -prec-sqrt.
  • When submitting PALM jobs from your local computer, job protocols are sometimes not transferred back to the local host via scp. In this case, they remain in the job_queue folder on FLOW.

Debugging of PALM

Sometimes it is necessary to debug the code, especially when using your own user code. Here are some hints for debugging PALM when running in parallel:

  1. The simplest way is to add print statements in the user code, at least in the beginning and at the end of each procedure. However, this method is in many cases not very useful.
  2. Use debug symbols within the executable. This is necessary for most debuggers. To do so, add the compiler option -g to the definitions %fopts and %lopts in the .mrun_config file. You may have to reduce the optimization level (compiler options -O3, -Ofast, -align all, -ftz, -fno-alias, -no-scalar-rep, -no-prec-sqrt, -ip, -ipo) to -O2 to get the right output in the debugger. Don't forget to build the code again (see Known issues).
  3. To enable additional runtime checks (e.g. array bounds), add the compiler option -check to the definitions %fopts and %lopts in the .mrun_config file. Note that the code will run slower. This option is only useful for debugging, not for normal runs. Again, don't forget to build the code again (see Known issues).
  4. Use the debug tool valgrind. This module enables different checks of the code (see valgrind), especially the detection of invalid memory usage. To use this tool, follow these steps:
    1. add valgrind to the definition %modules in the .mrun_config file.
    2. add compiler option -g (see above)
    3. modify the script mrun

      ....           
      elif [[ $host = lcflow ]]
      then
        mpirun -np $ii a.out  < runfile_atmos  $ROPTS
      elif ....
           


      to

      ....           
      elif [[ $host = lcflow ]]
      then
        mpirun -np $ii valgrind -v --leak-check=full --log-file="valgrind.out.%q{PMI_RANK}" a.out  < runfile_atmos  $ROPTS
      elif ....
           


      The runtime of the program increases heavily (by a factor of 10 or more). valgrind will now write a file valgrind.out.XX for each MPI process to the temporary working directory of PALM. Don't forget to deploy the scripts again with mbuild -u -h lcflow

    4. start the job with mrun and the additional option -B to avoid deletion of the temporary working directory (and hence of the valgrind output).
    5. analyze the output of valgrind (e.g. search for "Invalid write")
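The last step can be partly automated. The following is a minimal sketch, assuming the log files are named valgrind.out.XX as configured above and that valgrind reports such errors with lines containing "Invalid read" or "Invalid write". The function name report_invalid_access is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the valgrind log files in a directory that report
# invalid memory accesses, with the number of matching lines per file.
report_invalid_access() {
    dir="$1"
    for f in "$dir"/valgrind.out.*; do
        [ -f "$f" ] || continue
        # grep -c prints 0 and returns non-zero when nothing matches.
        n=$(grep -c -E 'Invalid (read|write)' "$f" || true)
        [ "$n" -gt 0 ] && echo "$f: $n"
    done
    return 0
}

# Usage: pass the temporary working directory of PALM (default: current dir).
report_invalid_access "${1:-.}"
```

Files that are listed are worth opening in full, since valgrind prints a stack trace below each reported error.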

Tutorials

Here are slides from the last training at ForWind in April 2012.

Day 1

Day 2

Day 3

External Links