== Introduction ==


The program ORCA is a modern electronic structure program package that is able to carry out geometry optimizations and to predict a large number of spectroscopic parameters at different levels of theory. Besides Hartree-Fock theory, density functional theory (DFT) and semiempirical methods, high-level ab initio quantum chemical methods, based on configuration interaction and coupled cluster methods, are included in ORCA to an increasing degree.
 
For more details please refer to the [https://orcaforum.cec.mpg.de/ official home] of ORCA, where you can also find a thorough [https://orcaforum.cec.mpg.de/OrcaManual.pdf documentation] on using the program. Note that ORCA is free of charge for non-commercial use and that by using ORCA on the cluster you are accepting the [https://orcaforum.cec.mpg.de/license.html ORCA license]. In particular, any scientific work using ORCA should at least cite
 F. Neese: The ORCA program system (WIREs Comput Mol Sci 2012, 2: 73-78)
as well as other related works as appropriate.
 
Below, a short introduction to using ORCA on the cluster is given.


== Installed version ==


These versions are installed and currently available...
 
... on environment ''hpc-uniol-env'':
 '''ORCA/3.0.3'''
 '''ORCA/4.0.0'''
... on environment ''hpc-env/6.4'':
 '''ORCA/4.0.1.2'''
 
== Using ORCA on the HPC cluster ==
 
Depending on the environment you want to work in, you have to switch to that environment and load the corresponding module.
 
For example, to load the newest version, type:
module load hpc-env/6.4
module load ORCA/4.0.1.2
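If you are not sure which versions are available in the environment you have loaded, you can list them with the module system (a standard <tt>module</tt> command, independent of ORCA):
<pre>
module avail ORCA     # lists the ORCA modules visible in the active environment
</pre>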
 


== How to work with ORCA ==
Since there are many people working with the HPC cluster, it is important that everyone has an equal chance to do so. Therefore, every job should be processed by [[SLURM Job Management (Queueing) System|SLURM]].


You have to prepare a job script and an input file for your ORCA job.
An example ORCA job could look like this:


'''Note:''' The following examples will most likely only work with ORCA version 3.0.3. You might need to change some commands to make them work in version 4.0 and above.
=== Serial run ===
<pre>
#!/bin/bash

#SBATCH --partition=carl.p
#SBATCH --time=1-00:00:00
#SBATCH --mem=2G
#SBATCH --job-name ORCA-SERIAL-TEST
#SBATCH --output=orca-serial-test-%j.out
#SBATCH --error=orca-serial-test-%j.err

module load hpc-uniol-env
module load ORCA/3.0.3

MODEL=TiF3

ORCAEXE=$(which orca)          # full path to ORCA executable
INPUTEXT="inp xyz"             # extensions of input files
OUTPUTEXT="gbw prop"           # extensions of output files to save

# prepare $TMPDIR for the run by copying the input files
for ext in $INPUTEXT
do
   if [ -e $MODEL.$ext ]
   then
      echo "Copying $MODEL.$ext to $TMPDIR"
      cp $MODEL.$ext $TMPDIR/${MODEL}_${SLURM_JOB_ID}.$ext
   fi
done

# change to $TMPDIR for running ORCA
cd $TMPDIR

# run ORCA (the log file is written directly to the submit directory)
$ORCAEXE ${MODEL}_${SLURM_JOB_ID}.inp > ${SLURM_SUBMIT_DIR}/${MODEL}_${SLURM_JOB_ID}.out

# save output files from $TMPDIR
for ext in $OUTPUTEXT
do
   if [ -e ${MODEL}_${SLURM_JOB_ID}.$ext ]
   then
      echo "Copying ${MODEL}_${SLURM_JOB_ID}.$ext to ${SLURM_SUBMIT_DIR}"
      cp ${MODEL}_${SLURM_JOB_ID}.$ext ${SLURM_SUBMIT_DIR}
   fi
done
</pre>
 
The job script requires additional input files for ORCA, in this case <tt>[[media:TiF3.inp.gz|TiF3.inp]]</tt> and <tt>[[media:TiF3.xyz.gz|TiF3.xyz]]</tt>, and all three files have to be placed in the same directory. '''Note:''' all downloads have to be unzipped first.
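For orientation, an ORCA input file is plain text; the sketch below illustrates the general input syntax only and is ''not'' the content of the downloadable <tt>TiF3.inp</tt> (the keywords, charge and multiplicity are assumptions for illustration):
<pre>
# hypothetical example of the ORCA input syntax, not the actual TiF3.inp
! UKS BP86 def2-SVP Opt
* xyzfile 0 2 TiF3.xyz
</pre>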
 
Once the job script and your input files are ready, a job can be submitted as usual with the command:
sbatch orca_serial_test.job
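While the job is running, its state in the queue can be checked with a standard SLURM command:
<pre>
squeue -u $USER     # lists all of your pending and running jobs
</pre>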
The job script works roughly in the following way:
# the ORCA module is loaded and the name of the model is set (it must be identical to the base name of the <tt>.inp</tt> file)
# all input files (identified by the model name and the given extensions) are copied to <tt>$TMPDIR</tt>; more files can be included by adding their extensions to the variable <tt>INPUTEXT</tt>
# the directory is changed to <tt>$TMPDIR</tt> and the run is started; a log file for the run (extension <tt>.out</tt>) is written to the directory from which the job was submitted
# all other files are created in <tt>$TMPDIR</tt>, which is automatically deleted after the job; files whose extensions are listed in <tt>OUTPUTEXT</tt> are copied back to the submit directory, so if additional files need to be kept, add their extensions there (see the sketch below)
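For example, to also keep a Hessian file and the optimized geometry after the run, the variable at the top of the script could be extended like this (which extra files exist depends on your calculation, so the extensions here are assumptions):
<pre>
OUTPUTEXT="gbw prop hess xyz"   # also save .hess and .xyz files from $TMPDIR
</pre>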
 
=== Parallel run ===
 
'''Note:''' Running the parallel example with the given jobscript and input file will take about 3.5 hours!
 
If you have managed to run a serial ORCA job on the cluster, you are already pretty close to knowing how to run a parallel one. You just have to change the following parts of your job script:
# add "'''#SBATCH --ntasks=32'''" to your job script (we use 32 cores for the test file; raise or lower the number to match the needs of your actual job).
# change the <tt>MODEL</tt> variable in the job script. For this example, set it to "'''silole_rad_zora_epr_pbe0'''".
# add the following lines of code to your job script:
<pre>
SETNPROCS="%pal nprocs $SLURM_NTASKS"    # new %pal line ($SLURM_NTASKS holds the value of --ntasks)
OPAL=$(grep %pal $MODEL.inp)             # remember the old %pal line
sed -i "/^%pal/c$SETNPROCS" $MODEL.inp   # replace the whole %pal line in place
NPAL=$(grep %pal $MODEL.inp)
echo "changed $OPAL to $NPAL in $MODEL.inp"
</pre>
::These lines of code replace the <tt>%pal</tt> line in your input file so that the number of processes matches the amount specified above (at "'''--ntasks=XX'''"); a small test of this substitution is sketched below.
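If you want to see what the substitution does before submitting, you can try it on a scratch copy of the input file (a minimal sketch; the resulting <tt>%pal</tt> line is shown for 32 tasks):
<pre>
cp silole_rad_zora_epr_pbe0.inp test.inp   # work on a scratch copy
SETNPROCS="%pal nprocs 32"                 # line that will replace the existing %pal line
sed -i "/^%pal/c$SETNPROCS" test.inp       # 'c' replaces the whole matching line
grep %pal test.inp                         # verify: should print '%pal nprocs 32'
</pre>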
After applying these changes, your job script should look like this:
<pre>
#!/bin/bash

#SBATCH --partition=carl.p
#SBATCH --time=0-05:00:00
#SBATCH --ntasks=32
#SBATCH --mem=4G
#SBATCH --job-name ORCA-PARALLEL-TEST
#SBATCH --output=orca-parallel-test-%j.out
#SBATCH --error=orca-parallel-test-%j.err

module load ORCA/3.0.3

MODEL=silole_rad_zora_epr_pbe0

ORCAEXE=$(which orca)          # full path to ORCA executable (required for parallel runs)
INPUTEXT="inp xyz"             # extensions of input files
OUTPUTEXT="gbw prop"           # extensions of output files to save

# prepare $TMPDIR for the run by copying the input files
for ext in $INPUTEXT
do
   if [ -e $MODEL.$ext ]
   then
      echo "Copying $MODEL.$ext to $TMPDIR"
      cp $MODEL.$ext $TMPDIR
   fi
done

# change to $TMPDIR for running ORCA
cd $TMPDIR

# modify the input file to match the number of requested tasks
SETNPROCS="%pal nprocs $SLURM_NTASKS"    # new %pal line ($SLURM_NTASKS holds the value of --ntasks)
OPAL=$(grep %pal $MODEL.inp)
sed -i "/^%pal/c$SETNPROCS" $MODEL.inp
NPAL=$(grep %pal $MODEL.inp)
echo "changed $OPAL to $NPAL in $MODEL.inp"

# copy the file machines (if present) to the name ORCA expects, before the run starts
if [ -e machines ]
then
   cp machines $MODEL.nodes
fi

# run ORCA
$ORCAEXE ${MODEL}.inp > $WORK/${MODEL}.out

# save output files from $TMPDIR
for ext in $OUTPUTEXT
do
   if [ -e ${MODEL}.$ext ]
   then
      echo "Copying ${MODEL}.$ext to $WORK"
      cp ${MODEL}.$ext $WORK
   fi
done
</pre>
Once your job script is ready, you will need to download the following file: [[media:silole_rad_zora_epr_pbe0.inp.gz|silole_rad_zora_epr_pbe0.inp]] (you have to unzip this file before you can use it). This will be your input file. Place it in the same folder as your job script.
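The downloaded archive can be unpacked on the command line, for example:
<pre>
gunzip silole_rad_zora_epr_pbe0.inp.gz   # unpacks to silole_rad_zora_epr_pbe0.inp
</pre>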
 
You can now submit your job with the command
sbatch orca-parallel-test.job (Replace this with your filename if you have changed it!)
 
=== Troubleshooting ===
 
In case of problems the following hints may help you to identify the cause:
# check the log files from SLURM (you probably specified the output/error file names in your job script, so the filenames would be ''<your_filename>.out'' and ''<your_filename>.err'') as well as the ORCA log file (''<model>.out'') for error messages.
# check the exit status of the job by using
sacct -j <job-id>
::Further information about using the command '''sacct''' can be found [[Information on used Resources|here]].
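For example, to see the state and exit code of a finished job (these are standard <tt>sacct</tt> format fields):
<pre>
sacct -j <job-id> --format=JobID,JobName,State,ExitCode,Elapsed
</pre>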
 
If you need help identifying the problem, you can contact the {{sc}}. Please include the job ID in your request.
 
== Documentation ==

The full documentation of the most recent version of ORCA (currently 4.0.0) can be found [https://orcaforum.cec.mpg.de/OrcaManual.pdf here] (PDF viewer required).
