OMA 2016

From HPC users
Revision as of 12:45, 8 September 2021 by Schwietzer (talk | contribs) (→‎Loading / Using OMA)

Introduction

The OMA (Orthologous MAtrix) database is a well-established resource for identifying orthologs among publicly available complete genomes. Orthologs are genes that are related through speciation events, and are essential for many analyses, including gene function prediction and species tree reconstruction.

OMA standalone is a software package that infers orthologs among custom genomes using the OMA algorithm. Genomes and their homology relations can also be exported directly from the OMA Browser and combined with custom genomes or proteomes.

OMA standalone computes pairwise orthologs and builds two types of groupings from them: OMA Groups and Hierarchical Orthologous Groups (HOGs). It can also predict gene function annotations with Gene Ontology terms, based on the existing annotations of exported genomes, and it produces phyletic profiles for OMA Groups and HOGs. See the section "Possible applications" for further explanations of how to use the output of OMA standalone. [1]

Installed version(s)

The following versions are installed and currently available...

... on environment hpc-env/8.3:

  • OMA/2.5.0-GCCcore-8.3.0

Loading / Using OMA

To load the desired version of the module, use the module load command, e.g.

module load hpc-env/8.3
module load OMA

Always remember: this command is case sensitive!
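To confirm that the module was loaded, one option is to check that the OMA executable is on the PATH. The helper below is a minimal sketch; the function name require_cmd is hypothetical and not part of OMA or the module system.

```shell
# Hypothetical helper (not part of OMA or the module system):
# succeed if the named command is on the PATH, otherwise print a hint.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1 (check the spelling and case of the module name)" >&2
        return 1
    fi
}

# After 'module load hpc-env/8.3' and 'module load OMA',
# this should report "found: OMA".
require_cmd OMA || echo "OMA is not available in this shell"
```

If the check fails, the module name was most likely misspelled or written in the wrong case.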


To find out how to use OMA, type OMA -h to print a help text that gets you started:

$ OMA  -h
/cm/shared/uniol/software/8.3/OMA/2.5.0-GCCcore-8.3.0/OMA/bin/OMA - runs OMA standalone

/cm/shared/uniol/software/8.3/OMA/2.5.0-GCCcore-8.3.0/OMA/bin/OMA [options] [paramfile]

Runs the standalone version of the Orthologous MAtrix (OMA) pipeline
to infer orthologs among complete genomes. A highlevel description
of its algorithm is available here: http://omabrowser.org/oma/about

The all-against-all Smith-Waterman alignment step of OMA requires 
a lot of CPU time. OMA standalone can therefore be run in parallel.
If you intend to use OMA standalone on a HPC cluster with a scheduler
such as LSF, PBS Pro, Slurm or SunGridEngine, you should use the 
jobarray option of those systems,
e.g. bsub -J "oma[1-500]" /cm/shared/uniol/software/8.3/OMA/2.5.0-GCCcore-8.3.0/OMA/bin/OMA (on LSF).
     qsub -t 1-500 /cm/shared/uniol/software/8.3/OMA/2.5.0-GCCcore-8.3.0/OMA/bin/OMA (on SunGridEngine)
In case you run OMA on a single computer with several cores, use 
the -n option.

Options:
  -n <number>   number of parallel jobs to be started on this computer
  -v            version
  -d <level>    increase debug info to <level>. By default level is set to 1.
  -i            interactive session, do not quit in case of error and at the end
                of the run.
  -s            stop after the AllAll phase. This is the part which is parallelized.
                The option can be useful on big datasets that require lot of 
                memory for the later phases of OMA. It allows to stop after the 
                parallelized step and restart again a single process with more
                memory.
  -c            stop after database conversion. This option is useful if you 
                work with a large dataset and/or the filesystem you use is 
                slow. 
  -W <secs>     maximum amount of Wall-clock time (in secs) that the job should
                run before terminating in a clean way. This option has only an 
                effect in the all-against-all phase. If the job terminates
                because it reaches the time limit, it quits with the exit
                code 99.
  -p            copy the default parameter file to the current directory. This
                is useful if want to analyse a new dataset and previously 
                installed OmaStandalone.
  -h/?          this help

paramfile       path to the parameter file. it defaults to ./parameters.drw


EXIT
   0            normal exit
   1            a general error (i.e. configuration problem) occured
  99            reached timelimit (provided with -W flag)
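The help text gives job-array examples for LSF and SunGridEngine only. For Slurm, a comparable setup might look like the following sketch, which writes a batch script for the all-against-all phase. The function name write_oma_array_job and the file name oma_allall.sh are hypothetical, and the array size, time limit and memory request are placeholder values rather than recommendations. The value -W 82800 is chosen as an assumption so that OMA stops cleanly one hour before the 24-hour Slurm limit.

```shell
# Sketch: generate a Slurm job-array script for OMA's all-against-all phase.
# $1 = path of the job script to write, $2 = number of array tasks.
write_oma_array_job() {
    cat > "$1" <<EOF
#!/bin/bash
#SBATCH --job-name=oma_allall
#SBATCH --array=1-$2
#SBATCH --time=24:00:00
#SBATCH --mem-per-cpu=4G
#SBATCH --output=oma_%A_%a.log

module load hpc-env/8.3
module load OMA

# -s: stop after the AllAll phase (the parallelized part).
# -W: terminate cleanly after 82800 s (23 h), before Slurm's 24 h limit.
OMA -s -W 82800
rc=\$?
if [ "\$rc" -eq 99 ]; then
    # Exit code 99 means the -W time limit was reached.
    echo "OMA reached its -W limit; the all-against-all phase is not finished yet." >&2
fi
exit "\$rc"
EOF
}

# Placeholder: 50 array tasks; adjust to the size of the dataset.
write_oma_array_job oma_allall.sh 50
```

The generated script would be submitted with sbatch oma_allall.sh. If tasks end with exit code 99, the array may need to be submitted again. Once the all-against-all phase is complete, the description of the -s option suggests restarting OMA as a single process with more memory for the later phases.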

Documentation

The full documentation can be found here.