OrthoFinder 2016
Latest revision as of 13:02, 18 June 2021
Introduction
OrthoFinder is a fast, accurate and comprehensive platform for comparative genomics. It finds orthogroups and orthologs, infers rooted gene trees for all orthogroups and identifies all of the gene duplication events in those gene trees. It also infers a rooted species tree for the species being analysed and maps the gene duplication events from the gene trees to branches in the species tree. OrthoFinder also provides comprehensive statistics for comparative genomic analyses. OrthoFinder is simple to use and all you need to run it is a set of protein sequence files (one per species) in FASTA format. [1]
Installed version(s)
The following versions are installed and currently available...
... on environment hpc-env/8.3:
- OrthoFinder/2.5.2-foss-2019b
Loading / Using OrthoFinder
To load the desired version of the module, use the module load command, e.g.
module load hpc-env/8.3
module load OrthoFinder
Always remember: this command is case sensitive!
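After loading the module, a quick check that the executable is actually on the PATH can save a failed job. The following is a minimal sketch: the helper name require_tool is hypothetical and not part of OrthoFinder or the module system, and the module load lines appear only as comments because they work on the cluster only.

```shell
#!/bin/sh
# On the cluster, load the environment and the module first (names are case sensitive):
#   module load hpc-env/8.3
#   module load OrthoFinder

# require_tool is a hypothetical helper: it succeeds if the given command is on the PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1 (was the module loaded?)" >&2
        return 1
    fi
}

# Check for the OrthoFinder executable without aborting the shell if it is missing.
require_tool orthofinder || true
```

Note that the module is called OrthoFinder, while the executable is the lowercase orthofinder.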
To find out how to use OrthoFinder, type orthofinder -h to print a help text that gets you started:
$ orthofinder -h
OrthoFinder version 2.5.2 Copyright (C) 2014 David Emms

SIMPLE USAGE:
Run full OrthoFinder analysis on FASTA format proteomes in <dir>
  orthofinder [options] -f <dir>

Add new species in <dir1> to previous run in <dir2> and run new analysis
  orthofinder [options] -f <dir1> -b <dir2>

OPTIONS:
 -t <int>    Number of parallel sequence search threads [Default = 24]
 -a <int>    Number of parallel analysis threads
 -d          Input is DNA sequences
 -M <txt>    Method for gene tree inference. Options 'dendroblast' & 'msa'
             [Default = dendroblast]
 -S <txt>    Sequence search program [Default = diamond]
             Options: blast, diamond, diamond_ultra_sens, blast_gz, mmseqs, blast_nucl
 -A <txt>    MSA program, requires '-M msa' [Default = mafft]
             Options: mafft, muscle
 -T <txt>    Tree inference method, requires '-M msa' [Default = fasttree]
             Options: fasttree, raxml, raxml-ng, iqtree
 -s <file>   User-specified rooted species tree
 -I <int>    MCL inflation parameter [Default = 1.5]
 -x <file>   Info for outputting results in OrthoXML format
 -p <dir>    Write the temporary pickle files to <dir>
 -1          Only perform one-way sequence search
 -X          Don't add species names to sequence IDs
 -y          Split paralogous clades below root of a HOG into separate HOGs
 -z          Don't trim MSAs (columns>=90% gap, min. alignment length 500)
 -n <txt>    Name to append to the results directory
 -o <txt>    Non-default results directory
 -h          Print this help text

WORKFLOW STOPPING OPTIONS:
 -op         Stop after preparing input files for BLAST
 -og         Stop after inferring orthogroups
 -os         Stop after writing sequence files for orthogroups
             (requires '-M msa')
 -oa         Stop after inferring alignments for orthogroups
             (requires '-M msa')
 -ot         Stop after inferring gene trees for orthogroups

WORKFLOW RESTART COMMANDS:
 -b  <dir>   Start OrthoFinder from pre-computed BLAST results in <dir>
 -fg <dir>   Start OrthoFinder from pre-computed orthogroups in <dir>
 -ft <dir>   Start OrthoFinder from pre-computed gene trees in <dir>

LICENSE:
 Distributed under the GNU General Public License (GPLv3). See License.md

CITATION:
 When publishing work that uses OrthoFinder please cite:
 Emms D.M. & Kelly S. (2019), Genome Biology 20:238

 If you use the species tree in your work then please also cite:
 Emms D.M. & Kelly S. (2017), MBE 34(12): 3267-3278
 Emms D.M. & Kelly S. (2018), bioRxiv https://doi.org/10.1101/267914
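The options from the help text can be combined into a single call. The following is a minimal sketch, assuming a hypothetical input directory proteomes/ that holds one protein FASTA file per species. The helper name orthofinder_cmd and the thread counts are illustrative only. The helper prints the command instead of running it, so the call can be inspected before it is submitted.

```shell
#!/bin/sh
# orthofinder_cmd is a hypothetical helper that prints (does not run) an OrthoFinder call.
# Arguments: <input dir> <search threads> <analysis threads> [gene tree method]
orthofinder_cmd() {
    dir=$1
    search_threads=$2
    analysis_threads=$3
    method=${4:-dendroblast}   # 'dendroblast' is the default according to the help text
    echo "orthofinder -f $dir -t $search_threads -a $analysis_threads -M $method"
}

# Full analysis with the default gene tree method:
orthofinder_cmd proteomes 8 2
# Gene trees from multiple sequence alignments (mafft and fasttree are the stated defaults):
orthofinder_cmd proteomes 8 2 msa
```

The thread counts passed to -t and -a should presumably not exceed the number of cores requested for the job.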
Documentation
The full documentation can be found here.