GenomeThreader 2016

Introduction

GenomeThreader is a software tool to compute gene structure predictions. The gene structure predictions are calculated using a similarity-based approach where additional cDNA/EST and/or protein sequences are used to predict gene structures via spliced alignments. GenomeThreader was motivated by disabling limitations in GeneSeqer, a popular gene prediction program which is widely used for plant genome annotation. [ https://genomethreader.org ¹]

Installed version(s)

The following versions are installed and currently available...

... on environment hpc-env/8.3:

GenomeThreader/1.7.1-Linux_x86_64-64bit
GenomeThreader/1.7.3-Linux_x86_64-64bit

Loading GenomeThreader

To load the desired version of the module, use the module load command, e.g.

module load hpc-env/8.3
module load GenomeThreader

Always remember: this commands are case sensitive!

Using GenomeThreader

To find out on how to use GenomeThreader you can just type in gth --help after loading the module to print out a help text to get you started:

$ gth --help
Usage: gth [option ...] -genomic file [...] -cdna file [...] -protein file [...]
Compute similarity-based gene structure predictions (spliced alignments)
using cDNA/EST and/or protein sequences and assemble the resulting spliced
alignments to consensus spliced alignments.

-genomic          specify input files containing genomic sequences
                  mandatory option
-cdna             specify input files containing cDNA/EST sequences
-protein          specify input files containing protein sequences
-species          specify species to select splice site model which is most
                  appropriate; possible species:
                  "human"
                  "mouse"
                  "rat"
                  "chicken"
                  "drosophila"
                  "nematode"
                  "fission_yeast"
                  "aspergillus"
                  "arabidopsis"
                  "maize"
                  "rice"
                  "medicago"
                  default: undefined
-bssm             read bssm parameter from file in the path given by the
                  environment variable BSSMDIR
                  default: undefined
-scorematrix      read amino acid substitution scoring matrix from file in the
                  path given by the environment variable GTHDATADIR
                  default: BLOSUM62
-translationtable set the codon translation table used for codon translation in
                  matching, DP, and output
                  default: 1
-f                analyze only forward strand of genomic sequences
                  default: no
-r                analyze only reverse strand of genomic sequences
                  default: no
-cdnaforward      align only forward strand of cDNAs
                  default: no
-frompos          analyze genomic sequence from this position
                  requires -topos or -width; counting from 1 on
                  default: 0
-topos            analyze genomic sequence to this position
                  requires -frompos; counting from 1 on
                  default: 0
-width            analyze only this width of genomic sequence
                  requires -frompos
                  default: 0
-v                be verbose
                  default: no
-xmlout           show output in XML format
                  default: no
-gff3out          show output in GFF3 format
                  default: no
-md5ids           show MD5 fingerprints as sequence IDs
                  default: no
-o                redirect output to specified file
                  default: undefined
-gzip             write gzip compressed output file
                  default: no
-bzip2            write bzip2 compressed output file
                  default: no
-force            force writing to output file
                  default: no
-gs2out           output in old GeneSeqer2 format
                  default: no
-minmatchlen      specify minimum match length (cDNA matching)
                  default: 20
-seedlength       specify the seed length (cDNA matching)
                  default: 18
-exdrop           specify the Xdrop value for edit distance extension (cDNA
                  matching)
                  default: 2
-prminmatchlen    specify minimum match length (protein matches)
                  default: 24
-prseedlength     specify seed length (protein matching)
                  default: 10
-prhdist          specify Hamming distance (protein matching)
                  default: 4
-gcmaxgapwidth    set the maximum gap width for global chains
                  defines approximately the maximum intron length
                  set to 0 to allow for unlimited length
                  in order to avoid false-positive exons (lonely exons) at the
                  sequence ends, it is very important to set this parameter
                  appropriately!
                  default: 1000000
-gcmincoverage    set the minimum coverage of global chains regarding to the
                  reference sequence
                  default: 50
-paralogs         compute paralogous genes (different chaining procedure)
                  default: no
-introncutout     enable the intron cutout technique
                  default: no
-fastdp           use jump table to increase speed of DP calculation
                  default: no
-autointroncutout set the automatic intron cutout matrix size in megabytes and
                  enable the automatic intron cutout technique
                  default: 0
-intermediate     stop after calculation of spliced alignments and output
                  results in reusable XML format. Do not process this output
                  yourself, use the ``normal'' XML output instead!
                  default: no
-first            set the maximum number of spliced alignments per genomic DNA
                  input. Set to 0 for unlimited number.
                  default: 0
-help             display help for basic options and exit
-help+            display help for all options and exit
-version          display version information and exit

For detailed information, please refer to the manual of GenomeThreader.
Report bugs to <gordon@gremme.org>.

Documentation

The full documentation can be found here.

GenomeThreader 2016

Contents

Introduction

Installed version(s)

Loading GenomeThreader

Using GenomeThreader

Documentation

Navigation menu

Page actions

Page actions

Personal tools

Navigation

Search

Topics

Tools