Spaln 2016
Introduction
Spaln (space-efficient spliced alignment) is a stand-alone program that maps and aligns a set of cDNA or protein sequences onto a whole genomic sequence in a single job. Spaln also performs spliced or ordinary alignment after rapid similarity search against a protein sequence database, if a genomic segment or an amino acid sequence is given as a query. 1
Installed version(s)
The following versions are installed and currently available...
... on environment hpc-env/8.3:
- 'spaln/2.4.03-GCC-8.3.0'
- 'spaln/2.4.6-GCC-8.3.0'
Loading spaln
To load the desired version of the module, use the module load command, e.g.
module load hpc-env/8.3
module load spaln
Always remember: these commands are case-sensitive!
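To pin one of the versions listed above instead of the default, the full module name can be given. The following is a minimal sketch; it assumes that the `module` command is available in your shell (as it normally is on the cluster login nodes), and the variable name `SPALN_MODULE` is only illustrative.

```shell
# Module name taken from the list of installed versions above.
SPALN_MODULE="spaln/2.4.6-GCC-8.3.0"

# Assumption: `module` is provided by the cluster's module system.
if command -v module >/dev/null 2>&1; then
    module load hpc-env/8.3        # environment that provides the spaln modules
    module load "$SPALN_MODULE"    # load the pinned version
    module list                    # show what is currently loaded
else
    echo "module command not found; run this on the cluster" >&2
fi
```

Pinning the version this way keeps results reproducible if the default version of the module changes later.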
Using spaln
To find out how to use spaln, run spaln without any arguments after loading the module. This prints a help text to get you started:
$ spaln

*** SPALN version 2.4.6 <210910> ***

Usage:
spaln -W[Genome.bkn] -KD [W_Options] Genome.mfa (to write block inf.)
spaln -W[Genome.bkp] -KP [W_Options] Genome.mfa (to write block inf.)
spaln -W[AAdb.bka] -KA [W_Options] AAdb.faa (to write aa db inf.)
spaln -W [Genome.mfa|AAdb.faa] (alternative to makdbs.)
spaln [R_options] genomic_segment cDNA.fa (to align)
spaln [R_options] genomic_segment protein.fa (to align)
spaln [R_options] -dGenome cDNA.fa (to map & align)
spaln [R_options] -dGenome protein.fa (to map & align)
spaln [R_options] -aAAdb genomic_segment.fa (to search aa database & align)
spaln [R_options] -aAAdb protein.fa (to search aa database)

in the following, # = integer or real number; $ = string; default in ()

W_Options:
    -E      Generate local lookup table for each block
    -XC#    number of bit patterns < 6 (1)
    -XG#    Maximum expected gene size (inferred from genome|db size)
    -Xk#    Word size (inferred from genome|db size)
    -Xb#    Block size (inferred from genome|db size)
    -Xa#    Abundance factor (10)
    -Xr#    Minimum ORF length with -KP (30))
    -g      gzipped output
    -t#     Mutli-thread operation with # threads

R_Options (representatives):
    -E      Use local lookup table for each block
    -H#     Minimum score for report (35)
    -L or -LS or -L#    semi-global or local alignment (-L)
    -M#[,#2]    Number of outputs per query (1) (4 if # is omitted)
            #2 (4) specifies the max number of candidate loci
            This option is effective only for map-and-align modes
    -O#[,#2,..] (GvsA|C) 0:Gff3_gene; 1:alignment; 2:Gff3_match; 3:Bed; 4:exon-inf; 5:intron-inf; 6:cDNA; 7:translated; 8:block-only; 10:SAM; 12:binary; 15:query+GS (4)
    -O#[,#2,..] (AvsA) 0:statistics; 1:alignment; 2:Sugar; 3:Psl; 4:XYL; 5:srat+XYL; 8:Cigar; 9:Vulgar; 10:SAM; (4)
    -Q#     0:DP; 1-3:HSP-Search; 4-7; Block-Search (3)
    -R$     Read block information file *.bkn, *.bkp or *.bka
    -S#     Orientation. 0:annotation; 1:forward; 2:reverse; 3:both (3)
    -T$     Subdirectory where species-specific parameters reside
    -a$     Specify AAdb. Must run `makeidx.pl -ia' breforehand
    -A$     Same as -a but db sequences are stored in memory
    -d$     Specify genome. Must run `makeidx.pl -i[n|p]' breforehand
    -D$     Same as -d but db sequences are stored in memory
    -g      gzipped output in combination with -O12
    -l#     Number of characters per line in alignment (60)
    -o$     File/directory/prefix where results are written (stdout)
    -pa#    Remove 3' poly A >= # (0: don't remove)
    -pw     Report results even if the score is below the threshold
    -pq     Quiet mode
    -r$     Report information about block data file
    -u#     Gap-extension penalty (3)
    -v#     Gap-open penalty (8)
    -w#     Band width for DP matrix scan (100)
    -t[#]   Mutli-thread operation with # threads
    -ya#    Stringency of splice site. 0->3:strong->weak
    -yl3    Ddouble affine gap penalty
    -ym#    Nucleotide match score (2)
    -yn#    Nucleotide mismatch score (-6)
    -yo#    Penalty for a premature termination codon (100)
    -yx#    Penalty for a frame shift error (100)
    -yy#    Weight for splice site signal (8)
    -yz#    Weight for coding potential (2)
    -yB#    Weight for branch point signal (0)
    -yI$    Intron length distribution
    -yL#    Minimum expected length of intron (30)
    -yS[#]  Use species-specific parameter set (0.0/0.5)
    -yX0    Don't use parameter set for cross-species comparison
    -yZ#    Weight for intron potential (0)
    -XG#    Reset maximum expected gene size, suffix k or M is effective

Examples:
    spaln -W -KP -E -t4 dictdisc_g.gf
    spaln -W -KA -Xk5 Swiss.faa
    spaln -O -LS 'chr1.fa 10001 40000' cdna.nfa
    spaln -Q0,1,7 -t10 -TTetrapod -XG2M -ommu/ -dmus_musc_g hspcdna.nfa
    spaln -Q7 -O5 -t10 -Tdictdics -ddictdisc_g [-E] 'dictdisc.faa (101 200)' > ddi.intron
    spaln -Q7 -O0 -t10 -Tdictdics -aSwiss 'chr1.nfa 200001 210000' > Chr1_200-210K.gff
    spaln -Q4 -O0 -t10 -M10 -aSwiss dictdisc.faa > dictdisc.alignment_score
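The usage lines and examples in the help text suggest a two-step workflow: first write the block index for a genome, then map and align queries against it. The following is a minimal sketch of that workflow, not a tested recipe. The file names `mygenome_g.mfa` and `mycdna.fa`, the variable names, and the `run` dry-run helper are hypothetical. It also assumes that the index files are written to and read from the current working directory, which may not match your setup. Only flags that appear in the help text are used.

```shell
#!/bin/bash
# Sketch only: file names are placeholders, flags are taken from the spaln help text.
set -eu

GENOME_FASTA="${GENOME_FASTA:-mygenome_g.mfa}"  # hypothetical genome multi-FASTA
QUERY_FASTA="${QUERY_FASTA:-mycdna.fa}"         # hypothetical cDNA queries
THREADS="${THREADS:-4}"                         # passed to -t
DRY_RUN="${DRY_RUN:-1}"                         # 1 = only print the commands

# Database name for -d: the FASTA file name without directory and extension
# (assumption, modelled on the dictdisc_g example in the help text).
DB_NAME="$(basename "${GENOME_FASTA%.*}")"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: write the block index for a nucleotide genome (-W with -KD).
index_genome() {
    run spaln -W -KD "-t${THREADS}" "$GENOME_FASTA"
}

# Step 2: map and align the cDNA queries, reporting genes as GFF3 (-O0).
map_cdna() {
    run spaln -Q7 -O0 "-t${THREADS}" "-d${DB_NAME}" "$QUERY_FASTA"
}

index_genome
map_cdna
```

With the default `DRY_RUN=1` the script only prints the two spaln commands, so they can be checked before any computation starts. Setting `DRY_RUN=0` executes them; in that case the alignment is written to stdout and can be redirected to a file, or sent elsewhere with the `-o` option.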
Documentation
The full documentation can be found at the project page.