BLAT

From HPC users

Revision as of 10:21, 17 January 2017

Introduction

Like BLAST, BLAT is an alignment tool, but it is structured differently. On DNA, BLAT works by keeping an index of an entire genome in memory. Thus, the target database of BLAT is not a set of GenBank sequences, but an index derived from the assembly of the entire genome. By default, the index consists of all non-overlapping 11-mers except for those heavily involved in repeats, and it uses less than a gigabyte of RAM. This small size means that BLAT is far more easily mirrored than BLAST. On DNA, BLAT is designed to quickly find sequences of 95% or greater similarity that are 40 bases or longer. It may miss more divergent or shorter sequence alignments. (The default settings and expected behavior of standalone BLAT differ slightly from those of the graphical version of BLAT.)
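
The indexing idea described above can be illustrated with a small sketch. This is not BLAT's actual implementation: the function names build_tile_index and find_seed_hits are hypothetical, the toy sequences are made up, and the sketch leaves out the exclusion of repeat-heavy k-mers and the later step that extends seed hits into alignments. It only shows how non-overlapping k-mers of the target can be stored once and then looked up with every overlapping k-mer of the query.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_tile_index(target: str, k: int = 11) -> Dict[str, List[int]]:
    """Map each non-overlapping k-mer (tile) of the target to its start positions."""
    index = defaultdict(list)
    # Step k bases at a time so that the stored tiles never overlap.
    for start in range(0, len(target) - k + 1, k):
        index[target[start:start + k]].append(start)
    return dict(index)


def find_seed_hits(query: str, index: Dict[str, List[int]], k: int = 11) -> List[Tuple[int, int]]:
    """Return (query_pos, target_pos) pairs where a query k-mer equals an indexed tile."""
    hits = []
    # The query is scanned with overlapping k-mers, because the target tiles are sparse.
    for qpos in range(len(query) - k + 1):
        for tpos in index.get(query[qpos:qpos + k], ()):
            hits.append((qpos, tpos))
    return hits


# Toy example with k=4 for readability (the text above states a default of 11 for DNA).
target = "ACGTTGCAGGCTAAGTCCGA"
query = "TTGCAGGCTAA"  # identical to target[3:14]
index = build_tile_index(target, k=4)
print(find_seed_hits(query, index, k=4))  # [(1, 4), (5, 8)]
```

Both hits have the same offset between target and query position (3), which is the kind of consistent signal an aligner can use to decide where to attempt a full alignment.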

On proteins, BLAT uses 4-mers rather than 11-mers and finds protein sequences of 80% or greater similarity to the query that are 20 or more amino acids long. The protein index requires slightly more than 2 gigabytes of RAM. In practice, because of sequence divergence over evolutionary time, DNA BLAT works well within humans and primates, while protein BLAT continues to find good matches within terrestrial vertebrates and even earlier organisms for conserved proteins. Within humans, protein BLAT gives a much better picture of gene families (paralogs) than DNA BLAT. However, BLAST and PSI-BLAST at NCBI can find much more remote matches.

From a practical standpoint, BLAT has several advantages over BLAST:

  • speed (no queues, response in seconds) at the price of lesser homology depth
  • the ability to submit a long list of simultaneous queries in fasta format
  • five convenient output sort options
  • a direct link into the UCSC browser
  • alignment block details in natural genomic order
  • an option to launch the alignment later as part of a custom track

Installed version

The currently installed version is 3.5.

Using BLAT with the HPC Cluster

To find out more about BLAT on the HPC Cluster, use the command

module spider blat

This shows basic information, e.g. a short description and the currently installed version.

To load the module, pass the full module name reported by module spider to module load, replacing <version> with the version string shown there, e.g.

module load BLAT/<version>

Always remember: module names given to module load are case sensitive!

Documentation

The full documentation can be found in the UCSC BLAT FAQ: http://genome.ucsc.edu/FAQ/FAQblat.html