PASA 2016


Introduction

PASA, acronym for Program to Assemble Spliced Alignments (and pronounced 'pass-uh'), is a eukaryotic genome annotation tool that exploits spliced alignments of expressed transcript sequences to automatically model gene structures, and to maintain gene structure annotation consistent with the most recently available experimental sequence data. PASA also identifies and classifies all splicing variations supported by the transcript alignments. ¹

Installed version(s)

The following version is installed and currently available in the environment hpc-env/6.4:

PASA/2.4.1-foss-2017b


Using PASA

To load the module, first load the environment and then PASA, e.g.

module load hpc-env/6.4
module load PASA

Always remember: module names are case sensitive!

Before using PASA, you need to prepare a configuration file. Which of the two files you need depends on whether or not you have a MySQL server available.

  • MySQL server

If you want to use PASA with a MySQL server, copy the configuration file conf.txt to a destination of your choice, ideally the working directory from which you will start the job script.

cp $EBROOTPASA/pasa_conf/conf.txt /your/working/directory

In this file, you have to change the relevant settings, such as MYSQLSERVER, MYSQL_RW_USER and MYSQL_RW_PASSWORD.
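As a quick sanity check after editing, the following sketch reports which of the three settings named above are still missing or empty. The function name check_mysql_conf is hypothetical, and the check assumes that your copy of conf.txt stores each setting as one KEY=value entry per line; adjust the pattern if your file looks different.

```shell
# Hypothetical helper: report MySQL settings that are missing or empty.
# Assumes one KEY=value entry per line in the given file.
check_mysql_conf() {
    local conf="$1"     # path to your copy of conf.txt
    local key
    local missing=0

    for key in MYSQLSERVER MYSQL_RW_USER MYSQL_RW_PASSWORD; do
        # Require "KEY=" followed by at least one non-space character.
        if ! grep -Eq "^${key}=[^[:space:]]" "$conf"; then
            echo "not set: $key" >&2
            missing=1
        fi
    done

    return "$missing"
}

# Example (commented out; run it in the directory holding your copy):
# check_mysql_conf conf.txt && echo "all MySQL settings present"
```

Since the file contains a password, it may also be worth restricting access to it, for instance with chmod 600 conf.txt.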


  • SQLite

If you need to work without an SQL server, you can use SQLite instead. In this case, copy the configuration file alignAssembly.config to a directory of your choice, preferably the working directory from which you will start the job script.

cp $EBROOTPASA/pasa_conf/alignAssembly.config /your/working/directory

In this file, you have to set the path of the SQLite file, since the default location under /tmp/ can cause problems. To do so, change the line containing the value for DATABASE. You can choose any location you like, but the path must be absolute. For example, this would be a valid entry if you want to store the SQLite file directly in your home directory:

# If the environment variable DSN_DRIVER=mysql then it is the name of a MySQL database
DATABASE=/user/abcd1234/sample_mydb_pasa.sqlite    # where abcd1234 is your user account

You cannot use environment variables or shorthand such as '~' in this entry:

DATABASE=~/sample_mydb_pasa.sqlite         won't work!
DATABASE=$HOME/sample_mydb_pasa.sqlite     won't work!
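If you would rather not type the absolute path by hand, you can let the shell expand $HOME before the value is written to the file, so that the config only ever contains a literal path. The following is a minimal sketch; the function name set_pasa_database is hypothetical, and it assumes that your copy of alignAssembly.config already contains a line starting with DATABASE= and that the path contains no '|' or '&' characters.

```shell
# Hypothetical helper: set the DATABASE entry to an absolute path.
set_pasa_database() {
    local config="$1"    # your copy of alignAssembly.config
    local db_path="$2"   # where the SQLite file should be stored
    local tmp_file
    local status=0

    # Refuse anything that is not an absolute path.
    case "$db_path" in
        /*) ;;
        *)  echo "error: '$db_path' is not an absolute path" >&2
            return 1 ;;
    esac

    # Rewrite the DATABASE line; '|' is the sed delimiter because
    # the path itself contains '/'.
    tmp_file=$(mktemp) || return 1
    sed "s|^DATABASE=.*|DATABASE=${db_path}|" "$config" > "$tmp_file" &&
        cat "$tmp_file" > "$config" || status=1
    rm -f "$tmp_file"
    return "$status"
}

# Example (commented out): the shell expands $HOME before the call,
# so the file receives the full path.
# set_pasa_database alignAssembly.config "$HOME/sample_mydb_pasa.sqlite"
```

Afterwards, grep '^DATABASE=' alignAssembly.config should show the expanded path.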


Now you can start the alignment, for example as shown in the PASA wiki:

     %  $PASAHOME/Launch_PASA_pipeline.pl \
           -c alignAssembly.config -C -R -g genome_sample.fasta \
           -t all_transcripts.fasta.clean -T -u all_transcripts.fasta \
           -f FL_accs.txt --ALIGNERS blat,gmap --CPU 2
 

Please note that you need to specify the full path to the config file if it is stored outside of your working directory.
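One way to obtain the full path is to resolve it in the job script before launching the pipeline. The sketch below uses a hypothetical helper abs_path that relies only on cd, pwd, dirname and basename; the example path in the comments is a placeholder.

```shell
# Hypothetical helper: print the absolute path of a file
# whose directory exists.
abs_path() {
    local dir
    # Change into the file's directory in a subshell and ask for its
    # full path; the caller's working directory is left untouched.
    dir=$(cd "$(dirname "$1")" && pwd) || return 1
    echo "$dir/$(basename "$1")"
}

# Example (commented out; the relative path is a placeholder):
# config=$(abs_path ../configs/alignAssembly.config)
# $PASAHOME/Launch_PASA_pipeline.pl -c "$config" -C -R -g genome_sample.fasta \
#     -t all_transcripts.fasta.clean -T -u all_transcripts.fasta \
#     -f FL_accs.txt --ALIGNERS blat,gmap --CPU 2
```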

Documentation

The full documentation can be found here.