REPET 2016

Introduction

"The REPET package (Flutre T. et al., 2011; Quesneville H. et al., 2005) integrates bioinformatics programs in order to tackle biological issues at the genomic scale."¹

Installed version

The currently installed version is available in the environment hpc-env/6.4:

REPET/2.5-foss-2017b-Python-2.7.14

Using REPET

If you want to find out more about REPET on the HPC Cluster, you can use the command

module spider REPET

This will show you basic information, e.g. a short description and the currently installed version.

To load the desired version of the module, use, for example:

module load hpc-env/6.4
module load REPET

Always remember: these commands are case sensitive!


Setting up REPET for a project

First of all, you need a project name. The project name must match the name of the input FASTA file.

mkdir test_project
cd test_project
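
Because the project name has to match the name of the input FASTA file, one way to avoid typos is to derive the name from the file itself. The following is only a minimal sketch: the helper name project_name_from_fasta and the file name test_project.fa are assumptions made for illustration and are not part of REPET.

```shell
# Hypothetical helper (not part of REPET): strip the directory and the
# last extension from a FASTA path to get a matching project name.
project_name_from_fasta() {
    base=$(basename "$1")
    printf '%s\n' "${base%.*}"
}

# Example call; the path and file name are assumptions.
project=$(project_name_from_fasta /path/to/test_project.fa)
echo "$project"   # prints: test_project
```

The resulting value can then be used for mkdir and later for the -P option and the project_name entry in the config files.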

Of course, you need to load REPET as described above to continue this walkthrough.

Let's assume you want to work with TEannot and TEdenovo. In this case, you need to copy and modify the corresponding config files.
To get an overview of all available config files, run ls $REPET_PATH/config
In our case, we need three files in the project folder:

cp $REPET_PATH/config/TEdenovo.cfg .
cp $REPET_PATH/config/TEannot.cfg .
cp $REPET_PATH/config/setEnv.sh .

These three files need some entries for your SQL server and your project.
For setEnv.sh, fill in the lines like this:

vim setEnv.sh
-------------
export REPET_HOST='db.repetdb.uni-oldenburg.de'
export REPET_USER='repetdb'
export REPET_PW='_20REPET19!_'
export REPET_DB='repetdb'
export REPET_PORT='3306'
# Your entries may differ, depending on your work group.
# Also, please note that single quotes should be preferred over double quotes, because otherwise special characters such as ! may be interpreted by the shell.

As you can see, there is one database for all users. Should you need a separate database for your work group (e.g. for privacy reasons), you can request one via our SelfService Desk.

In the same manner, you have to adapt the config files:

vim TEdenovo.cfg
------------
[repet_env]
repet_version: 2.5
repet_host: repetdb.uni-oldenburg.de
repet_user: repetdb
repet_pw: _20REPET19!_
repet_db: repetdb
repet_port: 3306
repet_job_manager: slurm

[project]
project_name: test_project
project_dir: ~/test_project/TEdenovo

Please note that the main difference between the .cfg files and setEnv.sh is the quoting: values in the .cfg files are written without quotes.
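
If you prefer not to edit the "key: value" lines by hand, they can also be set from the command line. The sketch below is an assumption about how one might do this, not a REPET tool: the helper set_cfg_value is hypothetical, it relies on GNU sed (for the -i option), and it only works for values that contain no |, & or backslash characters.

```shell
# Hypothetical helper (not part of REPET): replace the value of a
# "key: value" line in a .cfg file, leaving all other lines untouched.
# Assumes GNU sed and a value without '|', '&' or backslashes.
set_cfg_value() {
    cfg_file="$1"
    key="$2"
    value="$3"
    sed -i "s|^${key}:.*|${key}: ${value}|" "$cfg_file"
}

# Example usage (run inside the project folder):
# set_cfg_value TEdenovo.cfg project_name test_project
# set_cfg_value TEannot.cfg  project_name test_project
```

Checking the result afterwards, e.g. with grep project_name TEdenovo.cfg, is a cheap way to confirm that the entry was actually changed.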

To set the needed environment variables, setEnv.sh needs to be sourced:

source setEnv.sh
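
Before starting a long computation, it can be worth confirming that sourcing actually set all five variables listed in setEnv.sh above. The following check is a minimal sketch; the function name check_repet_env is hypothetical and not part of REPET.

```shell
# Hypothetical helper (not part of REPET): report every REPET_* variable
# from setEnv.sh that is unset or empty in the current shell.
check_repet_env() {
    missing=0
    for name in REPET_HOST REPET_USER REPET_PW REPET_DB REPET_PORT; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example usage after "source setEnv.sh":
# check_repet_env && echo "REPET environment looks complete"
```

The function returns a non-zero status if at least one variable is missing, so it can be used as a guard in job scripts.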


Now you can start computing. To learn what you can do with REPET, see here and here.

Here is a small example of how the syntax works:

TEdenovo.py -P test_project -C TEdenovo.cfg -S 1 -m Map >& step1.txt

This starts the first of eight calculation steps and creates a project subfolder (in this case test_project_db) as well as a log file step1.txt.
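
The >& step1.txt part of the command above sends both the normal output and the error messages of the step into the log file. The sketch below illustrates the same idea with a stand-in function instead of TEdenovo.py; the names demo_step and step_demo.txt are made up for this illustration, and the portable spelling > file 2>&1 is used, which behaves like >& file in bash.

```shell
# Stand-in for a REPET step (hypothetical): writes one line to standard
# output and one line to standard error.
demo_step() {
    echo "progress message"
    echo "warning message" >&2
}

# Send stdout to the log file, then point stderr at the same place.
demo_step > step_demo.txt 2>&1

# Both lines end up in the log file.
cat step_demo.txt
```

Keeping one log file per step makes it easier to see which step produced a warning or an error.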

Documentation

You can read here how to start working with REPET. A very detailed introduction is given here.