Difference between revisions of "REPET 2016"

Revision as of 11:03, 27 June 2019

Introduction

"The REPET package (Flutre T. et al., 2011; Quesneville H. et al., 2005) integrates bioinformatics programs in order to tackle biological issues at the genomic scale."¹

Installed version

The currently installed version is available on environment hpc-env/6.4:

REPET/2.5-foss-2017b-Python-2.7.14

Using REPET

If you want to find out more about REPET on the HPC Cluster, you can use the command

module spider REPET

This will show you basic information, e.g. a short description and the currently installed version.

To load the desired version of the module, use, for example:

module load hpc-env/6.4
module load REPET

Always remember: this command is case sensitive!


Setting up REPET for a project

First of all, you need a project name. The project name must match the name of the input FASTA file.

mkdir test_project
cd test_project
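
Because the project name and the FASTA file name have to agree, a quick check before starting can save a failed run. The following is a minimal sketch; the helper name check_fasta_name is hypothetical, and the .fa extension is an assumption that may need adjusting to your input file.

```shell
# Hypothetical helper: verify that <project>.fa exists in the project directory.
# Assumption: the input FASTA file uses the extension ".fa".
check_fasta_name() {
    project="$1"        # project name, e.g. test_project
    dir="${2:-.}"       # directory to look in, defaults to the current one
    if [ -f "$dir/$project.fa" ]; then
        return 0
    fi
    echo "expected input file $dir/$project.fa not found" >&2
    return 1
}
```

Inside the project folder it could be called as: check_fasta_name test_project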

Of course, you need to load REPET as described above to continue this walkthrough.

Let's assume you need to work with TEannot and TEdenovo. In this case, you need the corresponding config files. To get an overview of all available config files, type

ls $REPET_PATH/config

In our case, we need three files in the project folder:

cp $REPET_PATH/config/TEdenovo.cfg .
cp $REPET_PATH/config/TEannot.cfg .
cp $REPET_PATH/config/setEnv.sh .
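
The three copy commands can also be wrapped in a small function that stops early if the REPET module has not been loaded or a file is missing. This is only a sketch: the function name copy_repet_configs is hypothetical, and it assumes that REPET_PATH is set by the REPET module as used in the commands above.

```shell
# Hypothetical helper: copy the three config files into a project directory.
# Assumption: REPET_PATH is exported by "module load REPET".
copy_repet_configs() {
    dest="${1:-.}"      # target directory, defaults to the current one
    if [ -z "${REPET_PATH:-}" ]; then
        echo "REPET_PATH is not set; load the REPET module first" >&2
        return 1
    fi
    for f in TEdenovo.cfg TEannot.cfg setEnv.sh; do
        if [ ! -f "$REPET_PATH/config/$f" ]; then
            echo "missing: $REPET_PATH/config/$f" >&2
            return 1
        fi
        cp "$REPET_PATH/config/$f" "$dest/" || return 1
    done
}
```

Inside the project folder it could be called as: copy_repet_configs .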

These three files need some entries regarding your SQL server and your project. For setEnv.sh, you need to fill out every line like this:

vim setEnv.sh
-------------
export REPET_HOST="db.repetdb.uni-oldenburg.de"
export REPET_USER="repetdb"
export REPET_PW="_20REPET19!_"
export REPET_DB="repetdb"
export REPET_PORT="3306"
# Your entries could differ, depending on your work group.

As you can see, there is one database for all users. If you need a separate database for your work group (e.g. for privacy reasons), you can request one via our SelfService Desk.

In the same manner, you have to adapt the config files:

vim TEdenovo.cfg
------------
[repet_env]
repet_version: 2.5
repet_host: repetdb.uni-oldenburg.de
repet_user: repetdb
repet_pw: _20REPET19!_
repet_db: repetdb
repet_port: 3306
repet_job_manager: slurm

[project]
project_name: test_project
project_dir: ~/test_project/TEdenovo

As you can see, the main difference between the .cfg files and setEnv.sh is that the .cfg files use "key: value" lines without double quotes.
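
Since the same connection settings are entered twice, a typo in one place is easy to miss. The following sketch compares the [repet_env] entries of a .cfg file with the REPET_* environment variables. The function names cfg_value and compare_cfg_with_env are hypothetical, the script assumes the simple "key: value" layout shown above, and it only works once setEnv.sh has been sourced as described in the next step.

```shell
# Hypothetical helper: print the value of a "key: value" line in a .cfg file.
cfg_value() {
    # $1 = config file, $2 = key (e.g. repet_host)
    sed -n "s/^$2:[[:space:]]*//p" "$1" | head -n 1
}

# Hypothetical helper: report every entry that differs from its REPET_* variable.
# Assumption: setEnv.sh has already been sourced in this shell.
compare_cfg_with_env() {
    cfg="$1"
    status=0
    for pair in repet_host:REPET_HOST repet_user:REPET_USER repet_pw:REPET_PW \
                repet_db:REPET_DB repet_port:REPET_PORT; do
        key="${pair%%:*}"               # part before the colon, e.g. repet_host
        var="${pair##*:}"               # part after the colon, e.g. REPET_HOST
        eval "expected=\${$var:-}"      # value of the environment variable
        actual="$(cfg_value "$cfg" "$key")"
        if [ "$actual" != "$expected" ]; then
            echo "$key differs: cfg='$actual' env='$expected'" >&2
            status=1
        fi
    done
    return "$status"
}
```

It could be called as: compare_cfg_with_env TEdenovo.cfg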

To set the needed environment variables, setEnv.sh needs to be sourced:

source setEnv.sh
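
To confirm that sourcing worked, the five variables from setEnv.sh can be checked in the current shell. This is a minimal sketch; the function name check_repet_env is hypothetical, and the variable list is taken from the setEnv.sh example above.

```shell
# Hypothetical helper: fail if any of the REPET_* variables is unset or empty.
check_repet_env() {
    missing=0
    for name in REPET_HOST REPET_USER REPET_PW REPET_DB REPET_PORT; do
        eval "value=\${$name:-}"    # look up the variable by name
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

After sourcing setEnv.sh it could be called as: check_repet_env && echo "REPET environment looks complete"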


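Now you can start computing. What you can do with REPET is described on the REPET package page (https://biosphere.france-bioinformatique.fr/wikia2/index.php/The_REPET_package) and in the REPET practical course (https://biosphere.france-bioinformatique.fr/wikia2/index.php/REPET_practical_course_urgi).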

But to give you a small example on how the syntax works:

TEdenovo.py -P test_project -C TEdenovo.cfg -S 1 -m Map >& step1.txt

This starts the first of eight computation steps and creates a project subfolder (in this case sample_folder_db) as well as the log file step1.txt.
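
The ">&" redirection used in the example is a bash and csh shorthand that not every POSIX shell understands. If you run the step from a different shell or a script, a small wrapper with the portable form "> file 2>&1" may be helpful. This is a sketch only; the function name run_logged is hypothetical.

```shell
# Hypothetical helper: run a command and write stdout and stderr to a log file.
# The exit status of the command is passed through to the caller.
run_logged() {
    log="$1"            # log file, e.g. step1.txt
    shift               # the remaining arguments are the command to run
    "$@" > "$log" 2>&1
}
```

The first step from above could then be started as: run_logged step1.txt TEdenovo.py -P test_project -C TEdenovo.cfg -S 1 -m Map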

Documentation

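You can read in the REPET practical course (https://biosphere.france-bioinformatique.fr/wikia2/index.php/REPET_practical_course_urgi) how to start working with REPET. A very detailed introduction is given in the TEdenovo tutorial (https://urgi.versailles.inra.fr/Tools/REPET/TEdenovo-tuto).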