Difference between revisions of "REPET 2016"

Latest revision as of 11:06, 3 July 2019

Introduction

"The REPET package (Flutre T. et al., 2011; Quesneville H. et al., 2005) integrates bioinformatics programs in order to tackle biological issues at the genomic scale."¹

Installed version

The currently installed version is available on environment hpc-env/6.4:

REPET/2.5-foss-2017b-Python-2.7.14

Using REPET

If you want to find out more about REPET on the HPC Cluster, you can use the command

module spider REPET

This will show you basic information, e.g. a short description and the currently installed version.

To load the desired version of the module, use, e.g.

module load hpc-env/6.4
module load REPET

Always remember: these commands are case sensitive!


Setting up REPET for a project

First of all, you need a project name. The project name must match the name of the input FASTA file.

mkdir test_project
cd test_project

Of course, you need to load REPET as described above to continue this walkthrough.

Let's assume you want to work with TEannot and TEdenovo. In this case, you need to copy and modify the corresponding config files.
To get an overview of all available config files, run ls $REPET_PATH/config
In our case, we need three files in the project folder:

cp $REPET_PATH/config/TEdenovo.cfg .
cp $REPET_PATH/config/TEannot.cfg .
cp $REPET_PATH/config/setEnv.sh .
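
The copy step can also be wrapped in a small function that stops if a template is missing and leaves files that already exist in the project folder untouched. This is only a sketch: copy_repet_configs is a hypothetical helper name and not part of REPET, and it assumes that the REPET module sets $REPET_PATH as used above.

```shell
# Sketch: copy the three config templates into a project folder.
# copy_repet_configs is a hypothetical helper, not part of REPET.
copy_repet_configs() {
    src_dir=$1          # e.g. "$REPET_PATH/config"
    dest_dir=${2:-.}    # defaults to the current folder
    for file in TEdenovo.cfg TEannot.cfg setEnv.sh; do
        if [ ! -f "$src_dir/$file" ]; then
            echo "not found: $src_dir/$file" >&2
            return 1
        fi
        if [ -e "$dest_dir/$file" ]; then
            # Do not overwrite a config file that was already edited.
            echo "keeping existing $dest_dir/$file" >&2
            continue
        fi
        cp "$src_dir/$file" "$dest_dir/" || return 1
    done
}
```

Inside the project folder, it could be called as copy_repet_configs "$REPET_PATH/config" .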

These three files need entries for your SQL server and for your project.
For setEnv.sh, fill out the lines like this:

vim setEnv.sh
-------------
export REPET_HOST='db.repetdb.uni-oldenburg.de'
export REPET_USER='repetdb'
export REPET_PW='_20REPET19!_'
export REPET_DB='repetdb'
export REPET_PORT='3306'
# Your entries may differ depending on your work group.
# Also, please note that single quotes should be preferred over double quotes, because otherwise special characters like ! could be misinterpreted by the shell.
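
The effect of the quoting style can be seen in a short sketch. The values below are made up for illustration and are not real credentials. In addition, an interactive bash session may try to apply history expansion to a ! inside double quotes, which single quotes prevent.

```shell
# Made-up helper variable for the demonstration.
suffix='XYZ'

# Single quotes: every character is taken literally, including $ and !.
single='pw_$suffix!'

# Double quotes: $suffix is expanded before the value is stored.
double="pw_$suffix"

printf '%s\n' "$single"   # prints pw_$suffix!
printf '%s\n' "$double"   # prints pw_XYZ
```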

As you can see, there is one database for all users. If you need a separate database for your work group (e.g. for privacy reasons), you can request one via our SelfService Desk.

In the same manner, you have to adapt the config files:

vim TEdenovo.cfg
------------
[repet_env]
repet_version: 2.5
repet_host: repetdb.uni-oldenburg.de
repet_user: repetdb
repet_pw: _20REPET19!_
repet_db: repetdb
repet_port: 3306
repet_job_manager: slurm

[project]
project_name: test_project
project_dir: ~/test_project/TEdenovo

Please note that, unlike in setEnv.sh, the values in the .cfg files are written without quotes.
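
If you prefer not to edit the config files by hand, the entries can also be set from the command line. The following is a minimal sketch, assuming that each entry sits on its own line in the form "key: value" as shown above. set_cfg_value is a hypothetical helper name and not part of REPET.

```shell
# Sketch: replace the value of one "key: value" line in a config file.
# set_cfg_value is a hypothetical helper, not part of REPET.
set_cfg_value() {
    file=$1
    key=$2
    value=$3
    tmp_file=$(mktemp) || return 1
    # Rewrite only lines that start with "key:"; all other lines pass through.
    awk -v key="$key" -v value="$value" '
        index($0, key ":") == 1 { print key ": " value; next }
        { print }
    ' "$file" > "$tmp_file" || return 1
    # Write back through the original file so its permissions are kept.
    cat "$tmp_file" > "$file" && rm -f "$tmp_file"
}
```

For example: set_cfg_value TEdenovo.cfg project_name test_project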

To set the needed environment variables, setEnv.sh needs to be sourced:

source setEnv.sh
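
To confirm that sourcing worked, you can check that none of the five variables from setEnv.sh is empty. This is only a sketch: check_repet_env is a hypothetical helper name and not part of REPET.

```shell
# Sketch: report which REPET_* variables are unset or empty.
# check_repet_env is a hypothetical helper, not part of REPET.
check_repet_env() {
    missing=0
    for name in REPET_HOST REPET_USER REPET_PW REPET_DB REPET_PORT; do
        # Look up the value of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

After source setEnv.sh, calling check_repet_env returns a non-zero status and names each variable that is still missing.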


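Now you can start computing. What you can do with REPET is described on the REPET package page (https://biosphere.france-bioinformatique.fr/wikia2/index.php/The_REPET_package) and in the REPET practical course (https://biosphere.france-bioinformatique.fr/wikia2/index.php/REPET_practical_course_urgi).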

But to give you a small example on how the syntax works:

TEdenovo.py -P test_project -C TEdenovo.cfg -S 1 -m Map >& step1.txt

This starts the first of eight computation steps and creates a project subfolder (in this case sample_folder_db) as well as a log file step1.txt.

Documentation

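The REPET practical course (https://biosphere.france-bioinformatique.fr/wikia2/index.php/REPET_practical_course_urgi) explains how to start working with REPET. A very detailed introduction is given in the TEdenovo tutorial (https://urgi.versailles.inra.fr/Tools/REPET/TEdenovo-tuto).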