Difference between revisions of "BamTools"

From HPC users

Revision as of 15:45, 17 January 2017

Introduction

BamTools provides a small but powerful suite of command-line utility programs for manipulating and querying BAM files.

Available bamtools commands are:

command    description
convert    Converts between BAM and a number of other formats
count      Prints number of alignments in BAM file(s)
coverage   Prints coverage statistics from the input BAM file
filter     Filters BAM file(s) by user-specified criteria
header     Prints BAM header information
index      Generates index for BAM file
merge      Merges multiple BAM files into a single file
random     Selects random alignments from existing BAM file(s), intended more as a testing tool
resolve    Resolves paired-end reads (marking the IsProperPair flag as needed)
revert     Removes duplicate marks and restores original base qualities
sort       Sorts the BAM file according to some criteria
split      Splits a BAM file on a user-specified property, creating a new BAM output file for each value found
stats      Prints some basic statistics from input BAM file(s)
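
Several of these commands are often chained. The following is a minimal sketch, assuming the BamTools module is already loaded (see "Using BamTools"), that sorts a BAM file and then indexes the sorted result. The function name sort_and_index and the ".sorted.bam" naming are illustrative choices, not part of BamTools.

```shell
# Sort a BAM file and index the sorted result.
# Usage: sort_and_index input.bam   (writes input.sorted.bam plus its index)
# The function name and the output naming scheme are assumptions for illustration.
sort_and_index() {
    in_bam=$1
    sorted_bam="${in_bam%.bam}.sorted.bam"   # strip ".bam", append ".sorted.bam"

    # Stop if sorting fails so that no index is built for a missing file.
    bamtools sort -in "$in_bam" -out "$sorted_bam" || return 1
    bamtools index -in "$sorted_bam"
}
```

For example, sort_and_index mapt.NA12156.altex.bam would write mapt.NA12156.altex.sorted.bam. The options of each command can be listed with bamtools help <COMMAND>.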

Installed version

The currently installed version is 2.4.0.

Using BamTools

If you want to find out more about BamTools on the HPC Cluster, you can use the command

module spider bamtools

This will show you basic information, e.g. a short description and the currently installed version.

To load the desired version of the module, use the command

module load BamTools/2.4.0-intel-2016b

Always remember: this command is case sensitive!
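
Because a misspelled module name is easy to miss, it can help to check that the program is actually available after loading. This is a small sketch under the assumption that the module command is available in your shell; the function name load_bamtools is made up for this example.

```shell
# Load the BamTools module and confirm that the bamtools binary can be found.
# load_bamtools is a hypothetical helper name, not a cluster-provided command.
load_bamtools() {
    # The module name is case sensitive: "bamtools/..." would not match.
    module load BamTools/2.4.0-intel-2016b || return 1

    if ! command -v bamtools >/dev/null; then
        echo "bamtools not found after module load" >&2
        return 1
    fi

    # Print the version so it can be compared with the expected one.
    bamtools --version
}
```

Calling load_bamtools in an interactive shell should print the BamTools version if the module was loaded correctly.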

After loading the module, you can use the program with

bamtools <COMMAND> [ARGS]

Example:

If you want to display some stats of your BAM file (using "mapt.NA12156.altex.bam" from the test BAM files which can be found here), use the following command:

bamtools stats -in mapt.NA12156.altex.bam

This will give the following output:

**********************************************
Stats for BAM file(s): 
**********************************************

Total reads:       326652
Mapped reads:      326652	(100%)
Forward strand:    163389	(50.0193%)
Reverse strand:    163263	(49.9807%)
Failed QC:         0	(0%)
Duplicates:        0	(0%)
Paired-end reads:  285725	(87.4708%)
'Proper-pairs':    239076	(83.6735%)
Both pairs mapped: 250761	(87.7631%)
Read 1:            153257
Read 2:            132468
Singletons:        34964	(12.2369%)
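
If only a single number from this report is needed, for example in a script, the output can be filtered. The sketch below assumes the "label: value" layout shown above; the helper name stat_value is hypothetical.

```shell
# Print the count that follows a given label in `bamtools stats` output (read from stdin).
# stat_value is an illustrative helper, not part of BamTools.
stat_value() {
    awk -F':' -v label="$1" '
        $1 == label {
            # $2 looks like "      326652<TAB>(100%)"; splitting on whitespace
            # drops the leading blanks and separates the percentage.
            split($2, parts, " ")
            print parts[1]
        }'
}
```

It could be used as bamtools stats -in mapt.NA12156.altex.bam | stat_value "Mapped reads" to print only the number of mapped reads.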

Using BamTools with the HPC cluster

Since there are many people working on the HPC cluster, it's important that everyone has an equal chance to do so. Therefore, every job should be processed by SLURM.

For this reason, you have to create a jobscript for your tasks. This example jobscript for BamTools will display the stats for an example input file:

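#!/bin/bash

# Resources: one task on one node, 2 GB of memory, 2 hours (days-hours:minutes)
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=2G
#SBATCH --time=0-2:00
# Job name and log files (%j is replaced by the job ID)
#SBATCH --job-name=BAMTOOLS-TEST
#SBATCH --output=bamtools-test.%j.out
#SBATCH --error=bamtools-test.%j.err
# Send a mail when the job ends or fails
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=your.name@uol.de

module load BamTools/2.4.0-intel-2016b
bamtools stats -in mapt.NA12156.altex.bam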
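
Once the jobscript is saved, it has to be handed to SLURM. The following sketch assumes the script above was saved as bamtools-test.sh (a file name chosen for this example); the helper name submit_and_watch is likewise hypothetical.

```shell
# Submit a jobscript and show the state of the resulting job.
# submit_and_watch is an illustrative helper, not a SLURM command.
submit_and_watch() {
    # --parsable makes sbatch print only the job ID (optionally followed by ";cluster").
    jobid=$(sbatch --parsable "$1") || return 1
    jobid=${jobid%%;*}   # keep only the numeric job ID

    echo "Submitted job $jobid"
    squeue -j "$jobid"
}
```

After submit_and_watch bamtools-test.sh, the stats report should end up in bamtools-test.<jobid>.out once the job has finished, since the jobscript sets --output to that pattern.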

Documentation

Further information can be found on the project website.