BamTools

From HPC users
Revision as of 14:55, 28 May 2020 by Schwietzer (talk | contribs) (→‎Installed version)

Introduction

BamTools provides a small but powerful suite of command-line utilities for querying and manipulating BAM files.

Available bamtools commands are:

command    description
convert    Converts between BAM and a number of other formats
count      Prints number of alignments in BAM file(s)
coverage   Prints coverage statistics from the input BAM file
filter     Filters BAM file(s) by user-specified criteria
header     Prints BAM header information
index      Generates index for BAM file
merge      Merges multiple BAM files into a single file
random     Selects random alignments from existing BAM file(s); intended more as a testing tool
resolve    Resolves paired-end reads (marking the IsProperPair flag as needed)
revert     Removes duplicate marks and restores original base qualities
sort       Sorts the BAM file according to some criteria
split      Splits a BAM file on a user-specified property, creating a new BAM output file for each value found
stats      Prints some basic statistics from input BAM file(s)
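
As a rough illustration of how these commands combine, the following sketch sorts a BAM file, indexes the sorted copy and counts its alignments. The file name "sample.bam" and the helper function run are assumptions made for this example. The helper prints each command and only executes it if bamtools is on the PATH, that is, after the module has been loaded as described under "Using BamTools".

```shell
#!/bin/bash
# Sketch: sort a BAM file, index the sorted copy, then count its alignments.
# "sample.bam" is a hypothetical input file name.

# Print each command; execute it only if bamtools is available on PATH.
run() {
    echo "+ $*"
    if command -v bamtools >/dev/null 2>&1; then
        "$@"
    fi
}

input="sample.bam"
sorted="${input%.bam}.sorted.bam"   # sample.sorted.bam

# Each step runs only if the previous one succeeded.
run bamtools sort  -in "$input"  -out "$sorted" &&
run bamtools index -in "$sorted" &&
run bamtools count -in "$sorted"
```

Without a loaded BamTools module the script only lists the three commands it would run, which makes it easy to check the file names first.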

Installed version

The currently installed versions are:

On environment hpc-uniol-env
BamTools/2.4.0
On environment hpc-env/6.4
BamTools/2.4.1-intel-2018a
BamTools/2.5.1-intel-2018a
On environment hpc-env/8.3
BamTools/2.5.1-foss-2019b
BamTools/2.5.1-GCC-8.3.0

Using BamTools

If you want to find out more about BamTools on the HPC Cluster, you can use the command

module spider bamtools

This will show you basic information, e.g. a short description and the currently installed versions.

To load the desired version of the module, use the command

module load BamTools/2.4.0-intel-2016b

Always remember: this command is case-sensitive!
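
The versions listed under "Installed version" belong to different environments. The sketch below assumes that an environment such as hpc-env/8.3 is itself loaded with module load before the BamTools module. This is an assumption about the cluster setup, not something stated on this page, so please check it with module spider first.

```shell
#!/bin/bash
# Assumption: the environment is a module that has to be loaded first.
env_module="hpc-env/8.3"
bam_module="BamTools/2.5.1-foss-2019b"

if type module >/dev/null 2>&1; then
    module load "$env_module"
    module load "$bam_module"
    # Confirm that the bamtools executable is now on the PATH.
    command -v bamtools
else
    echo "module command not found; this only works on the cluster"
fi
```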

After loading the module, you can use the program with

bamtools <COMMAND> [ARGS]

Example:

To display some statistics for a BAM file (using "mapt.NA12156.altex.bam" from the test BAM files, which can be found here), use the following command:

bamtools stats -in mapt.NA12156.altex.bam

This will give the following output:

**********************************************
Stats for BAM file(s): 
**********************************************

Total reads:       326652
Mapped reads:      326652	(100%)
Forward strand:    163389	(50.0193%)
Reverse strand:    163263	(49.9807%)
Failed QC:         0	(0%)
Duplicates:        0	(0%)
Paired-end reads:  285725	(87.4708%)
'Proper-pairs':    239076	(83.6735%)
Both pairs mapped: 250761	(87.7631%)
Read 1:            153257
Read 2:            132468
Singletons:        34964	(12.2369%)
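
Since this summary is plain text, single values can be extracted with standard tools. The following is a minimal sketch using awk. The function name stat_value is made up for this example, and the sample lines are copied from the output above. In practice the text would come from the bamtools stats command itself.

```shell
#!/bin/bash
# Sample lines copied from the output above.
stats_output='Total reads:       326652
Mapped reads:      326652 (100%)
Duplicates:        0 (0%)'

# Read stats text on stdin and print the first number after the given label.
stat_value() {
    awk -F: -v label="$1" '$1 == label { split($2, f, " "); print f[1] }'
}

total=$(printf '%s\n' "$stats_output" | stat_value "Total reads")
mapped=$(printf '%s\n' "$stats_output" | stat_value "Mapped reads")
echo "mapped $mapped of $total reads"   # prints: mapped 326652 of 326652 reads
```

With the module loaded, the same function could be fed directly, for example: bamtools stats -in mapt.NA12156.altex.bam | stat_value "Total reads"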

Using BamTools with the HPC cluster

Since many people work with the HPC cluster, it's important that everyone has an equal chance to do so. Therefore, every job should be processed by SLURM.

For this reason, you have to create a jobscript for your tasks. This example jobscript for BamTools will display the stats for an example input file:

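#!/bin/bash

# Resources: one task, 2 GB of memory, 2 hours run time (format: days-hours:minutes)
#SBATCH --ntasks=1
#SBATCH --mem=2G
#SBATCH --time=0-2:00
#SBATCH --job-name=BAMTOOLS-TEST
# Standard output and error files; %j is replaced by the job ID
#SBATCH --output=bamtools-test.%j.out
#SBATCH --error=bamtools-test.%j.err

# Load BamTools and print statistics for the example file
module load BamTools/2.4.0-intel-2016b
bamtools stats -in mapt.NA12156.altex.bam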

This will write the output of the "stats" command to a file named bamtools-test.JOBID.out. Any errors are written to bamtools-test.JOBID.err.
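
Submitting the jobscript could look like the following sketch. It assumes the jobscript above was saved as "bamtools-test.job" (a hypothetical file name). The helper log_files is made up for this example and only rebuilds the log file names from the job ID.

```shell
#!/bin/bash
# Hypothetical file name; save the jobscript shown above under this name.
jobscript="bamtools-test.job"

# Map a SLURM job ID to the log files named by --output/--error (%j = job ID).
log_files() {
    printf 'bamtools-test.%s.out bamtools-test.%s.err\n' "$1" "$1"
}

if command -v sbatch >/dev/null 2>&1 && [ -f "$jobscript" ]; then
    # --parsable prints the job ID (plus ";cluster" on multi-cluster setups)
    jobid=$(sbatch --parsable "$jobscript")
    jobid="${jobid%%;*}"
    squeue -j "$jobid"          # show the job's state in the queue
    echo "Logs: $(log_files "$jobid")"
else
    echo "sbatch or $jobscript not found; nothing submitted"
fi
```

Once the job has finished, the statistics can be read from the .out file, and the .err file should be empty if everything went well.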

Documentation

Further information can be found on the project website.