Difference between revisions of "BamTools"



Latest revision as of 14:55, 28 May 2020

Introduction

BamTools provides a small but powerful suite of command-line utilities for manipulating and querying BAM files.

Available bamtools commands are:

command    description
convert    Converts between BAM and a number of other formats
count      Prints the number of alignments in BAM file(s)
coverage   Prints coverage statistics from the input BAM file
filter     Filters BAM file(s) by user-specified criteria
header     Prints BAM header information
index      Generates an index for a BAM file
merge      Merges multiple BAM files into a single file
random     Selects random alignments from existing BAM file(s); intended more as a testing tool
resolve    Resolves paired-end reads (marking the IsProperPair flag as needed)
revert     Removes duplicate marks and restores original base qualities
sort       Sorts the BAM file according to some criteria
split      Splits a BAM file on a user-specified property, creating a new BAM output file for each value found
stats      Prints some basic statistics from input BAM file(s)

Installed version

The currently installed versions are:

On environment hpc-uniol-env:
  BamTools/2.4.0
On environment hpc-env/6.4:
  BamTools/2.4.1-intel-2018a
  BamTools/2.5.1-intel-2018a
On environment hpc-env/8.3:
  BamTools/2.5.1-foss-2019b
  BamTools/2.5.1-GCC-8.3.0

Using BamTools

To find out more about BamTools on the HPC cluster, use the command:

module spider bamtools

This will show you basic information, e.g. a short description and the currently installed versions.

To load the desired version of the module, use the command:

module load BamTools/2.4.0-intel-2016b

Always remember: module names are case sensitive!

After loading the module, you can use the program with:

bamtools <COMMAND> [ARGS]
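
As a rough sketch of how the commands from the table above are called, the following lines count, sort and index the example file used later on this page. The run helper and the name of the sorted output file are only illustrative assumptions, and the -in and -out options should be checked against bamtools help <COMMAND> for the version you loaded. If bamtools is not on the PATH, the helper only prints the command instead of executing it.

```shell
#!/bin/bash

# run CMD...: execute the command if bamtools is available,
# otherwise only print what would have been executed (illustrative helper)
run() {
    if command -v bamtools >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

bam="mapt.NA12156.altex.bam"
# Hypothetical output name: same base name with ".sorted.bam" appended
sorted="${bam%.bam}.sorted.bam"

run bamtools count -in "$bam"                # number of alignments
run bamtools sort  -in "$bam" -out "$sorted" # write a sorted copy
run bamtools index -in "$sorted"             # index the sorted copy
```

The index is built on the sorted copy because indexing expects a coordinate-sorted BAM file.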

Example:

If you want to display some statistics for your BAM file (here "mapt.NA12156.altex.bam" from the test BAM files, which can be found at https://www.ncbi.nlm.nih.gov/tools/gbench/tutorial6/), use the following command:

bamtools stats -in mapt.NA12156.altex.bam

This will give the following output:

**********************************************
Stats for BAM file(s): 
**********************************************

Total reads:       326652
Mapped reads:      326652	(100%)
Forward strand:    163389	(50.0193%)
Reverse strand:    163263	(49.9807%)
Failed QC:         0	(0%)
Duplicates:        0	(0%)
Paired-end reads:  285725	(87.4708%)
'Proper-pairs':    239076	(83.6735%)
Both pairs mapped: 250761	(87.7631%)
Read 1:            153257
Read 2:            132468
Singletons:        34964	(12.2369%)
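
If single values from this report are needed in a script, they can be picked out with standard tools. The following is a minimal sketch, not part of BamTools itself: the get_stat function is a hypothetical helper, and the sample text is copied from the output above so that the snippet runs without a BAM file. In practice the variable would hold the captured output of the stats command.

```shell
#!/bin/bash

# Sample lines copied from the output above; in practice use
#   stats_output=$(bamtools stats -in mapt.NA12156.altex.bam)
stats_output="Total reads:       326652
Mapped reads:      326652 (100%)
Duplicates:        0 (0%)
Singletons:        34964 (12.2369%)"

# get_stat LABEL: print the count that follows "LABEL:" (hypothetical helper)
get_stat() {
    printf '%s\n' "$stats_output" |
        awk -F':' -v label="$1" '$1 == label { split($2, f, " "); print f[1] }'
}

total=$(get_stat "Total reads")
singletons=$(get_stat "Singletons")
echo "total=$total singletons=$singletons"
```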

Using BamTools with the HPC cluster

Since many people work with the HPC cluster, it is important that everyone has an equal chance to do so. Therefore, every job should be processed by SLURM.

For this reason, you have to create a jobscript for your tasks. This example jobscript for BamTools will display the stats for an example input file:

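#!/bin/bash

# Resources: one task, 2 GB of memory, 2 hours of wall time (format D-HH:MM)
#SBATCH --ntasks=1
#SBATCH --mem=2G
#SBATCH --time=0-2:00
# Job name and output/error files (%j is replaced by the job ID)
#SBATCH --job-name BAMTOOLS-TEST
#SBATCH --output=bamtools-test.%j.out
#SBATCH --error=bamtools-test.%j.err

# Load BamTools and print statistics for the example BAM file
module load BamTools/2.4.0-intel-2016b
bamtools stats -in mapt.NA12156.altex.bam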

This will write the output of the "stats" command to a file named bamtools-test.JOBID.out, where JOBID is the ID that SLURM assigns to the job. Any errors are written to bamtools-test.JOBID.err.
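
One possible way to put this into practice is sketched below: the jobscript is written to a file and handed to sbatch. The file name bamtools-test.job is an arbitrary assumption. The snippet only submits the job if sbatch is available, so it is harmless to run outside the cluster.

```shell
#!/bin/bash

# Hypothetical file name for the jobscript; any name works
jobscript="bamtools-test.job"

# Write the jobscript from this page to a file
cat > "$jobscript" <<'EOF'
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --mem=2G
#SBATCH --time=0-2:00
#SBATCH --job-name BAMTOOLS-TEST
#SBATCH --output=bamtools-test.%j.out
#SBATCH --error=bamtools-test.%j.err

module load BamTools/2.4.0-intel-2016b
bamtools stats -in mapt.NA12156.altex.bam
EOF

# Submit the job if SLURM is available, otherwise just report the file
if command -v sbatch >/dev/null 2>&1; then
    sbatch "$jobscript"
else
    echo "sbatch not found; submit $jobscript from a cluster login node"
fi
```

After submission, squeue -u $USER lists your pending and running jobs.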

Documentation

Further information can be found on the project website: https://github.com/pezmaster31/bamtools