BamTools
Latest revision as of 14:55, 28 May 2020
Introduction
BamTools provides a small but powerful suite of command-line utilities for manipulating and querying BAM files.
Available bamtools commands are:
command | description |
---|---|
convert | Converts between BAM and a number of other formats |
count | Prints number of alignments in BAM file(s) |
coverage | Prints coverage statistics from the input BAM file |
filter | Filters BAM file(s) by user-specified criteria |
header | Prints BAM header information |
index | Generates index for BAM file |
merge | Merges multiple BAM files into a single file |
random | Selects random alignments from existing BAM file(s); intended mainly as a testing tool |
resolve | Resolves paired-end reads (marking the IsProperPair flag as needed) |
revert | Removes duplicate marks and restores original base qualities |
sort | Sorts the BAM file according to some criteria |
split | Splits a BAM file on user-specified property, creating a new BAM output file for each value found |
stats | Prints some basic statistics from input BAM file(s) |
Installed versions
The currently installed versions are:
On environment hpc-uniol-env: BamTools/2.4.0
On environment hpc-env/6.4: BamTools/2.4.1-intel-2018a, BamTools/2.5.1-intel-2018a
On environment hpc-env/8.3: BamTools/2.5.1-foss-2019b, BamTools/2.5.1-GCC-8.3.0
Using BamTools
To find out more about BamTools on the HPC cluster, use the command:
module spider bamtools
This will show you basic information, e.g. a short description and the currently installed versions.
To load the desired version of the module, use the command:
module load BamTools/2.4.0-intel-2016b
Always remember: this command is case sensitive!
After loading the module, you can run the program with:
bamtools <COMMAND> [ARGS]
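Because a forgotten or mistyped module load is an easy mistake, it can help to check that the program is actually available before calling it. The following is only a minimal sketch; the function name require_tool is hypothetical and not part of BamTools or the module system.

```shell
#!/bin/bash
# Hypothetical guard (not part of BamTools): succeed only if the
# named program can be found on the PATH.
require_tool() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "error: '$1' not found; did you load the module?" >&2
        return 1
    fi
}
```

With this function defined, a call such as `require_tool bamtools && bamtools count -in mapt.NA12156.altex.bam` only runs BamTools if the module was loaded successfully.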
Example:
If you want to display some stats of your BAM file (using "mapt.NA12156.altex.bam" from the test BAM files, which can be found at https://www.ncbi.nlm.nih.gov/tools/gbench/tutorial6/), use the following command:
bamtools stats -in mapt.NA12156.altex.bam
This will give the following output:
**********************************************
Stats for BAM file(s):
**********************************************
Total reads: 326652
Mapped reads: 326652 (100%)
Forward strand: 163389 (50.0193%)
Reverse strand: 163263 (49.9807%)
Failed QC: 0 (0%)
Duplicates: 0 (0%)
Paired-end reads: 285725 (87.4708%)
'Proper-pairs': 239076 (83.6735%)
Both pairs mapped: 250761 (87.7631%)
Read 1: 153257
Read 2: 132468
Singletons: 34964 (12.2369%)
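If a single value from this report is needed in a script, it can be picked out of the text. The sketch below assumes the "label: value" layout shown above; the function name stats_field is hypothetical, and the label has to be passed exactly as it appears in the output.

```shell
#!/bin/bash
# Hypothetical helper: read `bamtools stats` output from standard input
# and print the first number that follows the given label.
stats_field() {
    # $1 is the label exactly as printed, e.g. "Mapped reads"
    awk -F':' -v label="$1" '
        {
            key = $1
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim surrounding whitespace
            if (key == label) {
                split($2, parts, " ")          # drop the percentage, if any
                print parts[1]
                exit
            }
        }'
}
```

For example, `bamtools stats -in mapt.NA12156.altex.bam | stats_field "Mapped reads"` would print only the mapped read count.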
Using BamTools with the HPC cluster
Since many people work with the HPC cluster, it's important that everyone has an equal chance to do so. Therefore, every job should be processed by SLURM.
For this reason, you have to create a jobscript for your tasks. This example jobscript for BamTools will display the stats for an example input file:
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --mem=2G
#SBATCH --time=0-2:00
#SBATCH --job-name=BAMTOOLS-TEST
#SBATCH --output=bamtools-test.%j.out
#SBATCH --error=bamtools-test.%j.err
module load BamTools/2.4.0-intel-2016b
bamtools stats -in mapt.NA12156.altex.bam
This will write the output of the "stats" command to a file named bamtools-test.JOBID.out, where JOBID is the ID that SLURM assigns to the job. Any errors are written to bamtools-test.JOBID.err.
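When the same stats job is needed for several BAM files, the jobscript can be generated instead of edited by hand. The following is only a sketch: the function name write_stats_job and the "stats-" file naming are hypothetical, and the resource values and module version are simply copied from the example jobscript on this page.

```shell
#!/bin/bash
# Hypothetical helper: write a SLURM jobscript that runs `bamtools stats`
# on one BAM file. Resource values mirror the example jobscript.
write_stats_job() {
    local bam="$1"                   # path to the input BAM file
    local script="$2"                # where the jobscript is written
    local name
    name=$(basename "$bam" .bam)     # file name without the .bam suffix
    cat > "$script" <<EOF
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --mem=2G
#SBATCH --time=0-2:00
#SBATCH --job-name=stats-$name
#SBATCH --output=stats-$name.%j.out
#SBATCH --error=stats-$name.%j.err
module load BamTools/2.4.0-intel-2016b
bamtools stats -in "$bam"
EOF
}
```

A call such as `write_stats_job mapt.NA12156.altex.bam stats-job.sh` writes the script, which can then be submitted with `sbatch stats-job.sh`.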
Documentation
Further information can be found on the project website: https://github.com/pezmaster31/bamtools