SAMtools

Introduction

Samtools is a suite of programs for interacting with high-throughput sequencing data. It consists of three separate repositories:

  • Samtools - Reading/writing/editing/indexing/viewing SAM/BAM/CRAM format
  • BCFtools - Reading/writing BCF2/VCF/gVCF files and calling/filtering/summarising SNP and short indel sequence variants
  • HTSlib - A C library for reading/writing high-throughput sequencing data

Samtools and BCFtools both use HTSlib internally, but their source packages bundle their own copies of HTSlib so that they can be built independently.

Installed Version

The currently installed versions are:

On environment hpc-uniol-env:

  • SAMtools/0.1.16
  • SAMtools/1.3.1
  • SAMtools/1.3.1-intel-2016b
  • SAMtools/1.7-foss-2016b
  • SAMtools/1.8-intel-2016b

On environment hpc-env/6.4:

  • SAMtools/1.7-foss-2017b
  • SAMtools/1.8-intel-2018a
  • SAMtools/1.9-foss-2017b

On environment hpc-env/8.3:

  • SAMtools/0.1.19-foss-2019b
  • SAMtools/1.9-foss-2019b

Using SAMtools on the HPC cluster

To use the newest SAMtools on the HPC cluster, first load the most recent environment and then the SAMtools module:

module load hpc-env/8.3    # at the time of writing (June 2021), this is the most recent environment
module load SAMtools       # without a version, this loads the default, which is usually the most recent installed version

Since multiple versions of SAMtools are installed, you can select a specific one by appending its version to the "module load" command, e.g.

module load SAMtools/0.1.19-foss-2019b

would load version 0.1.19 instead of the most recent one.
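
As a minimal sketch of how the module commands fit into a small script, the example below pins one of the versions listed on this page, confirms which samtools is active, and then sorts and indexes an alignment file. The file name example.sam is a hypothetical placeholder, and the guards around "module" and "samtools" are an assumption added so that the script exits cleanly on machines where they are not available. Note that "samtools --version" and the "-o" option of "samtools sort" apply to the 1.x releases, not to 0.1.x.

```shell
#!/bin/bash
# Sketch only: module names are taken from the version list on this page;
# INPUT is a hypothetical file name, replace it with your own data.
ENV_MODULE="hpc-env/8.3"
SAMTOOLS_MODULE="SAMtools/1.9-foss-2019b"
INPUT="example.sam"
SORTED="${INPUT%.*}.sorted.bam"   # example.sam -> example.sorted.bam

# The module command is only defined on the cluster; skip loading elsewhere.
if command -v module >/dev/null 2>&1; then
    module load "$ENV_MODULE"
    module avail SAMtools            # list the versions in this environment
    module load "$SAMTOOLS_MODULE"
fi

if command -v samtools >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    samtools --version | head -n 1               # confirm the active version
    samtools sort -O bam -o "$SORTED" "$INPUT"   # coordinate-sort, write BAM
    samtools index "$SORTED"                     # writes the index next to it
else
    echo "samtools or $INPUT not available; nothing to do" >&2
fi
```

Pinning the full module name in a script, rather than relying on the default, keeps results reproducible when a newer default version is installed later.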

Documentation

The full documentation can be found here.