BCFtools
Introduction
BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.
Most commands accept VCF, bgzipped VCF and BCF with filetype detected automatically even when streaming from a pipe. Indexed VCF and BCF will work in all situations. Un-indexed VCF and BCF and streams will work in most, but not all situations. In general, whenever multiple VCFs are read simultaneously, they must be indexed and therefore also compressed.
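The requirement that files read together be compressed and indexed can be sketched as follows. This is an illustration only: the helper `make_vcf`, the sample names S1 and S2, and all file names are made up for the example, and the bcftools steps run only if `bcftools` is on the PATH.

```shell
#!/bin/bash
# Hypothetical helper: write a one-record VCF for a single sample.
# usage: make_vcf SAMPLE GENOTYPE OUTFILE
make_vcf() {
  {
    printf '##fileformat=VCFv4.2\n'
    printf '##contig=<ID=chr1,length=1000>\n'
    printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t%s\n' "$1"
    printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t%s\n' "$2"
  } > "$3"
}

# Two made-up input files with non-overlapping sample sets.
make_vcf S1 0/1 s1.vcf
make_vcf S2 1/1 s2.vcf

if command -v bcftools >/dev/null 2>&1; then
  for f in s1 s2; do
    bcftools view -Oz -o "$f.vcf.gz" "$f.vcf"   # write BGZF-compressed VCF
    bcftools index "$f.vcf.gz"                  # build the index next to it
  done
  # Both inputs are now compressed and indexed, so they can be read together.
  bcftools merge -Ov -o merged.vcf s1.vcf.gz s2.vcf.gz
else
  echo "bcftools not found; load the module first" >&2
fi
```

Skipping the compress-and-index step is the usual reason a multi-file command such as merge refuses its input.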
BCFtools is designed to work on a stream. It regards an input file "-" as the standard input (stdin) and outputs to the standard output (stdout). Several commands can thus be combined with Unix pipes.
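A minimal sketch of such a pipeline is shown below. The file demo.vcf and its two records are invented for the example. The first command selects SNPs and writes uncompressed BCF (`-Ou`) to stdout; the second reads it from stdin via "-" and prints selected fields. The pipeline runs only if `bcftools` is on the PATH.

```shell
#!/bin/bash
# Write a tiny, made-up VCF for illustration (file name and records are hypothetical).
vcf=demo.vcf
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=chr1,length=1000>\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\n'
  printf 'chr1\t200\t.\tC\tCT\t30\tPASS\t.\n'
} > "$vcf"

if command -v bcftools >/dev/null 2>&1; then
  # Keep SNPs only, pass uncompressed BCF down the pipe,
  # and read it from stdin ("-") in the second command.
  bcftools view -v snps -Ou "$vcf" \
    | bcftools query -f '%CHROM\t%POS\t%REF\t%ALT\n' -
else
  echo "bcftools not found; load the module first" >&2
fi
```

Uncompressed BCF between piped commands avoids repeated compression and text parsing; a human-readable format is only needed at the end of the pipeline.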
Installed version
The currently installed version is 1.3.1.
List of available commands
For a full list of available commands, run bcftools without arguments. For a full list of available options, run bcftools COMMAND (e.g. "bcftools annotate") without arguments.
- annotate: edit VCF files, add or remove annotations
- call: SNP/indel calling (former "view")
- cnv: Copy Number Variation caller
- concat: concatenate VCF/BCF files from the same set of samples
- consensus: create consensus sequence by applying VCF variants
- convert: convert VCF/BCF to other formats and back
- csq: haplotype aware consequence caller
- filter: filter VCF/BCF files using fixed thresholds
- gtcheck: check sample concordance, detect sample swaps and contamination
- index: index VCF/BCF
- isec: intersections of VCF/BCF files
- merge: merge VCF/BCF files from non-overlapping sample sets
- mpileup: multi-way pileup producing genotype likelihoods
- norm: normalize indels
- plugin: run user-defined plugin
- polysomy: detect contaminations and whole-chromosome aberrations
- query: transform VCF/BCF into user-defined formats
- reheader: modify VCF/BCF header, change sample names
- roh: identify runs of homo/auto-zygosity
- stats: produce VCF/BCF stats (former vcfcheck)
- view: subset, filter and convert VCF and BCF files
Using BCFtools
If you want to find out more about BCFtools on the HPC cluster, you can use the command
module spider bcftools
This will show you basic information, e.g. a short description and the currently installed version.
To load the desired version of the module, use the command
module load BCFtools/1.3.1-intel-2016b
Always remember: the module name is case-sensitive!
After loading the module, you can use the program with
bcftools <command> <argument>
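To confirm that the module was loaded before running anything larger, a quick check along these lines may help. It is only a sketch: the variable name `status` is chosen for the example, and the check simply tests whether `bcftools` can be found on the PATH.

```shell
#!/bin/bash
if command -v bcftools >/dev/null 2>&1; then
  status="loaded"
  # Print the first line of the version banner.
  bcftools --version | head -n 1
else
  status="missing"
  echo "bcftools is not on the PATH; run 'module load' first" >&2
fi
echo "bcftools status: $status"
```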
Using BCFtools with the HPC cluster
Since there are many people working with the HPC cluster, it's important that everyone has an equal chance to do so. Therefore, every job should be processed by SLURM.
For this reason, you have to create a jobscript for your tasks. This example jobscript for BCFtools will display the stats for an example input file:
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --mem=2G
#SBATCH --time=0-2:00
#SBATCH --job-name=BCFTOOLS-TEST
#SBATCH --output=bcftools-test.%j.out
#SBATCH --error=bcftools-test.%j.err
module load BCFtools/1.3.1-intel-2016b
bcftools stats bcftools-testfile.vcf
This will write the statistics to a file named like bcftools-test.JOBID.out. Any errors will be written to bcftools-test.JOBID.err.
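Submitting the jobscript could look like the sketch below. It assumes the jobscript was saved as bcftools-job.sh, a file name invented for this example, and it submits only if `sbatch` is available and that file exists.

```shell
#!/bin/bash
# Assumption: the jobscript was saved under this (hypothetical) name.
script=bcftools-job.sh

if command -v sbatch >/dev/null 2>&1 && [ -f "$script" ]; then
  # --parsable makes sbatch print just the job ID,
  # optionally followed by ";clustername".
  jobid=$(sbatch --parsable "$script")
  jobid=${jobid%%;*}
  echo "Submitted job $jobid"
  squeue -j "$jobid"   # show the job while it is pending or running
  echo "Output will appear in bcftools-test.$jobid.out"
else
  jobid=""
  echo "sbatch or $script not available; nothing submitted" >&2
fi
```

The job ID reported at submission is the value that replaces %j in the output and error file names.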
Documentation
The full documentation can be found here.