Angsd 2016

From HPC users
Revision as of 13:39, 18 June 2021 by Schwietzer

Introduction

ANGSD is software for analyzing next-generation sequencing data. It can handle a number of different input types, from mapped reads to imputed genotype probabilities. Most methods take genotype uncertainty into account instead of basing the analysis on called genotypes, which is especially useful for low- and medium-depth data. The software is written in C++ and has been used on large sample sizes.

This program is not for manipulating BAM/CRAM files; it is solely a tool for performing various kinds of analysis. We recommend the excellent program SAMtools for outputting and modifying BAM files.

Installed version(s)

The following versions are installed and currently available...

... on environment hpc-env/8.3:

  • angsd/0.935-GCC-8.3.0

Loading / Using angsd

To load the desired version of the module, use the module load command, e.g.

module load hpc-env/8.3
module load angsd

Always remember: this command is case-sensitive!
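
As a minimal sketch, the two module load commands above can be wrapped in a shell function and followed by a check that the angsd binary is actually on the PATH. The function names load_angsd and check_tool are hypothetical helpers introduced for illustration; the module names are the ones listed under "Installed version(s)", with the version given in full.

```shell
#!/bin/bash

# Hypothetical helper: load the environment and the angsd module listed on
# this page, naming the version explicitly (names are case-sensitive).
load_angsd() {
    module load hpc-env/8.3
    module load angsd/0.935-GCC-8.3.0
}

# Hypothetical helper: report whether a given tool is available on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found; has the module been loaded?" >&2
        return 1
    fi
}

# Usage on the cluster (not executed here):
#   load_angsd && check_tool angsd
```

Naming the full version makes a script independent of whichever version the plain "module load angsd" happens to select.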


To find out how to use angsd, run angsd without any arguments. This prints a help text to get you started:

$ angsd
	-> angsd version: 0.935 (htslib: 1.12) build(Jun 18 2021 12:23:38)
	-> angsd 
	-> No '-out' argument given, output files will be called 'angsdput'

	-> angsd version: 0.935 (htslib: 1.12) build(Jun 18 2021 12:23:39)
	-> Please use the website "http://www.popgen.dk/angsd" as reference
	-> Use -nThreads or -P for number of threads allocated to the program
Overview of methods:
	-GL		Estimate genotype likelihoods
	-doCounts	Calculate various counts statistics
	-doAsso		Perform association study
	-doMaf		Estimate allele frequencies
	-doError	Estimate the type specific error rates
	-doAncError	Estimate the errorrate based on perfect fastas
	-HWE_pval	Est inbreedning per site or use as filter
	-doGeno		Call genotypes
	-doFasta	Generate a fasta for a BAM file
	-doAbbababa	Perform an ABBA-BABA test
	-sites		Analyse specific sites (can force major/minor)
	-doSaf		Estimate the SFS and/or neutrality tests genotype calling
	-doHetPlas	Estimate hetplasmy by calculating a pooled haploid frequency

	Below are options that can be usefull
	-bam		Options relating to bam reading
	-doMajorMinor	Infer the major/minor using different approaches
	-ref/-anc	Read reference or ancestral genome
	-doSNPstat	Calculate various SNPstat
	-cigstat	Printout CIGAR stat across readlength
	many others

Output files:
	 In general the specific analysis outputs specific files, but we support basic bcf output
	-doBcf		Wrapper around -dopost -domajorminor -dofreq -gl -dovcf docounts
For information of specific options type: 
	./angsd METHODNAME eg 
		./angsd -GL
		./angsd -doMaf
		./angsd -doAsso etc
		./angsd sites for information about indexing -sites files
Examples:
	Estimate MAF for bam files in 'list'
		'./angsd -bam list -GL 2 -doMaf 2 -out RES -doMajorMinor 1'
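
The MAF example at the end of the help text can be sketched as a small shell script. It assumes the angsd module is loaded, so the binary is called as angsd rather than ./angsd. The helper names make_bam_list and run_maf, the directory ./bams, the list file name bam.filelist, and the thread count of 4 are assumptions for illustration; the angsd options themselves are the ones shown in the help output.

```shell
#!/bin/bash

# Hypothetical helper: write the paths of all *.bam files found under a
# directory to a list file, one path per line, sorted.
make_bam_list() {
    local bam_dir="$1" list_file="$2"
    find "$bam_dir" -type f -name '*.bam' | sort > "$list_file"
}

# Hypothetical wrapper around the MAF example from the help text.
# -P sets the number of threads, as mentioned in the help output.
run_maf() {
    local list_file="$1" out_prefix="$2" threads="${3:-1}"
    angsd -bam "$list_file" -GL 2 -doMaf 2 -doMajorMinor 1 \
          -P "$threads" -out "$out_prefix"
}

# Usage on the cluster (not executed here; ./bams is an assumed location):
#   make_bam_list ./bams bam.filelist
#   run_maf bam.filelist RES 4
#
# Help for a single method is printed by passing only its name, e.g.:
#   angsd -doMaf
```

Passing an explicit -out prefix avoids the default output name 'angsdput' mentioned in the help text.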


Documentation

For more information, visit the project page (http://www.popgen.dk/angsd) or the GitHub page (https://github.com/ANGSD/angsd).