Bam to mate hist 2016

From HPC users
Revision as of 14:10, 13 September 2018 by Schwietzer

bam_to_mate_hist

Introduction

This script is intended as a simple QC method for Hi-C libraries, based on reads in a BAM file aligned to some genome/assembly.

The most informative Hi-C read pairs are those that represent long-distance contacts or contacts between different contigs of an assembly. This tool quantifies such contacts and plots the distribution of contact distances. The most successful Hi-C libraries contain many long-distance and between-contig contacts.
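To make the two quantities above concrete, the following is a minimal sketch of how long-distance and between-contig contacts could be counted from mate positions. It is not the implementation used by bam_to_mate_hist. The names MatePair and summarize_contacts are hypothetical, and the 10 kb threshold for "long-distance" is an arbitrary assumption chosen for illustration.

```python
from typing import Dict, Iterable, NamedTuple


class MatePair(NamedTuple):
    """Hypothetical record: mapped contig and position of both mates."""
    contig1: str
    pos1: int
    contig2: str
    pos2: int


def summarize_contacts(pairs: Iterable[MatePair],
                       long_distance: int = 10_000) -> Dict[str, float]:
    """Return the fraction of long-distance and between-contig pairs.

    long_distance is an assumed cutoff in bp, not a value taken from
    bam_to_mate_hist.
    """
    total = same_contig_long = inter_contig = 0
    for pair in pairs:
        total += 1
        if pair.contig1 != pair.contig2:
            # Mates map to different contigs.
            inter_contig += 1
        elif abs(pair.pos2 - pair.pos1) >= long_distance:
            # Mates map to the same contig, far apart.
            same_contig_long += 1
    if total == 0:
        return {"total": 0,
                "long_distance_fraction": 0.0,
                "inter_contig_fraction": 0.0}
    return {"total": total,
            "long_distance_fraction": same_contig_long / total,
            "inter_contig_fraction": inter_contig / total}
```

For example, calling summarize_contacts on a list of MatePair records returns a dictionary with the total number of pairs and the two fractions. Higher fractions suggest a more informative library.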

Hi-C contact frequency falls off approximately as a power law with increasing linear sequence distance. Consequently, Hi-C read pairs are expected to follow a characteristic distribution: a spike of many read pairs at distances close to zero, followed by a smooth decline (in log space) with increasing distance. Odd spikes or discontinuities, or a scarcity of long-distance contacts, may indicate a problem with either the library or the assembly.
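The "smooth decline in log space" can be checked by counting same-contig mate distances per order of magnitude. The sketch below illustrates this under stated assumptions. It is not the binning used by bam_to_mate_hist. The function name decade_histogram is hypothetical, and the distances are assumed to be non-negative integers in bp.

```python
from collections import Counter
from typing import Dict, Iterable


def decade_histogram(distances: Iterable[int]) -> Dict[int, int]:
    """Count mate distances per decade.

    Bin k holds distances d with 10**k <= d < 10**(k + 1); a distance
    of 0 is counted in bin 0. The decade is derived from the number of
    decimal digits, which avoids floating-point logarithms.
    """
    counts: Counter = Counter()
    for d in distances:
        if d < 0:
            raise ValueError("distances must be non-negative")
        decade = len(str(max(d, 1))) - 1
        counts[decade] += 1
    return dict(sorted(counts.items()))
```

In a healthy library, the counts would be expected to decrease steadily from one decade to the next. A bin that is empty or unexpectedly large between populated neighbours is the kind of discontinuity described above.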

Installed version

Since bam_to_mate_hist is essentially a single script, it has no version number. It is installed as a module within the environment hpc-env/6.4, together with all required dependencies.

Using bam_to_mate_hist on the HPC cluster

To use the script, switch to the corresponding environment and load the module:

module load hpc-env/6.4
module load bam_to_mate_hist

You can then run the script as follows:

bam_to_mate_hist <arguments>

Documentation

You can find more documentation regarding bam_to_mate_hist and how it can be used here.