Discovardenovo 2016
Introduction
DISCOVAR is a new variant caller and DISCOVAR de novo a new genome assembler, both designed for state-of-the-art data. Their inputs are chosen to optimize quality while keeping costs low. Currently they take as input Illumina reads of length 250 or longer, produced on MiSeq or HiSeq 2500, from a single PCR-free library. These data enable a level of completeness and continuity that was not previously possible.
DISCOVAR can call variants on a region-by-region basis, potentially tiling an entire large genome. DISCOVAR variant calling is under active development and is transitioning to VCF output.
DISCOVAR de novo can generate de novo assemblies for both large and small genomes. It currently does not call variants. 1
Installed version(s)
The following versions are installed and currently available on the environments hpc-env/8.3, hpc-env/6.4, and hpc-uniol-env:
- discovardenovo/52488
Loading / Using discovardenovo
To load the desired version of the module, use the module load command, e.g.
module load hpc-env/8.3
module load discovardenovo
Always remember: this command is case sensitive!
The discovardenovo module provides two executables: Discovar and DiscovarDeNovo.
To learn how to use Discovar, type Discovar --help to print a help text that gets you started:
Performing re-exec to adjust stack size.

Usage: Discovar arg1=value1 arg2=value2 ...

Required arguments:

READS (String)
  Comma-separated list of one or more bam files, each ending in .bam.
  Alternatively, this may have the form @fn, where fn is a file containing a
  list of bam file names, one per line.
REGIONS (String)
  Regions to be extracted from bam files: a comma-separated list of one or more
  region specifications chr:start-stop, where chr is a chromosome name
  (consistent with usage in the bam files), and start-stop defines a range of
  bases on chr (zero based). If REGIONS = all, bam files will be used in their
  entirety.
TMP (String)
  Directory to put temporary files in.
OUT_HEAD (String)
  Full path prefix for output files.

Optional arguments:

NUM_THREADS (unsigned int) default: 0
  Number of threads to use (use all available processors if set to 0).
REFERENCE (String)
  FASTA file containing reference - used for variant calling.
STATUS_LOGGING (Bool) default: False
  if set to True, generate cryptic logging that reports on the status of
  intermediate calculations
USE_OLD_LRP_METHOD (Bool) default: True
DRY_RUN (Bool) default: False
  Set to True for a dry run to check input parameters.
MAX_MEMORY_GB (longlong) default: 0
  Try not to use more than this amount of memory.

To see additional special arguments, type: Discovar --help special
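As an illustration of the arg=value syntax described in the help text, the following sketch assembles a Discovar command line for a dry run. The file names, the region, the directories, and the thread count are placeholders chosen for this example, not files or settings provided by the module, and the helper function discovar_cmd is hypothetical. The sketch only prints the command, so the printed line can be checked before it is run on the cluster.

```shell
# Placeholder inputs (assumptions for this example, adjust to your own data).
READS="sample.bam"                 # one or more comma-separated .bam files
REGIONS="1:1000000-1100000"        # chr:start-stop, or "all" for entire bam files
TMP_DIR="./discovar_tmp"           # directory for temporary files
OUT_HEAD="./discovar_out/sample"   # path prefix for output files
THREADS=8                          # 0 would mean: use all available processors

# Hypothetical helper: print the Discovar call built from the variables above.
# DRY_RUN=True asks Discovar to check the input parameters only.
discovar_cmd() {
    printf 'Discovar READS=%s REGIONS=%s TMP=%s OUT_HEAD=%s NUM_THREADS=%s DRY_RUN=True\n' \
        "$READS" "$REGIONS" "$TMP_DIR" "$OUT_HEAD" "$THREADS"
}

# Show the command; after loading the module, paste the printed line into the shell.
discovar_cmd
```

Once the dry run reports no problems, removing DRY_RUN=True from the printed line would start the actual run.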
To learn how to use DiscovarDeNovo, type DiscovarDeNovo --help to print a help text that gets you started:
Performing re-exec to adjust stack size.

Usage: DiscovarDeNovo arg1=value1 arg2=value2 ...

DISCOVAR de novo (experimental) is a de novo genome assembler that requires
only a single PCR-free paired end Illumina library containing 250 base reads.

Required arguments:

READS (String)
  Comma-separated list of input files, see manual for details
OUT_DIR (String)
  name of output directory

Optional arguments:

NUM_THREADS (unsigned int) default: 0
  Number of threads. By default, the number of processors online.
REFHEAD (String)
  use reference sequence REFHEAD.fasta to annotate assembly, and also
  REFHEAD.names if it exists
MAX_MEM_GB (double) default: 0
  if specified, maximum allowed RAM use in GB; in some cases may be exceeded by
  our code
MEMORY_CHECK (Bool) default: False
  if True, attempt to determine actual available memory and cap memory usage
  accordingly; slow and can cause machine to become very sluggish, or can
  result in process being killed

To see additional special arguments, type: DiscovarDeNovo --help special
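A DiscovarDeNovo call follows the same arg=value pattern. The sketch below assembles one from the arguments listed in the help text. The input file name, the output directory, the thread count, and the memory limit are placeholders assumed for this example, and the helper function denovo_cmd is hypothetical. Which input file formats READS accepts is described in the manual and is not assumed here. The sketch only prints the command.

```shell
# Placeholder inputs (assumptions for this example, adjust to your own data).
DENOVO_READS="reads.bam"           # comma-separated list of input files, see manual
DENOVO_OUT="./discovar_denovo_out" # name of the output directory
DENOVO_THREADS=16                  # match this to the cores you requested
DENOVO_MAX_MEM_GB=100              # soft limit; the help text notes it may be exceeded

# Hypothetical helper: print the DiscovarDeNovo call built from the variables above.
denovo_cmd() {
    printf 'DiscovarDeNovo READS=%s OUT_DIR=%s NUM_THREADS=%s MAX_MEM_GB=%s\n' \
        "$DENOVO_READS" "$DENOVO_OUT" "$DENOVO_THREADS" "$DENOVO_MAX_MEM_GB"
}

# Show the command; after loading the module, paste the printed line into the shell.
denovo_cmd
```

Setting NUM_THREADS and MAX_MEM_GB explicitly, rather than leaving both at their default of 0, is a design choice of this sketch: it keeps the run within the resources that were actually requested on a shared system.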
Documentation
The full documentation can be found here.