Juicer 2016

Revision as of 18:55, 23 November 2021 by Schwietzer

Introduction

Juicer is a platform for analyzing kilobase-resolution Hi-C data. This distribution includes the pipeline for generating Hi-C maps from raw fastq data files and command-line tools for feature annotation on the Hi-C maps. [1]

On our cluster CARL, loading the Juicer module automatically loads the Juicebox module as well. Juicebox includes the Java executables for juicebox, the visualization software for Hi-C data, and for juicer_tools.

Installed version(s)

The following versions are installed and currently available in the environment hpc-env/8.3:

  • Juicebox/2.13.07-GCC-8.3.0-CUDA-11.4.2
  • Juicer/ENCODE-2021_11-GCC-8.3.0-CUDA-11.4.2 (most recent branch on GitHub at the time of compiling)
  • Juicer/1.6-GCC-8.3.0-CUDA-11.4.2


Loading Juicer

To load the desired version of the module, use the module load command, e.g.

module load hpc-env/8.3
module load Juicer/ENCODE-2021_11-GCC-8.3.0-CUDA-11.4.2
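
After loading the module, it can be useful to confirm that the two executables provided by the Juicebox dependency are actually on your PATH. The following is a minimal sketch; the function name check_juicer_tools is hypothetical and not part of the module.

```shell
# Hypothetical helper: report whether juicebox and juicer_tools are on PATH.
# Returns 0 if both are found, 1 otherwise.
check_juicer_tools() {
    missing=0
    for tool in juicebox juicer_tools; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

Call check_juicer_tools right after the module load commands above. If a tool is reported as missing, module list shows which modules are currently loaded.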


Using juicer_tools & Juicebox

As stated above, Juicer automatically loads Juicebox as a dependent module. This dependency contains two Java files (juicebox and juicer_tools) as well as two shell executables that make the Java files easy to run.

To run the visualization tool juicebox, you only have to load the module and type juicebox. This assumes that you are logged in with X11 display support, so that the GUI can be forwarded from the cluster to your screen.
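
X11 forwarding is usually requested at login time with the -X (or -Y) option of ssh. The sketch below checks for a forwarded display before starting the GUI. The function name launch_juicebox is hypothetical, and the user and host in the comment are placeholders, not real names.

```shell
# Log in with X11 forwarding first (placeholders, adjust to your account):
#   ssh -X <user>@<login-node>

# Hypothetical wrapper: start juicebox only if an X11 display is available.
launch_juicebox() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "No X11 display found; reconnect with 'ssh -X' (or 'ssh -Y')." >&2
        return 1
    fi
    juicebox
}
```

Running launch_juicebox after loading the module either opens the GUI or tells you that the display is not being forwarded.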

juicer_tools is a command line tool which prints out the following help text to get you started:

$ juicer_tools
WARN [2021-11-23T18:23:46,540]  [Globals.java:138] [main]  Development mode is enabled
Juicer Tools Version 2.13.07
Usage:
	dump <observed/oe> <NONE/VC/VC_SQRT/KR> <hicFile(s)> <chr1>[:x1:x2] <chr2>[:y1:y2] <BP/FRAG> <binsize> [outfile]
	dump <norm/expected> <NONE/VC/VC_SQRT/KR> <hicFile(s)> <chr> <BP/FRAG> <binsize> [outfile]
	dump <loops/domains> <hicFile URL> [outfile]
	pre [options] <infile> <outfile> <genomeID>
	addNorm <input_HiC_file> [input_vector_file]
	pearsons [-p] <NONE/VC/VC_SQRT/KR> <hicFile(s)> <chr> <BP/FRAG> <binsize> [outfile]
	eigenvector -p <NONE/VC/VC_SQRT/KR> <hicFile(s)> <chr> <BP/FRAG> <binsize> [outfile]
	apa <hicFile(s)> <PeaksFile> <SaveFolder>
	arrowhead <hicFile(s)> <output_file>
	hiccups <hicFile> <outputDirectory>
	hiccupsdiff <firstHicFile> <secondHicFile> <firstLoopList> <secondLoopList> <outputDirectory>
	validate <hicFile>
	-h, --help print help
	-v, --verbose verbose mode
	-V, --version print version
Type juicer_tools <commandName> for more detailed usage instructions
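
As an illustration of the dump usage line above, the following sketch extracts the KR-normalized observed contacts of chromosome 1 at 10 kb resolution. The file names are hypothetical, and the chromosome label (1 versus chr1) depends on how your .hic file was built. The helper run_or_print is also hypothetical: it only prints the command if the tool is not on PATH, so the snippet can be tried outside the cluster.

```shell
# Hypothetical file names; replace them with your own data.
hic_file="sample.hic"
out_file="sample_chr1_10kb_KR.txt"

# Hypothetical helper: run the command if its executable is on PATH,
# otherwise just print what would be run.
run_or_print() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "Would run: $*"
    fi
}

# dump <observed/oe> <NONE/VC/VC_SQRT/KR> <hicFile(s)> <chr1> <chr2> <BP/FRAG> <binsize> [outfile]
run_or_print juicer_tools dump observed KR "$hic_file" 1 1 BP 10000 "$out_file"
```

The other subcommands follow the same pattern; as the help text says, typing juicer_tools <commandName> prints the detailed usage for each of them.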

Using Juicer

To start off, note that Juicer consists mostly of highly individualized shell scripts that were designed to run on a specific machine, which is neither our CARL nor our EDDY cluster. Additionally, Juicer is meant to be built inside a user directory. In practice, this means that the scripts might not work out of the box and may need partial adjustments before they are usable on our cluster. Although we tried to adjust some standard paths to the right target files and folders, some scripts might still point to non-existent paths. Keeping that in mind, here is how Juicer works on our cluster:


When you load our Juicer module for the first time, it creates a folder called juiceDir in your $HOME directory, containing three subdirectories: references, scripts and restriction_sites. In scripts you will find a collection of already slightly adjusted scripts for different tasks. Most of the scripts include a help function (usually callable with <script.sh> -h or <script.sh> --help).
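
Because some scripts may still reference paths that do not exist on our cluster, a quick scan before the first run can save time. The following is a rough sketch, not an official tool: the function name report_missing_paths is hypothetical, and it uses a simple heuristic (slash-led tokens outside comment lines), so it can miss paths built from variables and may flag harmless strings.

```shell
# Hypothetical helper: print every absolute path mentioned in a script
# that does not exist on this system.
# Heuristic: a path is a slash-led token at the start of a line or
# directly after whitespace, '=', or a quote. Comment lines are skipped.
report_missing_paths() {
    script=$1
    grep -v '^[[:space:]]*#' "$script" |
        grep -oE "(^|[[:space:]=\"'])/[A-Za-z0-9._/-]+" |
        sed -E "s/^[[:space:]=\"']//" |
        sort -u |
        while IFS= read -r candidate; do
            [ -e "$candidate" ] || printf '%s: missing %s\n' "$script" "$candidate"
        done
}

# Scan all scripts in the juiceDir created by the module.
for s in "$HOME"/juiceDir/scripts/*.sh; do
    if [ -f "$s" ]; then
        report_missing_paths "$s"
    fi
done
```

Any path reported as missing is a candidate for manual adjustment in your own copy of the script.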


Documentation

The full documentation can be found here.