Juicer on Biowulf & Helix

Juicer is a system for analyzing loop-resolution Hi-C experiments. It was developed in the Aiden lab at Baylor College of Medicine/Rice University. [Juicer website]

The environment variable $JUICER is set when you run 'module load juicer'. To run Juicer, you need:

  1. a genome. hg19 is the default, and the fasta file and bwa indexes are available in $JUICER/references/
  2. a restriction enzyme site file. DpnII is the default, and the file is available in $JUICER/restriction_sites/. If you wish to use a different restriction enzyme, generate the site file with a command like:
    module load juicer
    generate_site_positions.py DpnII hg19
    
  3. a chrom.sizes file. This can be downloaded from UCSC (e.g. hg19). The chrom.sizes file for hg19 and mm9 is in $JUICER/references/, but for other genomes you will need to download or build the file yourself. The last column of the restriction site file is the size of the chromosome, so you can also generate a chrom.sizes file via
    awk 'BEGIN{OFS="\t"} {print $1, $NF}' <restriction_site_file>
    
    Note that the chrom.sizes file MUST be tab-separated. A combined setup example is sketched below.
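
For reference, the preparation steps above can be strung together in one short shell session. This is only a sketch: it reuses the DpnII/hg19 example from item 2, and the output file names (hg19_DpnII.txt, hg19.chrom.sizes) are assumptions for illustration; for hg19 with DpnII the ready-made files under $JUICER already cover steps 2 and 3.

    module load juicer
    # generate the restriction site file (written to the current directory)
    generate_site_positions.py DpnII hg19
    # build a tab-separated chrom.sizes file from the last column of the site file
    awk 'BEGIN{OFS="\t"} {print $1, $NF}' hg19_DpnII.txt > hg19.chrom.sizes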

On Helix

Juicer submits jobs to the cluster, and is therefore not suitable for Helix.

On Biowulf

You will need to run juicer on the Biowulf login node. The juicer.sh script is very lightweight, and will simply set up and submit jobs to the cluster.

First create a directory for your Juicer run. A subdirectory called 'fastq' within it should contain the fastq files. For example:

/data/$USER/juicer
+-- fastq
|   +-- reads_R1.fastq
|   +-- reads_R2.fastq
The fastq file names must end in _R1.fastq and _R2.fastq (e.g. reads_R1.fastq and reads_R2.fastq), as this naming pattern is built into the script. If your fastq files have different names, you can rename them or create symlinks, as in the example below.
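
For example, existing fastq files with other names could be linked into place rather than copied (a sketch; the source paths and the name 'mysample' are made up):

    cd /data/$USER/juicer/fastq
    # link the existing files under the required _R1/_R2 naming scheme
    ln -s /data/$USER/rawdata/sample_1.fq mysample_R1.fastq
    ln -s /data/$USER/rawdata/sample_2.fq mysample_R2.fastq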

For your first run we recommend using the test data, which can be copied from /usr/local/apps/juicer/examples/. e.g.

biowulf% cd /data/$USER/juicer
biowulf% cp -r /usr/local/apps/juicer/examples/fastq .

Juicer will create subdirectories aligned, HIC_tmp, debug, splits. The HIC_tmp subdirectory will get deleted at the end of the run.

By default, running juicer.sh with no options will use the hg19 reference file, and the DpnII restriction site file. You need to explicitly specify the chrom.sizes file as in the example below.
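
The sample trace below uses these defaults. For a different genome and enzyme, the flags described in the help text at the bottom of this page can be combined along the following lines. This is only a sketch: the mm10 reference, site file, and chrom.sizes paths are assumptions, and the BWA index files must sit in the same directory as the fasta file.

    module load juicer
    juicer.sh -s MboI \
              -z /data/$USER/refs/mm10.fa \
              -y /data/$USER/refs/mm10_MboI.txt \
              -p /data/$USER/refs/mm10.chrom.sizes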

Sample screen trace:

[susanc@biowulf juicer]$ module load juicer

[susanc@biowulf juicer]$ juicer.sh -p $JUICER/references/hg19.chrom.sizes
Running Juicer version 1.5
(-: Looking for fastq files...fastq files exist
(-: Aligning files matching /data/susanc/juicer/fastq/*_R*.fastq*
 in queue norm to genome hg19 with site file /usr/local/apps/juicer/juicer-1.5/SLURM/restriction_sites/hg19_DpnII.txt
(-: Created /data/susanc/juicer/splits and /data/susanc/juicer/aligned.
 Splitting files
Submitted split job 23782370
Submitted split job 23782371
srun: job 23782372 queued and waiting for resources
srun: job 23782372 has been allocated resources
(-: Starting job to launch other jobs once splitting is complete
Submitting count_ligations job
23782878
Submitting BWA alignment job
Submitted align1 job 23782879
Submitted align2 job 23782880
Submitted merge job 23782881
Submitting count_ligations job
23782882
Submitting BWA alignment job
Submitted align1 job 23782883
Submitted align2 job 23782884
Submitted merge job 23782885
Submmitted dedup job 23782888
Submitted post-dedup job 23782889
Submitted stats0 job 23782890
Submitted stats30 job 23782891
Submitted abnormals job 23782892
(-: Finished adding all jobs... Now is a good time to get that cup of coffee..

After the juicer.sh script exits, you should see a set of jobs in running and queued state, with dependencies:

[susanc@biowulf juicer]$ sjobs
User    JobId JobName Part St  Reason      Runtime  Walltime     Nodes  CPUs     Memory    Dependency          
===============================================================================================================
susanc  2348  a147_  norm  R   ---             2:41  24:00:00      1    24  12GB/node    cn0237
susanc  2349  a147_  norm  R   ---             2:41  24:00:00      1    24  12GB/node    cn0310
susanc  2350  a147_  norm  PD  Dependency      0:00  24:00:00      1     8  20GB/node  afterok:2348,afterok:2349
susanc  2353  a147_  norm  R   ---             2:41  24:00:00      1    24  12GB/node     cn0256
susanc  2354  a147_  norm  PD  Dependency      0:00  24:00:00      1     8  20GB/node  afterok:2353
susanc  2355  a147_  norm  PD  Dependency      0:00  24:00:00      1     1  16GB/node  afterok:2350,afterok:2354
susanc  2356  a147_  norm  PD  JobHeldUser     0:00     10:00      1     1    2GB/cpu  afterok:2357
susanc  2357  a147_  norm  PD  Dependency      0:00  24:00:00      1     1   2GB/node  afterok:2355
susanc  2358  a147_  norm  PD  Dependency      0:00   1:40:00      1     1    2GB/cpu  afterok:2356
susanc  2359  a147_  norm  PD  Dependency      0:00  24:00:00      1     1   6GB/node  afterok:2358
susanc  2360  a147_  norm  PD  Dependency      0:00  24:00:00      1     1   6GB/node  afterok:2358
susanc  2361  a147_  norm  PD  Dependency      0:00  24:00:00      1     1   6GB/node  afterok:2358
susanc  2362  a147_  norm  PD  Dependency      0:00  24:00:00      1     1  32GB/node  afterok:2359,afterok:2360
susanc  2363  a147_  norm  PD  Dependency      0:00  24:00:00      1     1  32GB/node  afterok:2359,afterok:2360
susanc  2364  a147_  gpu   PD  Dependency      0:00  24:00:00      1     1   2GB/node  afterok:2362,afterok:2363
susanc  2365  a147_  norm  PD  Dependency      0:00  24:00:00      1     1   2GB/node  afterok:2362,afterok:2363
susanc  2374  a147_  norm  PD  Dependency      0:00  20:00:00      1     1   2GB/node  afterok:2364,afterok:2365
===============================================================================================================
You can follow the progress of the run by watching the jobs with 'sjobs', and by examining the log files in the debug subdirectory.
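
For example, to keep an eye on a run you might list and scan those log files (a sketch; the exact file names vary from run to run):

    cd /data/$USER/juicer
    ls -lrt debug/           # most recently updated log files appear last
    grep -li error debug/*   # log files that mention errors, if any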

Modified juicer script

We also provide a modified juicer script (juicer_nih.sh) that allows easier modification of the job submission parameters. Basic usage is to generate a template configuration file, adapt some settings (mostly the time limits) and then run the script with the configuration file.

Here is the same example as above but with the modified script.

biowulf$ module load juicer
biowulf$ # generate a config file template
biowulf$ juicer_nih.sh -t > juicer.conf
biowulf$ cat juicer.conf

# Cluster configuration settings; use this template to
# generate a config file that determines resources for
# various cluster jobs in the pipeline

# default queue for job steps not listed below
queue="norm"
SB_SPLIT="--time=04:00:00 --cpus-per-task=2 --mem=4g --partition=norm"
SB_COUNT_LIGATION="--time=04:00:00 --cpus-per-task=2 --mem=4g --partition=norm"
# do not include memory in align - set by the pipeline automatically
SB_ALIGN="--time=08:00:00 --cpus-per-task=24 --partition=norm"
# merge should be at least 20g
SB_MERGE="--time=08:00:00 --cpus-per-task=8 --mem=20g --partition=norm"
# fragmerge takes a lot of memory
SB_FRAGMERGE="--time=16:00:00 --cpus-per-task=8 --mem=247g --partition=norm"
SB_DEDUP="--time=04:00:00 --cpus-per-task=2 --mem=4g --partition=norm"
SB_POST_DEDUP="--time=01:40:00 --cpus-per-task=2 --mem=4g --partition=norm"
SB_STATS="--time=04:00:00 --cpus-per-task=2 --mem=20g --partition=norm"
SB_ABNORMALS="--time=04:00:00 --cpus-per-task=2 --mem=6g --partition=norm"
SB_HIC="--time=04:00:00 --cpus-per-task=2 --mem=50g --partition=norm"
# this has to run on a gpu partition!
SB_HICCUPS_WRAP="--time=04:00:00 --cpus-per-task=2 --mem=4g --partition=gpu --gres=gpu:k20x:1"
SB_ARROWHEAD_WRAP="--time=04:00:00 --cpus-per-task=2 --mem=4g --partition=norm"
SB_PREP_DONE="--time-20:00:00 --cpus-per-task=2 --mem=4g --partition=norm"

biowulf$ # edit the config file - change --time to 16h for SB_ALIGN,
biowulf$ # the alignment step
biowulf$ cat juicer.conf
[...snip...]
SB_ALIGN="--time=16:00:00 --cpus-per-task=24 --partition=norm"
[...snip...]
biowulf$ # Run the pipeline
biowulf$ juicer_nih.sh -c juicer.conf -p $JUICER/references/hg19.chrom.sizes
-- CLUSTER SETTINGS ----------------------------------------------
queue=norm
SB_SPLIT=--time=04:00:00 --cpus-per-task=2 --mem=4g --partition=norm
SB_COUNT_LIGATION=--time=04:00:00 --cpus-per-task=2 --mem=4g --partition=norm
SB_ALIGN=--time=16:00:00 --cpus-per-task=24 --partition=norm
SB_MERGE=--time=08:00:00 --cpus-per-task=8 --mem=20g --partition=norm
SB_FRAGMERGE=--time=16:00:00 --cpus-per-task=8 --mem=247g --partition=norm
SB_DEDUP=--time=04:00:00 --cpus-per-task=2 --mem=4g --partition=norm
SB_POST_DEDUP=--time=01:40:00 --cpus-per-task=2 --mem=4g --partition=norm
SB_STATS=--time=04:00:00 --cpus-per-task=2 --mem=20g --partition=norm
SB_ABNORMALS=--time=04:00:00 --cpus-per-task=2 --mem=6g --partition=norm
SB_HIC=--time=04:00:00 --cpus-per-task=2 --mem=50g --partition=norm
SB_HICCUPS_WRAP=--time=04:00:00 --cpus-per-task=2 --mem=4g --partition=gpu --gres=gpu:k20x:1
SB_ARROWHEAD_WRAP=--time=04:00:00 --cpus-per-task=2 --mem=4g --partition=norm
SB_PREP_DONE=--time=20:00:00 --cpus-per-task=2 --mem=4g --partition=norm
------------------------------------------------------------------
Running Juicer version 1.5
(-: Looking for fastq files...fastq files exist
(-: Aligning files matching /data/wresch/test_data/juicer/fastq/*_R*.fastq*
 to genome hg19 with site file /usr/local/apps/juicer/juicer-1.5/SLURM/restriction_sites/hg19_DpnII.txt
(-: Created /data/wresch/test_data/juicer/splits and /data/wresch/test_data/juicer/aligned.
(-: Starting job to launch other jobs once splitting is complete
[...snip...]
(-: Finished adding all jobs... Now is a good time to get that cup of coffee..

biowulf$
GPU Batch job for Hiccups etc.

To run a batch job for the Juicebox tools, e.g. hiccups, arrowhead, dump, pre, or apa, set up a batch script along the following lines:

#!/bin/bash

cd /data/$USER/myfiles
module load juicer

${JUICER}/scripts/juicebox  hiccups -m 500 -r 5000 -k KR -f 0.1 -p 4 -i 10 -t 0.01,1.5,1.75,2 ./input.hic output

# use ${JUICER}/scripts/juicebox48g if more memory is required

Submit this job to a GPU node with:

sbatch  -p gpu --mem=25g  --constraint=gpuk20x --gres=gpu:k20x:1   myscript

This command will submit the job to a single GPU, and allocate 25 GB of memory. You can check whether the GPU is being utilized with rsh nodename nvidia-smi. 'jobload' or 'sjobs' will give you the nodename of your allocated node.

[susanc@biowulf ~]$ rsh cn1511 nvidia-smi
Thu Apr 28 14:58:16 2016
+------------------------------------------------------+
| NVIDIA-SMI 346.46     Driver Version: 346.46         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K20Xm         Off  | 0000:08:00.0     Off |                    0 |
| N/A   27C    P8    31W / 235W |     13MiB /  5759MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K20Xm         Off  | 0000:27:00.0     Off |                    0 |
| N/A   43C    P0    86W / 235W |     99MiB /  5759MiB |     66%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    1      3634    C   java                                            83MiB |
+-----------------------------------------------------------------------------+
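
If hiccups runs out of Java heap space, the comment in the batch script above points to ${JUICER}/scripts/juicebox48g. A submission for that variant might look like the following sketch; 'myscript48g' is a hypothetical copy of the script that calls juicebox48g, and the 50 GB request is an assumption chosen to leave headroom above the larger Java heap.

    # sketch only: myscript48g is a copy of the batch script above using juicebox48g
    sbatch -p gpu --mem=50g --constraint=gpuk20x --gres=gpu:k20x:1 myscript48g
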
Swarm of jobs

Juicer is not really suitable for swarm jobs, as there is an interactive component to the juicer.sh script.

Interactive job

Juicer is not suitable for an interactive batch job, as the 'srun' command in the juicer.sh script will not work correctly within an interactive batch session.

However, the Juicebox tools, e.g. hiccups, arrowhead, can be run interactively.

[susanc@biowulf ~]$ sinteractive --mem=25g  --constraint=gpuk20x --gres=gpu:k20x:1
salloc.exe: Pending job allocation 17584320
salloc.exe: job 17584320 queued and waiting for resources
salloc.exe: job 17584320 has been allocated resources
salloc.exe: Granted job allocation 17584320
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn1511 are ready for job

[susanc@cn1511 ~]$ module load juicer

[susanc@cn1511 ~]$ ${JUICER}/scripts/juicebox
Juicebox Command Line Tools Usage:
       juicebox dump       
       juicebox pre    
       juicebox apa   
       juicebox arrowhead  
       juicebox hiccups   
Type juicebox  for further usage instructions

[susanc@cn1511 ~]$ ${JUICER}/scripts/juicebox  hiccups -m 500 -r 5000 -k KR -f 0.1 -p 4 -i 10 -t 0.01,1.5,1.75,2 ./input.hic output
Reading file: ./input.hic
HiC file version: 8
Running HiCCUPS for resolution 5000
2%
4%
[...]

[susanc@cn1511 ~]$ exit
srun: error: cn1511: task 0: Exited with exit code 1
salloc.exe: Relinquishing job allocation 17584320
salloc.exe: Job allocation 17584320 has been revoked.
[susanc@biowulf ~]

During the run, you can check the GPU usage by finding the node name (from 'jobload' or 'sjobs') and then running 'rsh nodename nvidia-smi', as shown in the GPU batch job section above. For a juicebox job, you should see a 'java' process running on the GPU.
Documentation

Juicer website

Typing 'juicer.sh -h' will print the help menu.

 juicer.sh -h
Running Juicer version 1.5
Usage: juicer.sh [-g genomeID] [-d topDir] [-q queue] [-l long queue] [-s site]
                 [-a about] [-R end] [-S stage] [-p chrom.sizes path]
                 [-y restriction site file] [-z reference genome file]
                 [-C chunk size] [-D Juicer scripts directory]
                 [-Q queue time limit] [-L long queue time limit] [-r] [-h] [-x]
* [genomeID] must be defined in the script, e.g. "hg19" or "mm10" (default
  "hg19"); alternatively, it can be defined using the -z command
* [topDir] is the top level directory (default
  "/data/susanc/juicer/debug")
     [topDir]/fastq must contain the fastq files
     [topDir]/splits will be created to contain the temporary split files
     [topDir]/aligned will be created for the final alignment
* [queue] is the queue for running alignments (default "norm")
* [long queue] is the queue for running longer jobs such as the hic file
  creation (default "norm")
* [site] must be defined in the script, e.g.  "HindIII" or "MboI"
  (default "DpnII")
* [about]: enter description of experiment, enclosed in single quotes
* -r: use the short read version of the aligner, bwa aln
  (default: long read, bwa mem)
* [end]: use the short read aligner on read end, must be one of 1 or 2
* [stage]: must be one of "merge", "dedup", "final", "postproc", or "early".
    -Use "merge" when alignment has finished but the merged_sort file has not
     yet been created.
    -Use "dedup" when the files have been merged into merged_sort but
     merged_nodups has not yet been created.
    -Use "final" when the reads have been deduped into merged_nodups but the
     final stats and hic files have not yet been created.
    -Use "postproc" when the hic files have been created and only
     postprocessing feature annotation remains to be completed.
    -Use "early" for an early exit, before the final creation of the stats and
     hic files
* [chrom.sizes path]: enter path for chrom.sizes file
* [restriction site file]: enter path for restriction site file (locations of
  restriction sites in genome; can be generated with the script
  misc/generate_site_positions.py)
* [reference genome file]: enter path for reference sequence file, BWA index
  files must be in same directory
* [chunk size]: number of lines in split files, must be multiple of 4
  (default 90000000, which equals 22.5 million reads)
* [Juicer scripts directory]: set the Juicer directory,
  which should have scripts/ references/ and restriction_sites/ underneath it
  (default /usr/local/apps/juicer/juicer-1.5/SLURM)
* [queue time limit]: time limit for queue, i.e. -W 12:00 is 12 hours
  (default 1200)
* [long queue time limit]: time limit for long queue, i.e. -W 168:00 is one week
  (default 3600)
* -x: exclude fragment-delimited maps from hic file creation
* -h: print this help and exit
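
As an example of the -S stage option, a run that failed partway could be resumed from the deduplication step with something along these lines (a sketch; it assumes the earlier stages completed successfully in the same top-level directory):

    cd /data/$USER/juicer
    module load juicer
    # resume at the dedup stage, reusing the files already in splits/ and aligned/
    juicer.sh -S dedup -p $JUICER/references/hg19.chrom.sizes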