High-Performance Computing at the NIH
LocusZoom on Biowulf & Helix

Description

LocusZoom is a tool for visualizing the results of genome-wide association studies at an individual locus, along with other relevant information such as gene models, linkage disequilibrium coefficients, and estimated local recombination rates.

LocusZoom uses association results in METAL- or EPACTS-formatted files, along with its own source of supporting data (see below), to generate graphs.
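
Before plotting, it can help to confirm that an input file has the layout LocusZoom expects. The following is a minimal sketch, assuming the default METAL-style layout: a tab-delimited file whose header row contains columns named MarkerName and P-value. Those names are an assumption here; check `locuszoom -h` for the options that change them. The function name check_metal_header is hypothetical and not part of LocusZoom.

```shell
#!/bin/bash
# check_metal_header FILE   (hypothetical helper, not part of LocusZoom)
# Returns 0 if the first line of FILE contains tab-separated columns named
# MarkerName and P-value (assumed defaults), and 1 otherwise.
check_metal_header() {
  head -n 1 "$1" | awk -F'\t' '
    {
      for (i = 1; i <= NF; i++) {
        if ($i == "MarkerName") m = 1   # marker column found
        if ($i == "P-value")    p = 1   # p-value column found
      }
    }
    END { exit !(m && p) }'
}
```

For example, `check_metal_header /usr/local/apps/locuszoom/TEST_DATA/examples/Kathiresan_2009_HDL.txt && echo "header looks OK"` would report whether the bundled example file passes this check.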

References

Web sites

Supporting data

LocusZoom contains a number of data files used in generating graph annotations:

All data for LocusZoom is stored in /fdb/locuszoom/[version]. LocusZoom has been configured to automatically find all required information.

On Helix

To use LocusZoom, set up your environment with

helix$ module load R
helix$ module load locuszoom
helix$ ml

Currently Loaded Modules:
  1) tmux/1.9a   4) graphviz/2.34          7) openmpi/1.8.1/gcc-4.4.7-eth  10) R/3.2.0_gcc-4.4.7
  2) GSL/1.16    5) JAGS/3.4.0_gcc-4.4.7   8) tcl_tk/8.6.1_gcc-4.4.7       11) locuszoom/1.3
  3) tex/2014    6) gcc/4.4.7              9) ATLAS/3.8.4
helix$ locuszoom -h
locuszoom -h
+---------------------------------------------+
| LocusZoom 1.3 (06/20/2014)                  |
| Plot regional association results           |
| from GWA scans or candidate gene studies    |
+---------------------------------------------+

usage: locuszoom [options]

  -h, --help
    show this help message and exit

  --metal <string>
    Metal file.
[...snip...]

Draw a diagram of the associations between SNPs and HDL observed in Kathiresan et al., 2009 around the FADS1 gene:

helix$ locuszoom \
    --metal /usr/local/apps/locuszoom/TEST_DATA/examples/Kathiresan_2009_HDL.txt \
    --refgene FADS1

[Figure: LocusZoom output]

Batch job on Biowulf

As usual, create a batch script for submission with sbatch:

#! /bin/bash
#SBATCH --mail-type=END
# this is locuszoom.sh
set -e
function fail {
  echo "$@" >&2
  exit 1
}

module load R || fail "Could not load R module"
module load locuszoom || fail "Could not load locuszoom module"


mf=/usr/local/apps/locuszoom/TEST_DATA/examples/Kathiresan_2009_HDL.txt
locuszoom --metal=$mf --refgene FADS1 --pop EUR --build hg19 \
  --source 1000G_March2012 \
  --gwas-cat whole-cat_significant-only

And submit to the queue with

biowulf$ sbatch locuszoom.sh

Swarm of jobs on Biowulf

Create a swarm command file with one command per line (line continuations are allowed):

locuszoom --metal=/usr/local/apps/locuszoom/TEST_DATA/examples/Kathiresan_2009_HDL.txt \
  --refgene FADS1
locuszoom --metal=/usr/local/apps/locuszoom/TEST_DATA/examples/Kathiresan_2009_HDL.txt \
  --refgene PLTP
locuszoom --metal=/usr/local/apps/locuszoom/TEST_DATA/examples/Kathiresan_2009_HDL.txt \
  --refgene ANGPTL4

And submit using swarm

biowulf$ swarm -f swarmfile
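
For more than a handful of genes, the swarm command file can be generated rather than typed by hand. The sketch below writes one locuszoom command per gene, using the same example input file and gene names as above. The output name swarmfile matches the submit command above; the loop itself is only an illustration and not part of LocusZoom or swarm.

```shell
#!/bin/bash
# Sketch: write one locuszoom command per gene into a swarm command file.
# The gene list is illustrative; replace it with your own genes of interest.
mf=/usr/local/apps/locuszoom/TEST_DATA/examples/Kathiresan_2009_HDL.txt
genes="FADS1 PLTP ANGPTL4"

: > swarmfile   # create the file, or empty it if it already exists
for gene in $genes; do
  # each command fits on a single line, so no continuation is needed
  echo "locuszoom --metal=$mf --refgene $gene" >> swarmfile
done
```

The resulting file contains one line per gene and can be submitted with `swarm -f swarmfile` as shown above.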

Interactive job on Biowulf

Interactive R sessions are not allowed on the Biowulf login node, so LocusZoom cannot be run there either. For interactive exploration, use Helix or an interactive session on a compute node:

biowulf$ sinteractive
salloc.exe: Granted job allocation 788658
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn0135 are ready for job
srun: error: x11: no local DISPLAY defined, skipping
cn0135$ locuszoom \
    --metal=/usr/local/apps/locuszoom/TEST_DATA/examples/Kathiresan_2009_HDL.txt \
    --refgene FADS1
[...snip...]
cn0135$ exit
salloc.exe: Relinquishing job allocation 788658
salloc.exe: Job allocation 788658 has been revoked.
biowulf$

Documentation