High-Performance Computing at the NIH
Raremetal on Helix & Biowulf
RAREMETAL is a computationally efficient tool for meta-analysis of rare variants using sequencing or genotyping array data. RAREMETAL takes summary statistics and LD matrices generated by RAREMETALWORKER or rvtests, handles related and unrelated individuals, and supports both single-variant and burden meta-analysis. RAREMETAL generates high-quality plots by default and has options that allow users to build reports at different levels of detail.

RAREMETALWORKER has the following features:

RAREMETAL and RAREMETALWORKER are developed by Shuang Feng, Dajiang Liu and Gonçalo Abecasis at the University of Michigan. Raremetal website

On Helix

Sample session:


[susanc@helix ~]$ module load raremetal/4.13.8-omp
[+] Loading Zlib 1.2.8 ...

[susanc@helix ~]$ raremetal

RAREMETAL 4.13.8 -- A Tool for Rare Variants Meta-Analyses for Quantitative Traits
          (c) 2012-2016 Shuang Feng, Dajiang Liu, Sai Chen, Goncalo Abecasis


Please go to "http://genome.sph.umich.edu/wiki/RAREMETAL" for the newest version.

Options:
       List of Studies : --summaryFiles [], --covFiles []
      Grouping Methods : --groupFile [], --annotatedVcf [], --annotation [],
                         --writeVcf
            QC Options : --hwe [1.0e-05], --callRate [0.95]
   Association Methods : --burden, --MB, --SKAT, --VT, --condition []
         Other Options : --labelHits, --geneMap [../data/refFlat_hg19.txt],
                         --correctGC, --prefix [], --maf [0.05], --longOutput,
                         --tabulateHits, --dosage, --hitsCutoff [1.0e-06],
                         --altMAF, --range [], --geneOnly
             PhoneHome : --noPhoneHome, --phoneHomeThinning [100]

Analysis started at: Fri Feb 26 14:08:17 2016
[...]

Batch job on Biowulf

Create a batch input file (e.g. raremetal.sh). For example:

#!/bin/bash
module load raremetal/4.13.8-omp

cd /data/$USER/mypedfiles
raremetalworker --ped yourInput.ped --dat yourInput.dat --vcf yourInput.vcf.gz --traitName BMI --prefix yourFavoritePrefix --cpu $SLURM_CPUS_PER_TASK

Submit this job using the Slurm sbatch command. Note that the variable $SLURM_CPUS_PER_TASK is used within the batch file to specify the number of threads that the program should spawn. This variable is set by Slurm when the job runs, and matches the value specified in --cpus-per-task=# in the sbatch command below.

sbatch --cpus-per-task=4 raremetal.sh
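
Slurm sets $SLURM_CPUS_PER_TASK only when --cpus-per-task is given, so the batch file above would pass an empty value to --cpu if that option were left off the sbatch command. The following is a minimal sketch of a guard, assuming that falling back to a single thread is acceptable; the function name thread_count is hypothetical and not part of raremetal or Slurm.

```shell
#!/bin/bash
# Hypothetical helper (not part of raremetal or Slurm): print the thread
# count to pass to --cpu. Falls back to 1 when SLURM_CPUS_PER_TASK is
# unset or empty, e.g. when the job was submitted without --cpus-per-task.
thread_count() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

echo "Using $(thread_count) thread(s)"
```

In the batch file, the last argument could then be written as --cpu "$(thread_count)" instead of --cpu $SLURM_CPUS_PER_TASK.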

Swarm of Jobs on Biowulf

Create a swarmfile (e.g. raremetal.swarm). For example:

# this file is called raremetal.swarm
raremetal --summaryFiles sum1 --prefix out1
raremetal --summaryFiles sum2 --prefix out2
raremetal --summaryFiles sum3 --prefix out3
[...]

Submit this job using the swarm command.

swarm -f raremetal.swarm
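
For many studies, the swarmfile can be generated rather than typed by hand. The following is a minimal sketch, assuming the summary files and output prefixes follow the sum1/out1, sum2/out2 numbering used in the example above; the function name make_swarm_lines is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print one raremetal command per study, numbered
# 1..N, using the sumN / outN naming from the example swarmfile.
make_swarm_lines() {
    swarm_n="$1"
    swarm_i=1
    while [ "$swarm_i" -le "$swarm_n" ]; do
        echo "raremetal --summaryFiles sum${swarm_i} --prefix out${swarm_i}"
        swarm_i=$((swarm_i + 1))
    done
}

# Print the three commands shown in the example swarmfile.
make_swarm_lines 3
```

Redirecting the output to a file (make_swarm_lines 3 > raremetal.swarm) produces a swarmfile that can be submitted with the swarm command.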

Interactive job on Biowulf
Allocate an interactive session and run raremetal. Sample session:
[susanc@biowulf ~]$ sinteractive
salloc.exe: Pending job allocation 15194042
salloc.exe: job 15194042 queued and waiting for resources
salloc.exe: job 15194042 has been allocated resources
salloc.exe: Granted job allocation 15194042
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn1719 are ready for job

[susanc@cn1719 ~]$ module load raremetal/4.13.8-omp
[+] Loading Zlib 1.2.8 ...

[susanc@cn1719 ~]$ raremetal --summaryFiles test.summary_files --prefix test
Documentation