High-Performance Computing at the NIH
ProbABEL on Biowulf & Helix

ProbABEL is a tool for genome-wide association analysis of imputed genetic data. It was designed to perform regression in a fast, memory-efficient manner, which makes the analysis feasible on a genome-wide scale. Currently, ProbABEL implements linear regression, logistic regression, and Cox proportional hazards models.

Example files can be copied from:

$ cp -r /usr/local/apps/probabel/examples /data/$USER/

Running on Helix
$ module load probabel
$ cd /data/$USER/examples
$ palinear \
-p height.txt \
-d test.mldose \
-i test.mlinfo \
-m test.map \
-c 19 \
-o height_base
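
Before a long run, it can help to confirm that the input files are readable from the current directory. The following is a minimal sketch; check_inputs is a hypothetical helper defined here for illustration and is not part of ProbABEL or the probabel module.

```shell
#!/bin/bash
# Sketch: report any input file that is missing or unreadable.
# check_inputs is a hypothetical helper, not a ProbABEL command.
check_inputs() {
    local missing=0
    local f
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "missing or unreadable: $f" >&2
            missing=1
        fi
    done
    # Return 0 only if every file was readable.
    return "$missing"
}
```

With the function defined in the shell, the example run could be guarded like this:

$ check_inputs height.txt test.mldose test.mlinfo test.map && palinear -p height.txt -d test.mldose -i test.mlinfo -m test.map -c 19 -o height_base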

Running a single batch job on Biowulf

1. Create a script file similar to the lines below.

#!/bin/bash

module load probabel
cd /data/$USER/examples
palinear \
-p height.txt \
-d test.mldose \
-i test.mlinfo \
-m test.map \
-c 19 \
-o height_base

2. Submit the script on Biowulf:

$ sbatch jobscript

If the job needs more memory than the default (4 GB), use the --mem flag:

$ sbatch --mem=10g jobscript
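
The memory request can also be recorded in the job script itself instead of on the command line. The sketch below assumes the standard Slurm "#SBATCH" directive syntax and reuses the 10g value from the example above; it writes the file with a here-document so the result can be reviewed before submission.

```shell
#!/bin/bash
# Sketch: write a job script that carries its own memory request.
# Assumes standard Slurm "#SBATCH" directives; 10g mirrors the example above.
# The quoted 'EOF' keeps $USER unexpanded until the job actually runs.
cat > jobscript <<'EOF'
#!/bin/bash
#SBATCH --mem=10g

module load probabel
cd /data/$USER/examples
palinear \
-p height.txt \
-d test.mldose \
-i test.mlinfo \
-m test.map \
-c 19 \
-o height_base
EOF
```

The resulting file is then submitted with a plain "sbatch jobscript".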

Running a swarm of jobs on Biowulf

Set up a swarm command file, with one command line per job:

  cd /data/$USER/dir1; palinear -p height.txt -d test.mldose -i test.mlinfo -m test.map -c 19 -o height_base
  cd /data/$USER/dir2; palinear -p height.txt -d test.mldose -i test.mlinfo -m test.map -c 19 -o height_base
  cd /data/$USER/dir3; palinear -p height.txt -d test.mldose -i test.mlinfo -m test.map -c 19 -o height_base
	[......]
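
When there are many directories, the swarm command file can be generated with a loop rather than typed by hand. This is a minimal sketch: DATA_ROOT is a hypothetical variable introduced here, and dir1 through dir3 are the placeholder directory names from the example above.

```shell
#!/bin/bash
# Sketch: write one palinear command line per directory into "swarmfile".
# DATA_ROOT is a hypothetical variable; it defaults to /data/<username>.
DATA_ROOT="${DATA_ROOT:-/data/${USER:-$(id -un)}}"

# Start from an empty file so reruns do not append duplicates.
: > swarmfile

for d in dir1 dir2 dir3; do
    echo "cd $DATA_ROOT/$d; palinear -p height.txt -d test.mldose -i test.mlinfo -m test.map -c 19 -o height_base" >> swarmfile
done
```

Each line of the resulting file is an independent command, which is the layout swarm expects.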
  

Submit the swarm file:

  $ swarm -f swarmfile --module probabel

-f: specify the swarm command file
--module: load the named module, and so set its environment variables, for each command line in the file

To allocate more memory, use the -g flag:

  $ swarm -f swarmfile -g 10 --module probabel

-g: memory in GB to allocate for each command line (10 GB in the example above)

For more information regarding running swarm, see swarm.html

Running an interactive job on Biowulf

It may be useful for debugging purposes to run jobs interactively. Such jobs should not be run on the Biowulf login node. Instead, allocate an interactive node as shown below and run the interactive job there.

biowulf$ sinteractive 
salloc.exe: Granted job allocation 16535

cn999$ module load probabel
cn999$ cd /data/$USER/dir
cn999$ palinear -p height.txt -d test.mldose -i test.mlinfo -m test.map -c 19 -o height_base

cn999$ exit
exit

biowulf$

Make sure to exit the job once finished.

If more memory is needed, use the --mem flag. For example:

biowulf$ sinteractive --mem=10g

Documentation

http://genabel.org/packages/ProbABEL