High-Performance Computing at the NIH
Pascal

Pascal (Pathway scoring algorithm) is an easy-to-use tool for gene scoring and pathway analysis from GWAS results. Pascal uses external data to estimate linkage disequilibrium, so the user only needs to supply genome-wide SNP p-values. Pascal then derives p-values for genes and predefined pathways. Pascal does not use Monte Carlo simulation to derive gene p-values, which increases both speed and accuracy. The speed of gene scoring is in turn leveraged to control the false positive rate in pathway scoring. For pathway scoring, the Pascal developers implemented and tested enrichment strategies that compared very favorably with hypergeometric enrichment. This comparison was done on a large collection of GWAS results, which supports recommending Pascal for downstream analysis of GWAS results. Pascal is mainly written in Java and has been tested on Unix systems and Mac OS X.

Load the Pascal module:

module load Pascal

Environment variables set:

Pascal requires a file of internal settings (settings.txt) and a directory of reference files (resources) to be present in the working directory. Default versions are provided as $PASCAL_HOME/settings.txt and $PASCAL_HOME/resources. These can be symlinked into the working directory prior to running (see examples below). Alternatively, users can copy the original files and maintain their own versions.
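
The symlink setup described above can be wrapped in a small helper so that it is safe to repeat. This is only a sketch: link_pascal_defaults is a hypothetical function name, not something Pascal or the module provides, and it assumes that any settings.txt or resources already present in the working directory is a user-maintained copy that should be left alone.

```bash
#!/bin/bash
# Hypothetical helper (not part of Pascal): link the default settings file and
# resources directory into a working directory, skipping anything that is
# already there so that user-maintained copies are never replaced.
link_pascal_defaults() {
    local pascal_home="$1"      # normally $PASCAL_HOME, set by 'module load Pascal'
    local workdir="${2:-.}"     # defaults to the current directory
    local name
    for name in settings.txt resources; do
        if [ -e "$workdir/$name" ] || [ -L "$workdir/$name" ]; then
            echo "keeping existing $workdir/$name"
        else
            ln -s "$pascal_home/$name" "$workdir/$name"
            echo "linked $workdir/$name -> $pascal_home/$name"
        fi
    done
}
```

After module load Pascal, it would be called as link_pascal_defaults "$PASCAL_HOME" from the directory in which Pascal will be run.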

On Helix

In this example, the default settings.txt and resources directory are symlinked into the working directory prior to running:

$ module load Pascal
$ ln -s $PASCAL_HOME/settings.txt .
$ ln -s $PASCAL_HOME/resources .
$ Pascal --pval=resources/gwas/EUR.CARDIoGRAM_2010_lipids.HDL_ONE.txt --chr=22

A directory named output will be created (if it does not already exist) to hold the results files:

$ ls output
EUR.CARDIoGRAM_2010_lipids.HDL_ONE.sum.genescores.chr22.txt   settingsOut.txt
EUR.CARDIoGRAM_2010_lipids.HDL_ONE.sum.numSnpError.chr22.txt
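
A quick check that a run produced its gene-score file might look like the sketch below. Both function names are hypothetical. The file name pattern is an assumption inferred from the single listing above; in particular, the ".sum." component may differ if the settings are changed.

```bash
#!/bin/bash
# Hypothetical helpers (not part of Pascal).

# Print the gene-score path expected for a p-value file and chromosome.
# ASSUMPTION: the pattern is inferred from the one example listing above;
# the ".sum." part may change with the settings in use.
expected_genescores() {
    local pval_file="$1" chr="$2" base
    base=$(basename "$pval_file" .txt)
    echo "output/${base}.sum.genescores.chr${chr}.txt"
}

# Succeed only if the expected gene-score file exists and is non-empty.
check_genescores() {
    local path
    path=$(expected_genescores "$1" "$2")
    if [ -s "$path" ]; then
        echo "found $path"
    else
        echo "missing or empty: $path" >&2
        return 1
    fi
}
```

For the run above, this would be called from the working directory as check_genescores resources/gwas/EUR.CARDIoGRAM_2010_lipids.HDL_ONE.txt 22.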

The default settings.txt file can be copied and edited to allow alternatives:

$ cp $PASCAL_HOME/settings.txt my_settings.txt
$ pico my_settings.txt

Then run Pascal, using the new settings file:

$ Pascal --set=my_settings.txt ...

Batch job on Biowulf

Create a batch input file (e.g. Pascal.sh). For example:

#!/bin/bash
module load Pascal
ln -s $PASCAL_HOME/settings.txt .
ln -s $PASCAL_HOME/resources .
Pascal --pval=resources/gwas/EUR.CARDIoGRAM_2010_lipids.HDL_ONE.txt --chr=22
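
When the same batch file is needed for several p-value files or chromosomes, it could be generated rather than edited by hand. The following is a minimal sketch; write_pascal_job is a hypothetical helper, and it only reproduces the script shown above with the p-value file and chromosome filled in.

```bash
#!/bin/bash
# Hypothetical helper (not part of Pascal): write a batch file equivalent to
# the Pascal.sh example above for a given p-value file and chromosome.
write_pascal_job() {
    local pval_file="$1" chr="$2" script="${3:-Pascal.sh}"
    # \$PASCAL_HOME is escaped so that it is expanded when the job runs,
    # after 'module load Pascal' has set it, not when the file is written.
    cat > "$script" <<EOF
#!/bin/bash
module load Pascal
ln -s \$PASCAL_HOME/settings.txt .
ln -s \$PASCAL_HOME/resources .
Pascal --pval=$pval_file --chr=$chr
EOF
}
```

For example, write_pascal_job resources/gwas/EUR.CARDIoGRAM_2010_lipids.HDL_ONE.txt 22 writes a Pascal.sh identical to the one above.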

Submit this job using the Slurm sbatch command. Pascal benefits from plenty of memory, so be sure to allocate at least 8g; the example below requests 32g:

$ sbatch --cpus-per-task=1 --mem=32g Pascal.sh

Swarm of Jobs on Biowulf

Pascal is not appropriate for use with swarm.

Interactive job on Biowulf

Once the interactive session has started, the steps are exactly the same as on Helix.
