High-Performance Computing at the NIH
phase on Biowulf & Helix

Description

PHASE reconstructs haplotypes from population genotype data using a Bayesian statistical model that accounts for the decay of linkage disequilibrium (LD) with distance due to recombination. Inputs can include biallelic SNPs as well as multi-allelic loci such as SNPs with more than two alleles, HLA alleles, or microsatellites.

There may be multiple versions of phase available. An easy way to select a version is to use modules. To see the available modules, type

module avail phase 

To select a module use

module load phase/[version]

where [version] is the version of choice.

Environment variables set

References

Documentation

On Helix

Load the PHASE module and analyze a simple test data set, running ten times the default number of iterations (-X10):

helix$ module load phase/2.1.1
helix$ PHASE -X10 /usr/local/apps/phase/TEST_DATA/test.inp test.out
Reading in data
Reading Positions of loci
Reading individual      3
Finished reading
Computing matrix Q, please wait
Done computing Q
3
5
MSSSM
0 #1
12 0 0 1 3
11 1 1 0 3
0 #2
12 0 1 1 3
12 1 0 0 2
0 #3
12 1 0 1 2
12 0 1 0 13
Resolving with method R
Making List of all possible haplotypes
Method = R
Performing Final Set of Iterations... nearly there!
Performing Burn-in iterations
  50% done
Estimating recom rates
Continuing Burn-in
Performing Main iterations
Writing output to files
Producing Summary, please wait

helix$ ls -1 test.out*
test.out
test.out_freqs
test.out_hbg
test.out_monitor
test.out_pairs
test.out_probs
test.out_recom
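
A run that stops early may leave some of these files unwritten. The following is a minimal sketch of a check for that, assuming the seven file names listed above; the function name check_phase_outputs is hypothetical and not part of PHASE, and other PHASE options may produce a different set of files.

```shell
# Hypothetical helper (not part of PHASE): check that every output file
# from the listing above exists for a given output prefix.
# Usage: check_phase_outputs PREFIX
# Returns 0 if all files exist, 1 if any are missing.
check_phase_outputs() {
    cpo_prefix="$1"
    cpo_status=0
    # The empty suffix stands for the main output file itself.
    for cpo_suffix in "" _freqs _hbg _monitor _pairs _probs _recom; do
        if [ ! -e "${cpo_prefix}${cpo_suffix}" ]; then
            echo "missing: ${cpo_prefix}${cpo_suffix}" >&2
            cpo_status=1
        fi
    done
    return "$cpo_status"
}
```

For the test run above, this would be called as check_phase_outputs test.out.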

The input format and options are described in the manual.

Batch job on Biowulf

Create a batch script similar to the following example:

#!/bin/bash

module load phase/2.1.1 || exit 1
PHASE -X10 input output

Submit to the queue with sbatch:

b2$ sbatch phase.batch

Swarm of jobs on Biowulf

Create a swarm command file similar to the following example:

PHASE -X10 input1 output1
PHASE -X10 input2 output2
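
With many input files, the command file can be generated instead of written by hand. The following is a minimal sketch, assuming that inputs end in .inp and that each output prefix should sit next to its input; the function name make_phase_swarm and the naming scheme are assumptions for illustration, not Biowulf conventions. Paths containing spaces are not handled.

```shell
# Hypothetical helper: write one PHASE command per *.inp file found in
# a directory to a swarm command file.
# Usage: make_phase_swarm INPUT_DIR SWARM_FILE
make_phase_swarm() {
    mps_dir="$1"
    mps_file="$2"
    : > "$mps_file"                      # start with an empty command file
    for mps_inp in "$mps_dir"/*.inp; do
        [ -e "$mps_inp" ] || continue    # glob matched nothing: skip
        # sample1.inp -> sample1.out (assumed naming scheme)
        printf 'PHASE -X10 %s %s\n' "$mps_inp" "${mps_inp%.inp}.out" >> "$mps_file"
    done
}
```

For example, make_phase_swarm . phase.swarm writes one line per .inp file in the current directory.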

Submit to the queue with swarm:

b2$ swarm -f phase.swarm --module phase/2.1.1

Interactive job on Biowulf

Allocate an interactive session with sinteractive and run PHASE as described above for Helix:

b2$ sinteractive 
node$ module load phase/2.1.1
node$ PHASE -X10 /usr/local/apps/phase/TEST_DATA/test.inp test.out
[...snip...]
node$ exit
b2$