Biowulf High Performance Computing at the NIH
combp on Biowulf

combp is a library to combine, analyze, group, and correct p-values in BED files. Its distinctive tools correct for spatial autocorrelation, which makes it useful for ChIP-Seq probes, tiling arrays, or any other data with spatial correlation.

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program.
Sample session (user input follows the $ prompt):

[user@biowulf ~]$ sinteractive --cpus-per-task=2 --mem=2G
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load combp
[user@cn3144 ~]$ comb-p
Tools for viewing and adjusting p-values in BED files.

   Contact: Brent Pedersen - bpederse@gmail.com
   License: BSD

To run, indicate one of:

   acf       - calculate autocorrelation within BED file
   slk       - Stouffer-Liptak-Kechris correction of correlated p-values
   fdr       - Benjamini-Hochberg correction of p-values
   peaks     - find peaks in a BED file.
   region_p  - generate SLK p-values for a region (of p-values)
   filter    - filter region_p output on size and p and add coef/t
   hist      - plot a histogram of a column and check for uniformity.
   manhattan - a manhattan plot of values in a BED file.
   pipeline  - run acf, slk, fdr, peaks, region_p in succesion

NOTE: most of these assume *sorted* BED files.
SEE: https://github.com/brentp/combined-pvalues for documentation

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
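
The NOTE in the help text above says that most comb-p subcommands assume sorted BED files. The following is a minimal sketch of sorting and checking a file with the standard sort utility. It assumes a tab-delimited BED file with the chromosome in column 1 and the start position in column 2; the file names and toy values are hypothetical and only there to make the example self-contained.

```shell
# Hypothetical toy input (chrom, start, end, p-value); use your own BED file instead.
printf 'chr1\t300\t301\t0.04\nchr1\t100\t101\t0.20\nchr2\t50\t51\t0.01\n' > unsorted.bed

# Sort by chromosome (column 1), then numerically by start position (column 2).
sort -k1,1 -k2,2n unsorted.bed > sorted.bed

# Check the order without rewriting the file; sort -c exits non-zero if it is unsorted.
sort -c -k1,1 -k2,2n sorted.bed && echo "sorted.bed is sorted"
```

Running the check first is cheap and avoids re-sorting a large file that is already in order.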

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. combp.sh). For example:

#!/bin/bash
#SBATCH --cpus-per-task=2
#SBATCH --mem=2G
#SBATCH --time=2:00:00
#SBATCH --partition=norm

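set -e                               # stop at the first failing command
module load combp
# Copy the example p-values shipped with the module into the working directory.
cp "$COMBP_TESTDATA"/pvals.bed .
# Autocorrelation for lag distances 1 to 500 in steps of 50; values are read from column 5.
comb-p acf -d 1:500:50 -c 5 pvals.bed > acf.txt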

Submit the job:

sbatch combp.sh

Swarm of Jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.

Create a swarmfile (e.g. job.swarm). For example:

cd dir1; comb-p peaks --seed 0.05 --dist 1000 pvals1.bed > pvals.peaks1.bed
cd dir2; comb-p peaks --seed 0.05 --dist 1000 pvals2.bed > pvals.peaks2.bed

Submit this job using the swarm command.

swarm -f job.swarm [-g #] --module combp

where
-g #       Number of gigabytes of memory required for each process (one line in the swarm command file)
--module   Loads the module for each subjob in the swarm
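
For more than a handful of inputs, the swarmfile can be generated with a loop instead of being written by hand. This is a sketch only: it assumes the same hypothetical layout as the example swarmfile (directories dir1, dir2, ... each holding a pvalsN.bed file) and reuses the comb-p peaks options shown there unchanged.

```shell
# Start from an empty swarmfile.
: > job.swarm

# One line per input; each line becomes an independent subjob of the swarm.
# The directory and file names are assumptions; adjust them to your own data.
for i in 1 2 3; do
    echo "cd dir${i}; comb-p peaks --seed 0.05 --dist 1000 pvals${i}.bed > pvals.peaks${i}.bed" >> job.swarm
done

# Inspect the result before submitting it with swarm.
cat job.swarm
```

The generated job.swarm is then submitted with the same swarm command as above.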