Peakachu on HPC

A supervised learning framework for chromatin loop detection in genome-wide contact maps.

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program. Sample session:

[user@biowulf ~]$ sinteractive -c 4 --mem=8g --gres=lscratch:20
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load peakachu
[user@cn3144 ~]$ cd /lscratch/$SLURM_JOB_ID
[user@cn3144 ~]$ cp -r $PEAKACHU_DATA .
[user@cn3144 ~]$ time peakachu train -r 10000 -p TEST_DATA/Rao2014-GM12878-MboI-allreps-filtered.10kb.cool --balance -O models -b TEST_DATA/gm12878.mumbach.h3k27ac-hichip.hg19.bedpe
collecting from chr1
collecting from chr2
collecting from chr3
collecting from chr4
...
[CV] END class_weight=None, criterion=gini, max_depth=25, max_features=sqrt, n_estimators=100, n_jobs=1; total time=   7.9s
{'class_weight': None, 'criterion': 'gini', 'max_depth': 25, 'max_features': 'sqrt', 'n_estimators': 100, 'n_jobs': 1}
0.8398682330848078

real    24m49.558s
user    33m29.951s
sys     0m50.692s
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
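
The sample session trains inside /lscratch/$SLURM_JOB_ID and then exits. Job-local scratch is normally cleared when the allocation ends, so the models directory would need to be copied elsewhere first. The following is a minimal sketch of that step; the helper name save_results and the destination under /data/$USER are assumptions for illustration and do not come from the session above.

```shell
# Hypothetical helper: copy a results directory out of job-local scratch.
# $1 = source directory (e.g. /lscratch/$SLURM_JOB_ID/models)
# $2 = persistent destination (assumed here to be somewhere under /data/$USER)
save_results() {
    src="$1"
    dest="$2"
    mkdir -p "$dest"      # create the destination if it does not exist yet
    cp -r "$src" "$dest"/ # results end up in $dest/<basename of $src>
}
```

It could be called just before exit, for example: save_results "/lscratch/$SLURM_JOB_ID/models" "/data/$USER/peakachu"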

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. batch.sh). For example:

#!/bin/bash
set -e
module load peakachu
peakachu train -r 10000 -p data.cool --balance -O models -b data.bedpe
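
The script above does not request any resources itself. One possible variant, sketched below, embeds the requests as #SBATCH directives so they travel with the script. The values (4 CPUs, 8 GB of memory) are copied from the sample interactive session and are an assumption, not a tuned recommendation. The snippet writes batch.sh and checks its syntax without running it.

```shell
# Write a batch script that carries its own resource requests.
# The #SBATCH lines must come before the first executable command.
cat > batch.sh <<'EOF'
#!/bin/bash
#SBATCH --cpus-per-task=4
#SBATCH --mem=8g
set -e
module load peakachu
peakachu train -r 10000 -p data.cool --balance -O models -b data.bedpe
EOF

# Parse the script without executing it, to catch shell syntax errors early.
bash -n batch.sh
```

The same requests could instead be passed on the sbatch command line with --cpus-per-task and --mem.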

Submit this job using the Slurm sbatch command.

sbatch batch.sh

Swarm of Jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.

Create a swarmfile (e.g. job.swarm). For example:

cd dir1; peakachu train -r 10000 -p data.cool --balance -O models -b data1.bedpe
cd dir2; peakachu train -r 10000 -p data.cool --balance -O models -b data2.bedpe
cd dir3; peakachu train -r 10000 -p data.cool --balance -O models -b data3.bedpe

Submit this job using the swarm command.

swarm -f job.swarm --module peakachu
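
When there are many inputs, the swarmfile can be generated instead of typed by hand. The loop below is a minimal sketch that reproduces the three example lines; it assumes the dir1..dir3 and data1..data3 naming used in the example swarmfile, which would need adapting to real directory and file names.

```shell
# Start with an empty swarmfile, then append one command line per input.
: > job.swarm
for i in 1 2 3; do
    echo "cd dir${i}; peakachu train -r 10000 -p data.cool --balance -O models -b data${i}.bedpe" >> job.swarm
done

# Show the generated file for a quick visual check before submitting.
cat job.swarm
```

The resulting job.swarm is then submitted with the swarm command shown above.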