High-Performance Computing at the NIH
CEAS on Biowulf & Helix

CEAS is a tool designed to characterize genome-wide protein-DNA interaction patterns from ChIP-chip and ChIP-Seq data for both sharp and broad binding factors. It provides statistics on ChIP enrichment at important genomic features such as specific chromosomes, promoters, gene bodies, or exons, and infers the genes most likely to be regulated by a binding factor. CEAS also lets biologists visualize the average ChIP enrichment signal over specific genomic features, revealing continuous and broad enrichment that might be too subtle to detect from ChIP peaks alone.

Example files can be copied from /fdb/CEAS into the working directory used in the examples below:

mkdir -p /data/$USER/ceas
cp /fdb/CEAS/H3K36me3* /data/$USER/ceas

Precompiled gene annotation tables for hg19, hg18, mm9, and mm8 are under /fdb/CEAS.

Running on Helix

$ module load ceas
$ cd /data/$USER/ceas
$ ceas --name=H3K36me3_ceas --pf-res=20 --gn-group-names='Top 10%,Bottom 10%' \
	-g /fdb/CEAS/hg18.refGene -b H3K36me3_MACS_pval1e-5_peaks.bed -w H3K36me3.wig
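
Before launching a run, it can help to confirm that the annotation table, peak file, and signal file named on the command line actually exist and are non-empty. The following is a minimal sketch; check_inputs is a hypothetical helper written for this page, not part of CEAS or the ceas module.

```shell
#!/bin/bash
# check_inputs (hypothetical helper): report any input file that is
# missing or empty, and return non-zero if at least one was found.
check_inputs() {
    local f missing=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then          # -s: file exists and has size > 0
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example, using the file names from the command above:
# check_inputs /fdb/CEAS/hg18.refGene H3K36me3_MACS_pval1e-5_peaks.bed H3K36me3.wig \
#     && echo "inputs OK"
```

The function only checks that the files are present; it does not validate their format.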

Running a single batch job on Biowulf

1. Create a script file containing lines similar to the following:

#!/bin/bash
module load ceas
cd /data/$USER/ceas
ceas --name=H3K36me3_ceas --pf-res=20 --gn-group-names='Top 10%,Bottom 10%' \
  -g /fdb/CEAS/hg18.refGene -b H3K36me3_MACS_pval1e-5_peaks.bed -w H3K36me3.wig

2. Submit the script on Biowulf:

$ sbatch jobscript
If the job needs more memory than the default (4 GB), request it with the --mem flag, for example 10 GB:
$ sbatch --mem=10g jobscript
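
As an alternative to passing --mem on the command line, Slurm also reads #SBATCH directives placed at the top of the script, so the memory request can travel with the job script itself. The sketch below writes such a script and checks its syntax locally; the 10g value is only an illustration and should be adjusted to the data set.

```shell
# Write a job script that carries its own memory request.
# The quoted 'EOF' keeps $USER and the backslash literal in the file.
cat > jobscript <<'EOF'
#!/bin/bash
#SBATCH --mem=10g
module load ceas
cd /data/$USER/ceas
ceas --name=H3K36me3_ceas --pf-res=20 --gn-group-names='Top 10%,Bottom 10%' \
  -g /fdb/CEAS/hg18.refGene -b H3K36me3_MACS_pval1e-5_peaks.bed -w H3K36me3.wig
EOF

# Parse the script without executing it, to catch quoting mistakes early.
bash -n jobscript
```

The script can then be submitted with a plain "sbatch jobscript". An option given on the sbatch command line takes precedence over the matching #SBATCH directive.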

Running a swarm of jobs on Biowulf

Set up a swarm command file:

  cd /data/$USER/dir1; ceas commands
  cd /data/$USER/dir2; ceas commands
  cd /data/$USER/dir3; ceas commands
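
When there are many sample directories, the command file can be generated rather than typed by hand. The following is a minimal sketch: make_swarmfile is a hypothetical helper, and it assumes every subdirectory holds its own peaks.bed and signal.wig (placeholder file names) and that all samples use the hg18 annotation table.

```shell
#!/bin/bash
# make_swarmfile (hypothetical helper): write one swarm command line
# per subdirectory of BASE into the file OUT.
make_swarmfile() {
    local base=$1 out=$2
    local dir
    : > "$out"                          # start with an empty command file
    for dir in "$base"/*/; do
        [ -d "$dir" ] || continue       # skip if the glob matched nothing
        # peaks.bed and signal.wig are assumed, per-directory file names
        printf 'cd %s; ceas --name=%s_ceas -g /fdb/CEAS/hg18.refGene -b peaks.bed -w signal.wig\n' \
            "${dir%/}" "$(basename "$dir")" >> "$out"
    done
}

# Example: make_swarmfile /data/$USER swarmfile
```

Each generated line follows the same "cd directory; ceas command" pattern as the hand-written file above.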

Submit the swarm file:

  $ swarm -f swarmfile --module ceas

-f: specify the swarmfile name
--module: load the required module for each command line in the file

To allocate more memory:

  $ swarm -f swarmfile -g 20 --module ceas

-g: gigabytes of memory to allocate for each command in the swarm file

For more information on running swarm, see swarm.html.

Running an interactive job on Biowulf

It may be useful for debugging purposes to run jobs interactively. Such jobs should not be run on the Biowulf login node. Instead, allocate an interactive node as described below and run the interactive job there.

biowulf$ sinteractive
salloc.exe: Granted job allocation 16535

cn999$ module load ceas
cn999$ cd /data/$USER/dir
cn999$ ceas commands

cn999$ exit


Make sure to exit the job once finished.

If more memory is needed, use --mem. For example:

biowulf$ sinteractive --mem=20g