High-Performance Computing at the NIH

PePr is a ChIP-Seq Peak-calling and Prioritization pipeline that uses a sliding window approach and models read counts across replicates and between groups with a negative binomial distribution. PePr empirically estimates the optimal shift/fragment size and sliding window width, and estimates dispersion from the local genomic area. Regions with less variability across replicates are ranked more favorably than regions with greater variability. Optional post-processing steps can filter out peaks that do not exhibit the expected shift size and/or narrow the width of peaks.


There are multiple versions of PePr available. An easy way of selecting the version is to use modules. To see the modules available, type

module avail PePr

To select a module, type

module load PePr/[ver]

where [ver] is the version of choice.

Environment variables set:

Interactive use

Sample session:

module load PePr
PePr -i input_rep1.bed,input_rep2.bed -c chip_rep1.bed,chip_rep2.bed -f bed -s 45 -w 180 -n my_test_run
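
As the sample session shows, PePr takes replicate files as comma-separated lists. With many replicates, such a list can be assembled in the shell rather than typed by hand. The following is a minimal sketch, assuming the hypothetical file names below; the variable name chip_list is only illustrative.

```shell
# Hypothetical replicate file names; substitute your own.
set -- chip_rep1.bed chip_rep2.bed

# Join the names with commas, then drop the trailing comma.
chip_list=$(printf '%s,' "$@")
chip_list=${chip_list%,}

echo "$chip_list"
```

The resulting value can then be passed to -c (or -i for input files) in place of a hand-written list.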

Batch job on Biowulf

Create a batch input file (e.g. PePr.sh). For example:

#!/bin/bash
module load PePr
PePr -c SRR446029_1.fastq_trim.gz.bam,SRR446030_1.fastq_trim.gz.bam \
  --chip2 SRR446031_1.fastq_trim.gz.bam,SRR446032_1.fastq_trim.gz.bam -f bam --diff -s 10

Submit this job using the Slurm sbatch command.

sbatch --cpus-per-task=1 PePr.sh

Swarm of Jobs on Biowulf

Create a swarmfile (e.g. PePr.swarm). Give each command a distinct -n name so that output files from different commands do not overwrite each other. For example:

PePr -c ex1_A.bam,ex1_B.bam --chip2 ex2_A.bam,ex2_B.bam -f bam --diff -n ex1_ex2
PePr -c ex1_A.bam,ex1_B.bam --chip2 ex3_A.bam,ex3_B.bam -f bam --diff -n ex1_ex3
PePr -c ex1_A.bam,ex1_B.bam --chip2 ex4_A.bam,ex4_B.bam -f bam --diff -n ex1_ex4
PePr -c ex1_A.bam,ex1_B.bam --chip2 ex5_A.bam,ex5_B.bam -f bam --diff -n ex1_ex5
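
When there are many comparisons, a swarmfile like the one above can be generated with a short loop instead of being written by hand. This is only a sketch, assuming the ex<N>_A.bam / ex<N>_B.bam naming from the example; adjust the group numbers and file names to match your data.

```shell
# Start with an empty swarmfile.
: > PePr.swarm

# Write one differential-binding command per comparison group,
# each with its own -n name so outputs stay distinct.
for i in 2 3 4 5; do
    echo "PePr -c ex1_A.bam,ex1_B.bam --chip2 ex${i}_A.bam,ex${i}_B.bam -f bam --diff -n ex1_ex${i}" >> PePr.swarm
done
```

Each line of the generated file is one independent command, which is the layout the swarmfile example uses.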

Submit this job using the swarm command.

swarm -f PePr.swarm