From the gopeaks documentation:
GoPeaks is a peak caller designed for CUT&TAG/CUT&RUN sequencing data. GoPeaks by default works best with narrow peaks such as H3K4me3 and transcription factors. However, broad epigenetic marks like H3K27Ac/H3K4me1 require different the step, slide, and minwidth parameters.
Allocate an interactive session and run the program.
Sample session (user input in bold) for running the peakfinder with a control and generating a summary plot with deeptools:
[user@biowulf]$ sinteractive --cpus-per-task=6 --gres=lscratch:20 --mem=30g salloc.exe: Pending job allocation 46116226 salloc.exe: job 46116226 queued and waiting for resources salloc.exe: job 46116226 has been allocated resources salloc.exe: Granted job allocation 46116226 salloc.exe: Waiting for resource configuration salloc.exe: Nodes cn3144 are ready for job [user@cn3144 ~]$ module load gopeaks [user@cn3144 ~]$ cd /lscratch/$SLURM_JOB_ID [user@cn3144 ~]$ cp $GOPEAKS_TEST_DATA/* . [user@cn3144 ~]$ ls -lh [user@cn3144 ~]$ gopeaks -b GSE190793_Kasumi_cutrun.bam -c GSE190793_Kasumi_IgG.bam --mdist 1000 --prefix Kasumi_cnr Reading chromsizes from bam header... nTests: 431593 nzSignals: 4.5114732e+07 nzBins: 8298786 n: 7.566515e+06 p: 7.184688090883924e-07 mu: 5.962418894299423 var: 5.962414610487421 [user@cn3144 ~]$ cat Kasumi_cnr_gopeaks.json { "gopeaks_version": "1.0.0", "date": "2023-05-12 12:40:59 PM", "elapsed": "7m19.045741504s", "prefix": "Kasumi_cnr", "command": "gopeaks -b GSE190793_Kasumi_cutrun.bam -c GSE190793_Kasumi_IgG.bam --mdist 1000 --verbose --prefix=Kasumi_cnr", "peak_counts": 29393 } [user@cn3144 ~]$ # create a summary graph for peaks centered on the middle of the peak interval +- 1000 [user@cn3144 ~]$ # i.e. not gene annotation [user@cn3144 ~]$ module load deeptools [user@cn3144 ~]$ bamCoverage -p6 -b GSE190793_Kasumi_cutrun.bam -o GSE190793_Kasumi_cutrun.bw [user@cn3144 ~]$ bamCoverage -p6 -b GSE190793_Kasumi_IgG.bam -o GSE190793_Kasumi_IgG.bw [user@cn3144 ~]$ computeMatrix reference-point -R Kasumi_cnr_peaks.bed -a 1000 -b 1000 --referencePoint center \ -S GSE190793_Kasumi_cutrun.bw GSE190793_Kasumi_IgG.bw \ --sortRegions descend --samplesLabel 'Cut&Run' 'IgG' -p6 -o cutrun_matrix [user@cn3144 ~]$ plotHeatmap -m cutrun_matrix -o cutrun.png --averageTypeSummaryPlot mean --colorMap GnBu [user@cn3144 ~]$ cp Kasumi_cnr_* *.bw cutrun.png /data/$USER/my_working_directory [user@cn3144 ~]$ exit salloc.exe: Relinquishing job allocation 46116226 [user@biowulf ~]$
Create a batch input file (e.g. gopeaks.sh). For example:
#!/bin/bash set -e module load gopeaks/1.0.0 cp $GOPEAKS_TEST_DATA/* . gopeaks -b GSE190793_Kasumi_cutrun.bam -c GSE190793_Kasumi_IgG.bam --mdist 1000 --prefix Kasumi_cnr
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=6 --mem=30g gopeaks.sh
Create a swarmfile (e.g. gopeaks.swarm). For example:
gopeaks -b replicate1.bam -c control.bam --mdist 1000 --prefix replicate1_gopeaks gopeaks -b replicate2.bam -c control.bam --mdist 1000 --prefix replicate2_gopeaks
Submit this job using the swarm command.
swarm -f gopeaks.swarm -g 30 -t 6 --module gopeakswhere
-g # | Number of Gigabytes of memory required for each process (1 line in the swarm command file) |
-t # | Number of threads/CPUs required for each process (1 line in the swarm command file). |
--module gopeaks | Loads the gopeaks module for each subjob in the swarm |