GEM: High resolution peak calling and motif discovery for ChIP-seq and ChIP-exo data.
GEM is scientific software for studying protein-DNA interactions at high
resolution using ChIP-seq/ChIP-exo data. It can also be applied to CLIP-seq
and Branch-seq data.
GEM links binding event discovery and motif discovery with positional priors
in the context of a generative probabilistic model of ChIP data and genome
sequence, and resolves ChIP data into explanatory motifs and binding events
at unsurpassed spatial resolution. The two steps reinforce each other: binding
event locations improve motif discovery, and the discovered motifs improve
binding event predictions.
After the gem module is loaded, the following environment variables are available:
$GEMJARPATH
$GEMJAR
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive -c 10 --mem 10g
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144]$ module load gem
[user@cn3144]$ cp -r ${GEM_TEST_DATA:-none}/* .
[user@cn3144]$ java -Xmx10g -jar $GEMJAR --t 8 --d Read_Distribution_default.txt \
    --g mm10.chrom.sizes \
    --genome /fdb/igenomes/Mus_musculus/UCSC/mm10/Sequence/Chromosomes/ \
    --s 2000000000 --expt SRX000540_mES_CTCF.bed --ctrl SRX000543_mES_GFP.bed \
    --f BED --out mouseCTCF --k_min 6 --k_max 13

[user@cn3144]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
Create a batch input file (e.g. batch.sh). For example:
#!/bin/bash
set -e
module load gem
java -Xmx10g -jar $GEMJAR --t 8 --d $GEM_TEST_DATA/Read_Distribution_default.txt \
    --g $GEM_TEST_DATA/mm10.chrom.sizes \
    --genome /fdb/igenomes/Mus_musculus/UCSC/mm10/Sequence/Chromosomes/ \
    --s 2000000000 --expt SRX000540_mES_CTCF.bed --ctrl SRX000543_mES_GFP.bed \
    --f BED --out mouseCTCF --k_min 6 --k_max 13
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=10 --mem=10g batch.sh
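The batch script above hard-codes 8 threads and a 10 GB heap, which can drift out of step with the resources requested from sbatch. A minimal sketch of one way to derive both from the allocation follows. It relies on the Slurm variables SLURM_CPUS_PER_TASK and SLURM_MEM_PER_NODE (set when --cpus-per-task and --mem are given). The helper gem_heap_mb, the 10% headroom, the fallback values, and the DRY_RUN switch are illustrative assumptions, not part of GEM or of the module.

```shell
#!/bin/bash
set -e

# Hypothetical helper: heap size in MB for an allocation given in MB,
# leaving about 10% for JVM memory outside the heap (assumed margin).
gem_heap_mb() {
    echo $(( $1 * 9 / 10 ))
}

# Slurm sets SLURM_CPUS_PER_TASK for --cpus-per-task and SLURM_MEM_PER_NODE
# (in MB) for --mem. The fallbacks are illustrative values for use outside a job.
threads="${SLURM_CPUS_PER_TASK:-2}"
heap_mb="$(gem_heap_mb "${SLURM_MEM_PER_NODE:-10240}")"
data="${GEM_TEST_DATA:-.}"

# DRY_RUN=1 (the default in this sketch) only prints the command.
if [ "${DRY_RUN:-1}" != "1" ]; then
    module load gem
fi

# Build the command as positional parameters so each argument stays one word.
# gem.jar is a placeholder used only when $GEMJAR is not set.
set -- java "-Xmx${heap_mb}m" -jar "${GEMJAR:-gem.jar}" \
    --t "$threads" \
    --d "$data/Read_Distribution_default.txt" \
    --g "$data/mm10.chrom.sizes" \
    --genome /fdb/igenomes/Mus_musculus/UCSC/mm10/Sequence/Chromosomes/ \
    --s 2000000000 \
    --expt SRX000540_mES_CTCF.bed --ctrl SRX000543_mES_GFP.bed \
    --f BED --out mouseCTCF --k_min 6 --k_max 13

if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
else
    "$@"
fi
```

Submitting with DRY_RUN=0 in the environment (for example via sbatch --export=ALL,DRY_RUN=0) would run the command instead of printing it.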
Create a swarmfile (e.g. job.swarm). For example:
cd dir1; gem command
cd dir2; gem command
cd dir3; gem command
Submit this job using the swarm command.
swarm -f job.swarm -g 10 -t 4 --module gem
where
-g #      | Number of gigabytes of memory required for each process (1 line in the swarm command file)
-t #      | Number of threads/CPUs required for each process (1 line in the swarm command file)
--module  | Loads the module for each subjob in the swarm
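When there are many directories, the swarmfile can be generated rather than written by hand. The sketch below writes one "cd <dir>; <command>" line per directory, matching the swarmfile layout shown above. The function name make_swarmfile is hypothetical, and 'gem command' is the same placeholder used in the example swarmfile; replace it with the actual java invocation.

```shell
#!/bin/bash
set -e

# Hypothetical helper: write one swarm line per directory.
# $1 = output swarmfile, $2 = command text, remaining arguments = directories.
make_swarmfile() {
    out="$1"
    cmd="$2"
    shift 2
    : > "$out"    # create or truncate the swarmfile
    for dir in "$@"; do
        printf 'cd %s; %s\n' "$dir" "$cmd" >> "$out"
    done
}

# 'gem command' is a placeholder, as in the example swarmfile.
make_swarmfile job.swarm 'gem command' dir1 dir2 dir3
cat job.swarm
```

The resulting job.swarm can then be submitted with the swarm command shown above.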