Meme on Biowulf

The MEME Suite allows you to:

- discover motifs using MEME, DREME (DNA only) or GLAM2 on groups of related DNA or protein sequences
- search sequence databases for matches to motifs using MAST, FIMO or GLAM2SCAN
- compare a motif to all motifs in a database of motifs using Tomtom
- associate motifs with Gene Ontology terms via their putative target genes using GOMo

The MEME Suite was developed at the University of Queensland and the University of Washington. For more information, see the MEME Suite website: https://meme-suite.org/

MEME is CPU-intensive for large numbers of sequences or for long sequences, and it scales well to 128 cores.

MEME motif and GOMo databases are available in /fdb/meme/.
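
For example, motifs in a MEME-format file can be compared against one of these databases with Tomtom. The database path below is illustrative and 'my_motifs.meme' is a hypothetical query file; run 'ls /fdb/meme/' to see what is actually installed:

ls /fdb/meme/
tomtom my_motifs.meme /fdb/meme/motif_databases/JASPAR/JASPAR2018_CORE_non-redundant.meme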

Documentation
MEME Suite documentation is available on the MEME Suite website (see link above).

Important Notes
- Module Name: meme (see the modules page for more information)
- MEME is an MPI program; allocate MPI tasks with '--ntasks=#' as in the examples below.
- Example input files are available in /usr/local/apps/meme/examples.
- MEME limits the total size of the input dataset. For inputs larger than the default limit, set -maxsize to a value at least as large as the input file, e.g. -maxsize 600000 for the mini-drosoph.s example below.

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program. Sample session:

[user@biowulf]$ sinteractive --ntasks=4 --ntasks-per-core=1 --constraint=x2695
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$  cd /data/$USER

[user@cn3144 ~]$ module load meme

[user@cn3144 ~]$ cp /usr/local/apps/meme/examples/protease-seqs /data/$USER

[user@cn3144 ~]$ meme -p $SLURM_NTASKS -text protease-seqs > protease.meme.out
Initializing the motif probability tables for 2 to 7 sites...
nsites = 7
Done initializing.
SEEDS: highwater mark: seq 6 pos 300
BALANCE: samples 7 residues 1750 nodes 1 residues/node 1750

seqs=     7, min= 185, max=  300, total=     1750

motif=1
SEED WIDTHS: 8 11 15 21 29 41 50
em: w=  50, psites=   7, iter=   0

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$

Batch job
Most jobs should be run as batch jobs.

In the example below, we use the file 'mini-drosoph.s' as input; it can be copied from /usr/local/apps/meme/examples. The -maxsize parameter is set to 600000, as described in the Important Notes section above.
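
A convenient way to choose a safe -maxsize value is to use the input file's size in bytes as an upper bound on the dataset size (a sketch; the exact byte count of your copy may differ):

cd /data/$USER
cp /usr/local/apps/meme/examples/mini-drosoph.s .
wc -c < mini-drosoph.s     # file size in bytes; choose -maxsize >= this value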

Create a batch input file (e.g. meme.sh). For example:

#!/bin/bash
set -e

cd /data/$USER/
module load meme
meme -p $SLURM_NTASKS mini-drosoph.s  -oc meme_out -maxsize 600000

Submit this job using the Slurm sbatch command.

sbatch --ntasks=28 --ntasks-per-core=1 --constraint=x2695 --exclusive meme.sh

where

--ntasks=28          allocates 28 MPI tasks
--ntasks-per-core=1  runs a single MPI task on each physical core
--constraint=x2695   requests a 28-core x2695 node
--exclusive          allocates the node exclusively to this job

This job will run on the norm (default) partition, which is limited to single-node jobs, and will use all 28 cores of an x2695 node. MEME scales well, and large MEME jobs (maxsize ~500,000) can be submitted on up to 512 cores. A multinode job can be submitted with:

sbatch --partition=multinode --ntasks=512 --ntasks-per-core=1 --constraint=x2695  meme.sh
Swarm of Jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.

You would submit a swarm of Meme jobs if you have several input files.

Meme is an MPI program which uses OpenMPI libraries. OpenMPI on Biowulf is built with Slurm support. An MPI program runs a specified number of MPI processes or 'tasks'. The user specifies the number of tasks with '--ntasks=#' on the sbatch command line, and the OpenMPI program automatically gets this number from Slurm and starts up the appropriate number of tasks.
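
For instance (illustrative task count), submitting the batch script above with

sbatch --ntasks=16 --ntasks-per-core=1 meme.sh

makes $SLURM_NTASKS expand to 16 inside the script, so 'meme -p $SLURM_NTASKS' starts 16 MPI tasks without any edit to the script itself.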

Swarm is intended for single-threaded and multi-threaded applications. When you use the '-t #' (threads per process) flag to swarm, it sets up subjobs with $SLURM_CPUS_PER_TASK=# and allocates # cpus on a single node for each subjob. The Meme MPI program sees this as a single 'task' with # threads, not as # tasks, and will complain that there are not enough slots available for the MPI processes.

Thus, it is important to add --sbatch '--ntasks=#' when submitting a swarm of Meme jobs. You should also include '--ntasks-per-core=1', as most MPI applications run more efficiently with only one MPI task on each physical core. See the sketch below for the difference.
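
To make the difference concrete, here is a sketch of an incorrect and a correct allocation for the swarmfile below:

# incorrect for MPI: each subjob gets 4 cpus as threads ($SLURM_CPUS_PER_TASK=4),
# and meme will complain that there are not enough slots for 4 MPI tasks
swarm -f Meme.swarm -t 4 --module=meme

# correct for MPI: each subjob gets 4 MPI tasks ($SLURM_NTASKS=4)
swarm -f Meme.swarm --sbatch '--ntasks=4 --ntasks-per-core=1' --module=meme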

Create a swarmfile (e.g. Meme.swarm). For example:

meme -p $SLURM_NTASKS query1.fa -oc query1.out -maxsize 10000000
meme-chip -meme-p $SLURM_NTASKS query2.fna -oc query2.out

Submit this job using the swarm command.

swarm -f Meme.swarm -g 20 --sbatch '--ntasks=4 --ntasks-per-core=1 --constraint=x2695' --module=meme