High-Performance Computing at the NIH
SPAdes on Biowulf & Helix

SPAdes – St. Petersburg genome assembler – is intended for both standard isolates and single-cell MDA bacteria assemblies.

If you use SPAdes in your research, please include Nurk, Bankevich et al., 2013 in your reference list; alternatively, you can cite Bankevich, Nurk et al., 2012.

Running a single batch job on Biowulf

Set up a batch script along the following lines.

#!/bin/bash 

cd /data/$USER/mydir
module load spades
spades.py -t $SLURM_CPUS_PER_TASK [options] -o <output_dir>

Submit the script to the batch system with:

$ sbatch --mem=12g --cpus-per-task=4 --time=20:00:00 myscript

This command will submit your script to the batch system and request 12 GB of memory, 4 CPUs, and 20 hours of walltime.

You would, of course, modify these values to the needs of your job.
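For reference, a complete script for a hypothetical paired-end run might look like the following. The read file names and output directory are placeholders, and `-1`/`-2` are the standard SPAdes flags for paired-end reads:

```shell
#!/bin/bash
# Sketch of a full batch script; substitute your own paths and read files.
cd /data/$USER/mydir
module load spades
spades.py -t $SLURM_CPUS_PER_TASK \
          -1 sample_R1.fastq.gz \
          -2 sample_R2.fastq.gz \
          -o sample_assembly
```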

NOTE: make sure '-t $SLURM_CPUS_PER_TASK' is set in the script and --cpus-per-task is set when submitting the job. The SPAdes default is 16 CPUs, which will overload the allocated node if these are not set.
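When testing a script outside a batch allocation, $SLURM_CPUS_PER_TASK is unset and SPAdes would fall back to its 16-CPU default. One defensive pattern is a shell default expansion; the fallback value of 4 here is just an assumption:

```shell
#!/bin/bash
# Use the Slurm-allocated CPU count when present; otherwise fall back to 4.
# The fallback value is an arbitrary choice for illustration.
THREADS=${SLURM_CPUS_PER_TASK:-4}
echo "running SPAdes with $THREADS threads"
# spades.py -t $THREADS [options] -o <output_dir>
```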

 

Running a swarm of jobs on Biowulf

Users can submit multiple jobs to the batch system using 'swarm'.

Set up a swarm command file (e.g. /data/$USER/cmdfile). Here is a sample file:

cd /data/$USER/mydir1; spades.py -t $SLURM_CPUS_PER_TASK [options] -o <output_dir>
cd /data/$USER/mydir2; spades.py -t $SLURM_CPUS_PER_TASK [options] -o <output_dir>
cd /data/$USER/mydir3; spades.py -t $SLURM_CPUS_PER_TASK [options] -o <output_dir>
[...]   
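For many samples, writing the command file by hand is tedious; a loop along these lines generates one swarm line per directory. The mydirN layout and the _out output suffix are hypothetical placeholders:

```shell
#!/bin/bash
# Write one SPAdes command per sample directory into the swarm file.
# Directory names and the output suffix are placeholders for illustration.
for d in mydir1 mydir2 mydir3; do
    echo "cd /data/\$USER/$d; spades.py -t \$SLURM_CPUS_PER_TASK -o ${d}_out"
done > cmdfile
```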

Submit this job with

$ swarm -f cmdfile -t 4 -g 12 --time 20:00:00 --module spades

-f : swarm command file name
-t : number of CPUs required per command
-g : GB of memory required per command (i.e. per line in the swarm file)
--time : walltime per command; the default is 10 hours
--module : module(s) to load to set up the environment for each command

Running an interactive job on Biowulf

Users may sometimes need to run jobs interactively. Such jobs should not be run on the Biowulf login node. Instead, allocate an interactive node as described below, and run the job there.

[user@biowulf]$ sinteractive --mem=12g --cpus-per-task=4 --time=4:00:00 
      salloc.exe: Granted job allocation 1528

[user@pXXXX]$ cd /data/$USER/myruns

[user@pXXXX]$ module load spades

[user@pXXXX]$ spades.py -t $SLURM_CPUS_PER_TASK [options] -o <output_dir>

[user@pXXXX]$ exit

[user@biowulf]$ 

Documentation

http://spades.bioinf.spbau.ru/release3.6.1/manual.html#sec4