Biowulf High Performance Computing at the NIH
mothur on Biowulf

mothur is a tool for analyzing 16S rRNA gene sequences generated on multiple platforms as part of microbial ecology projects.

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program. Sample session:

[user@biowulf ~]$ sinteractive --cpus-per-task=8 --mem=8g
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load mothur
[user@cn3144 ~]$ ln -s $MOTHUR_EXAMPLES/MiSeq_SOP/* .
[user@cn3144 ~]$ mothur

mothur v.1.39.5
Last updated: 3/20/2017

by
Patrick D. Schloss

Department of Microbiology & Immunology
University of Michigan
http://www.mothur.org

When using, please cite:
Schloss, P.D., et al., Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol, 2009. 75(23):7537-41.

Distributed under the GNU General Public License

Type 'help()' for information on the commands that are available

For questions and analysis support, please visit our forum at https://www.mothur.org/forum

Type 'quit()' to exit program

mothur > make.contigs(file=stability.files, processors=8)
...
mothur > summary.seqs(fasta=stability.trim.contigs.fasta)

Using 8 processors.

                Start   End     NBases  Ambigs  Polymer NumSeqs
Minimum:        1       248     248     0       3       1
2.5%-tile:      1       252     252     0       3       3810
25%-tile:       1       252     252     0       4       38091
Median:         1       252     252     0       4       76181
75%-tile:       1       253     253     0       5       114271
97.5%-tile:     1       253     253     6       6       148552
Maximum:        1       503     502     249     243     152360
Mean:   1       252.811 252.811 0.697867        4.44854
# of Seqs:      152360

Output File Names: 
stability.trim.contigs.summary

mothur > quit()

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
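The same two commands can also be run without the mothur prompt. A minimal sketch, assuming mothur's command-line mode (a quoted string that starts with `#`, with commands separated by `;`) and a session started with `--cpus-per-task`, so that Slurm sets `SLURM_CPUS_PER_TASK`. The fallback to 1 processor and the `command -v` guard are illustrative additions, not part of the session above:

```shell
#!/bin/bash
# Use the CPU count allocated by Slurm; fall back to 1 outside a job (assumption).
procs="${SLURM_CPUS_PER_TASK:-1}"

# Build the command string for mothur's command-line mode.
cmds="#make.contigs(file=stability.files, processors=${procs}); summary.seqs(fasta=stability.trim.contigs.fasta)"
echo "mothur \"${cmds}\""

# Run only when mothur is on the PATH (i.e. after 'module load mothur').
if command -v mothur >/dev/null 2>&1; then
    mothur "${cmds}"
fi
```

Taking the processor count from the allocation keeps the mothur command in step with whatever was requested from sinteractive.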

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. MiSeq_SOP.sh). For example:

#!/bin/bash
#----- This file is MiSeq_SOP.sh -----#

# Set the environment
module load mothur

# Untar the example data files
tar xzf $MOTHUR_EXAMPLES/MiSeq_SOP.tgz
cd MiSeq_SOP

# Run the batch script
mothur stability.batch

Submit this job using the Slurm sbatch command. Because stability.batch requires 8 processors, the job must request at least 8 CPUs:

sbatch --cpus-per-task=8 --mem=4g --job-name=MiSeq_SOP --output=MiSeq_SOP.out MiSeq_SOP.sh
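
The resource requests can alternatively be stored in the script itself as `#SBATCH` directives, so that a plain `sbatch MiSeq_SOP.sh` is enough. A sketch that reuses the values from the command above. The `enough_cpus` helper and the check against `SLURM_CPUS_PER_TASK` are hypothetical additions for illustration, not part of the original script:

```shell
#!/bin/bash
#SBATCH --cpus-per-task=8
#SBATCH --mem=4g
#SBATCH --job-name=MiSeq_SOP
#SBATCH --output=MiSeq_SOP.out

# Return success when the allocated CPU count ($2) covers the required count ($1).
enough_cpus() {
    [ "$2" -ge "$1" ]
}

# SLURM_CPUS_PER_TASK is set by Slurm inside the job; treat it as 0 elsewhere.
if enough_cpus 8 "${SLURM_CPUS_PER_TASK:-0}"; then
    module load mothur
    tar xzf "$MOTHUR_EXAMPLES/MiSeq_SOP.tgz"
    cd MiSeq_SOP && mothur stability.batch
else
    echo "stability.batch needs 8 CPUs; allocation has ${SLURM_CPUS_PER_TASK:-0}" >&2
fi
```

Options given on the sbatch command line take precedence over `#SBATCH` lines, so the check guards against a submission that overrides the CPU count with a smaller value.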

Swarm of Jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.

Create a swarmfile (e.g. mothur.swarm). For example:

cd run1; mothur run1.batch
cd run2; mothur run2.batch
cd run3; mothur run3.batch
cd run4; mothur run4.batch

Submit this job using the swarm command.

swarm -f mothur.swarm [-g #] [-t #] --module mothur

where
  -g #             Number of gigabytes of memory required for each process (1 line in the swarm command file)
  -t #             Number of threads/CPUs required for each process (1 line in the swarm command file)
  --module mothur  Loads the mothur module for each subjob in the swarm
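
For many runs, the swarmfile can be generated with a loop rather than written by hand. A minimal sketch, assuming each directory `runN` holds a batch file named `runN.batch`, as in the example above. `make_swarmfile` is a hypothetical helper name, and the `-g 4 -t 8` values in the final comment are placeholders to be matched to what the batch files actually need:

```shell
#!/bin/bash
# Print one swarm line per run* directory found in the current directory.
make_swarmfile() {
    local dir name
    for dir in run*/; do
        [ -d "$dir" ] || continue   # skip the unexpanded glob when nothing matches
        name="${dir%/}"             # strip the trailing slash added by the glob
        echo "cd $name; mothur $name.batch"
    done
}

make_swarmfile > mothur.swarm
cat mothur.swarm

# Then submit, for example:
#   swarm -f mothur.swarm -g 4 -t 8 --module mothur
```

The glob `run*/` matches directories only, so stray files whose names begin with `run` do not produce swarm lines.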