High-Performance Computing at the NIH
pplacer on Helix/Biowulf
Pplacer places query sequences on a fixed reference phylogenetic tree to maximize phylogenetic likelihood or posterior probability according to a reference alignment. Pplacer is designed to be fast, to give useful information about uncertainty, and to offer advanced visualization and downstream analysis.

pplacer was developed by the Matsen group at the Fred Hutchinson Cancer Research Center. See the pplacer website for references.

On Helix

pplacer can be run on Helix, but it is likely to be much slower there than in an interactive or batch session on Biowulf. Sample session:

[user@helix src]$ cd /data/user/pplacer/

[user@helix pplacer]$ tar xvzf /usr/local/apps/pplacer/tutorial.tar.gz

[user@helix pplacer]$ cd fhcrc-microbiome-demo-730d268/

[user@helix fhcrc-microbiome-demo-730d268]$ module load pplacer
[+] Loading SQLite 3.15.0 for gcc 4.4.7 ...
[+] Loading pplacer 1.1

[user@helix fhcrc-microbiome-demo-730d268]$ sh pplacer_demo.sh
[...]

Batch job on Biowulf

Create a batch input file (e.g. run.sh). The following example uses the tutorial files provided with pplacer:

#!/bin/bash
module load pplacer

mkdir -p /data/$USER/pplacer
cd /data/$USER/pplacer
tar xvzf /usr/local/apps/pplacer/tutorial.tar.gz
cd fhcrc-microbiome-demo-730d268

pplacer -c vaginal_16s.refpkg src/p4z1r36.fasta
# [....other pplacer commands....]

Submit this job using the Slurm sbatch command.

sbatch --cpus-per-task=2 --mem=4g run.sh
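
The sbatch example above requests two CPUs per task. A minimal sketch of keeping pplacer's process count in step with that request follows. It assumes that pplacer's -j option sets the number of child processes (check `pplacer --help` on your installed version) and that Slurm exports SLURM_CPUS_PER_TASK inside the job. The function names pplacer_jobs and run_placement and the DRY_RUN variable are hypothetical and only serve the illustration.

```shell
# Sketch only: derive the pplacer -j value from the Slurm allocation.
# SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is given;
# fall back to 2 (the value in the sbatch example) when it is unset.
pplacer_jobs() {
    echo "${SLURM_CPUS_PER_TASK:-2}"
}

# Hypothetical wrapper: with DRY_RUN=1 it prints the command instead of
# running it, which is handy for checking a script before submitting it.
run_placement() {
    local refpkg="$1"
    local query="$2"
    local jobs
    jobs=$(pplacer_jobs)
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "pplacer -j $jobs -c $refpkg $query"
    else
        pplacer -j "$jobs" -c "$refpkg" "$query"
    fi
}
```

Inside run.sh, the direct pplacer call could then be written as `run_placement vaginal_16s.refpkg src/p4z1r36.fasta`, so that changing --cpus-per-task on the sbatch command line is the only adjustment needed.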

Swarm of Jobs on Biowulf

Create a swarmfile (e.g. guppy.swarm). For example:

guppy kr_heat -c vaginal_16s.refpkg/ file1a.jplace file2a.jplace
guppy kr_heat -c vaginal_16s.refpkg/ file1b.jplace file2b.jplace
guppy kr_heat -c vaginal_16s.refpkg/ file1c.jplace file2c.jplace

Submit this job using the swarm command. The --module option loads the pplacer module for each command in the swarm.

swarm -f guppy.swarm --module pplacer
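
With many placefile pairs, the swarmfile can be generated rather than typed by hand. The sketch below assumes the pairs follow the file1<suffix>.jplace / file2<suffix>.jplace naming used in the example above. The function name make_guppy_swarm is hypothetical and not part of pplacer or swarm.

```shell
# Sketch only: write one "guppy kr_heat" line per placefile pair.
# Arguments: reference package, output swarmfile, directory holding the
# .jplace files (defaults to the current directory).
make_guppy_swarm() {
    local refpkg="$1"
    local swarmfile="$2"
    local dir="${3:-.}"
    local first base suffix second

    : > "$swarmfile"                      # start with an empty swarmfile
    for first in "$dir"/file1*.jplace; do
        [ -e "$first" ] || continue       # glob matched nothing
        base=${first##*/}                 # e.g. file1a.jplace
        suffix=${base#file1}              # e.g. a.jplace
        second="$dir/file2$suffix"
        if [ -e "$second" ]; then
            printf 'guppy kr_heat -c %s %s %s\n' \
                "$refpkg" "$first" "$second" >> "$swarmfile"
        else
            echo "skipping $first: no matching $second" >&2
        fi
    done
}
```

For example, `make_guppy_swarm vaginal_16s.refpkg/ guppy.swarm .` writes guppy.swarm in the current directory. Unpaired files are reported on standard error and left out of the swarmfile.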

Interactive job on Biowulf

Allocate an interactive session and run the pplacer demo. Sample session:

[user@biowulf ~]$ sinteractive
salloc.exe: Pending job allocation 36744333
salloc.exe: job 36744333 queued and waiting for resources
salloc.exe: job 36744333 has been allocated resources
salloc.exe: Granted job allocation 36744333
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn2493 are ready for job

[user@cn2493 ~]$ mkdir /data/$USER/pplacer; cd /data/$USER/pplacer

[user@cn2493 pplacer]$ tar xvzf /usr/local/apps/pplacer/tutorial.tar.gz
[...]

[user@cn2493 pplacer]$ cd fhcrc-microbiome-demo-730d268

[user@cn2493 fhcrc-microbiome-demo-730d268]$ module avail pplacer

----------------------------------------- /usr/local/lmod/modulefiles --------------------------
   pplacer/1.1

[user@cn2493 fhcrc-microbiome-demo-730d268]$ module load pplacer
[+] Loading pplacer 1.1

[user@cn2493 fhcrc-microbiome-demo-730d268]$ sh pplacer_demo.sh
# Phylogenetic placement
# ----------------------

# This makes p4z1r2.jplace, which is a "place" file in JSON format.  Place files
# contain information about collections of phylogenetic placements on a tree.
# You may notice that one of the arguments to this command is
# `vaginal_16s.refpkg`, which is a "reference package". Reference packages are
# simply an organized collection of files including a reference tree, reference
# alignment, and taxonomic information. We have the beginnings of a
# [database](http://microbiome.fhcrc.org/apps/refpkg/) of reference packages
# and some [software](http://github.com/fhcrc/taxtastic) for putting them
# together.
pplacer -c vaginal_16s.refpkg src/p4z1r36.fasta
Running pplacer v1.1.alpha19-0-g807f6f3 analysis on src/p4z1r36.fasta...
Found reference sequences in given alignment file. Using those for reference alignment.
Pre-masking sequences... sequence length cut from 2196 to 275.
Determining figs... figs disabled.
Allocating memory for internal nodes... done.
Caching likelihood information on reference tree... done.
Pulling exponents... done.
Preparing the edges for baseball... done.
warning: rank below_species not represented in the lineage of any sequence in reference package vaginal_16s.
[...]
pause
Please press return to continue...



# That's it for the demo. For further information, please consult the
# [pplacer documentation](http://matsen.github.com/pplacer/).
echo "Thanks!"
Thanks!

[user@cn2493 fhcrc-microbiome-demo-730d268]$ exit
exit
salloc.exe: Relinquishing job allocation 36744333

Documentation