Biowulf High Performance Computing at the NIH
bali-phy on Biowulf

BAli-Phy can co-estimate alignments (nucleotide, codon, or amino acid) and phylogenetic trees with complex substitution models. It uses Markov chain Monte Carlo (MCMC) based methods.

The package contains the main executable (bali-phy) as well as a number of small command line utilities.


Important Notes

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program. Sample session:

[user@biowulf]$ sinteractive --mem=10g
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144]$ module load bali-phy
[user@cn3144]$ cp $BALIPHY_TEST_DATA/sequences/5S-rRNA/5d.fasta .
[user@cn3144]$ alignment-info 5d.fasta
Alignment: 126 columns of 5 sequences         Alphabet: DNA
  sequence lengths: 120-126      mean = 122      median = 121

 ====== w/o indels ======
  const.: 5 (3.97%)      non-const.: 121 (96%)      inform.: 51 (40.5%)
  21.5% minimum sequence identity.

 ====== w/  indels ======
  const.: 3 (2.38%)      non-const.: 123 (97.6%)      inform.: 54 (42.9%)
  21% minimum sequence identity.

 ========   gaps ========
  6 (4.76%) sites contain a gap.
  2.86% of the matrix is gaps.
  3 indel groups seem to exist. (3 separate)
       unique/inform. = 3/0       ins./del. = 1/2
  gap lengths: 2-6      mean = 4.33      median = 5

Stop Codons:  8/11/2

Frequencies:   A=18.9%  C=30.6%  G=32.6%  T=17.9%
  Classes:  0 [0%]        Wildcards: 0 [0%]

Get some basic information about bali-phy and do a short MCMC run with only a few iterations:

[user@cn3144]$ bali-phy -v

VERSION: 3.5  [HEAD -> master, origin/master, origin/HEAD commit e553c93d0]  (Mar 24 2020 10:10:08)
BUILD: Mar 24 2020 19:12:18
ARCH: linux x86_64
COMPILER: gcc 9.2.1 x86_64

# test without running any MCMC iterations
[user@cn3144]$ bali-phy 5d.fasta --test 
T:topology ~ uniform on tree topologies
T:lengths ~ iid[num_branches[T],gamma[0.5,div[2,num_branches[T]],0.0]]

#1: alphabet = DNA
#1: subst = tn93
                tn93:kappaPur ~ log_normal[log[2],0.25]
                tn93:kappaPyr ~ log_normal[log[2],0.25]
                tn93:pi ~ dirichlet_on[letters[a],1] (S1)
#1: indel = rs07
                rs07:log_rate ~ laplace[-4,0.707]
                rs07:mean_length ~ exponential[10,1] (I1)
#1: scale ~ gamma[0.5,2,0.0] (Scale1)

# run a small number of iterations
[user@cn3144]$ bali-phy 5d.fasta --iterations=50

[user@cn3144]$ bali-phy help smodel
The --smodel command:

-S       [partitions:]               Substitution model.
--smodel [partitions:]

If no model is specified for a partition, then a default model is chosen
based on the alphabet for that partition.  The defaults are:

# explicitly select alphabet and substitution/indel models
[user@cn3144]$ bali-phy 5d.fasta --iterations=50 --alphabet=DNA --smodel=gtr+Rates.gamma+inv --imodel=RS07

[user@cn3144]$ exit
salloc.exe: Relinquishing job allocation 46116226

Batch job
Most jobs should be run as batch jobs.

From the BAli-Phy usage guide:

Running bali-phy on a computing cluster is not necessary, but can speed up the analysis dramatically. This is because a cluster allows you to run several independent MCMC chains simultaneously and pool the resulting samples. You can run multiple chains simultaneously simply by starting several different instances of bali-phy. Each instance of bali-phy runs only one chain and does not require using MPI or special command-line options. This approach to parallel computation is sometimes more efficient than MCMCMC-based parallelism involving heated chains. It is equivalent to running MCMCMC with no temperature difference between chains, with the exception that it allows results from all chains to be used, instead of just results from the single "cold" chain. Thus, if you run 10 independent chains in parallel, then you may gather samples 10 times faster than a single chain.

So, for example, take the following batch script (still using small numbers of iterations):

#! /bin/bash
# filename:
set -e

module load bali-phy/3.5 || exit 1

# FA must hold the path to the input FASTA file; it is not set in this script
bali-phy "$FA" --iterations=2000

Submit this script as an array of three jobs using the Slurm sbatch command.

sbatch --cpus-per-task=2 --mem=4g --array=1-3
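
The batch script reads its input file from $FA but does not set that variable. A minimal sketch of one way to supply it at submission time is sbatch's --export option. The script name bali-phy.sh and the input path 5d.fasta are assumptions for illustration, not names taken from this page. The command is assembled in an array and printed first so it can be checked before submitting:

```shell
#!/bin/bash
# Assumed names (hypothetical): the batch script file and the input alignment
script=bali-phy.sh
fa=5d.fasta
n_chains=3

# Build the submission command as an array so it can be inspected first.
# --export=ALL,FA=... passes the current environment plus FA to the job.
cmd=(sbatch --cpus-per-task=2 --mem=4g --array=1-"$n_chains"
     --export=ALL,FA="$fa" "$script")

# Print the command line for review
echo "${cmd[*]}"

# Uncomment the next line to actually submit the job array
# "${cmd[@]}"
```

Each array task then runs one independent chain on the same alignment.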

Note that bali-phy automatically sets up separate output directories. Once the MCMC runs are done, one can calculate a consensus tree:

trees-consensus 5d-1/C1.trees 5d-2/C1.trees 5d-3/C1.trees > consensus.tree
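
With more chains, listing every run directory by hand becomes tedious. The following is a sketch that collects the C1.trees file from each directory named PREFIX-N, assuming the naming shown above (5d-1, 5d-2, 5d-3). The helper name collect_tree_files is hypothetical:

```shell
#!/bin/bash
# collect_tree_files PREFIX: print one C1.trees path per run directory
# (hypothetical helper; assumes run directories are named PREFIX-N)
collect_tree_files() {
    local prefix=$1 f
    for f in "$prefix"-*/C1.trees; do
        # An unmatched glob stays literal, so keep only paths that exist
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Read the paths into an array, one per line
trees=()
while IFS= read -r f; do
    trees+=("$f")
done < <(collect_tree_files 5d)

# Build the consensus only if samples were found and the tool is on PATH
if [ "${#trees[@]}" -gt 0 ] && command -v trees-consensus >/dev/null 2>&1; then
    trees-consensus "${trees[@]}" > consensus.tree
fi
```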

Swarm of Jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.

As discussed above, running many bali-phy chains in parallel can be beneficial. In addition to the sbatch approach above, this can be done with the swarm command, which has the added benefit of making it simple to run a heterogeneous set of chains. For example, a swarm command file (bali-phy.swarm) could contain:

bali-phy /usr/local/apps/bali-phy/TEST_DATA/sequences/EF-Tu/5d.fasta --iterations=2000
bali-phy /usr/local/apps/bali-phy/TEST_DATA/sequences/EF-Tu/5d.fasta --iterations=2000
bali-phy /usr/local/apps/bali-phy/TEST_DATA/sequences/EF-Tu/5d.fasta --iterations=2000
bali-phy --smodel=gtr /usr/local/apps/bali-phy/TEST_DATA/sequences/EF-Tu/5d.fasta --iterations=2000
bali-phy --smodel=gtr /usr/local/apps/bali-phy/TEST_DATA/sequences/EF-Tu/5d.fasta --iterations=2000
bali-phy --smodel=gtr /usr/local/apps/bali-phy/TEST_DATA/sequences/EF-Tu/5d.fasta --iterations=2000
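
For larger sets of chains, the command file can be generated instead of typed. A minimal sketch that writes the six lines above to bali-phy.swarm follows. The variable names and the number of chains per model are illustrative:

```shell
#!/bin/bash
# Input alignment and number of chains per model (illustrative values)
fasta=/usr/local/apps/bali-phy/TEST_DATA/sequences/EF-Tu/5d.fasta
chains=3

# Start with an empty command file
: > bali-phy.swarm

# One line per chain using the default substitution model
for i in $(seq "$chains"); do
    echo "bali-phy $fasta --iterations=2000" >> bali-phy.swarm
done

# One line per chain using the GTR substitution model
for i in $(seq "$chains"); do
    echo "bali-phy --smodel=gtr $fasta --iterations=2000" >> bali-phy.swarm
done
```

Each line of the file becomes one independent process in the swarm.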

Submit this job using the swarm command.

swarm -f bali-phy.swarm -g 4 -t 1 -p 2 --module bali-phy
-g # Number of gigabytes of memory required for each process (1 line in the swarm command file).
-t # Number of threads/CPUs required for each process (1 line in the swarm command file).
-p # Number of processes (command lines) to run in each subjob.
--module bali-phy Loads the bali-phy module for each subjob in the swarm.