High-Performance Computing at the NIH
treemix on Biowulf & Helix

Description

From the TreeMix site:

TreeMix is a method for inferring the patterns of population splits and mixtures in the history of a set of populations. In the underlying model, the modern-day populations in a species are related to a common ancestor via a graph of ancestral populations. We use the allele frequencies in the modern populations to infer the structure of this graph.

Multiple versions of treemix may be installed. The easiest way to select one is with environment modules. To list the available modules, type:

module avail treemix 

To select a module use

module load treemix/[version]

where [version] is the version of choice.

On Helix
helix$ module load treemix
helix$ # copy example data to the local directory
helix$ cp /usr/local/apps/treemix/TEST_DATA/treemix_test_files.tar.gz .
helix$ tar -xzf treemix_test_files.tar.gz
helix$ ls -lh
total 820K
-rw-r----- 1 user group  196 Feb 17  2012 pop_order_test
-rw-r----- 1 user group 402K Feb 17  2012 testin.gz
helix$ treemix -i testin.gz -o testout

TreeMix v. 1.12
$Revision: 231 $

npop:13 nsnp:29999
Estimating covariance matrix in 29999 blocks of size 1
SEED: 1455041772
Starting from:
(Lahu:0.00323769,(San:0.0494469,She:0.0102143):0.00323769);
Adding French [4/13]
[...snip...]

helix$ ls -lh
total 848K
-rw-r----- 1 user group  196 Feb 17  2012 pop_order_test
-rw-r----- 1 user group 402K Feb 17  2012 testin.gz
-rw-rw---- 1 user group  735 Feb  9 13:16 testout.cov.gz
-rw-rw---- 1 user group  688 Feb  9 13:16 testout.covse.gz
-rw-rw---- 1 user group  250 Feb  9 13:16 testout.edges.gz
-rw-rw---- 1 user group  117 Feb  9 13:16 testout.llik
-rw-rw---- 1 user group  722 Feb  9 13:16 testout.modelcov.gz
-rw-rw---- 1 user group  247 Feb  9 13:16 testout.treeout.gz
-rw-rw---- 1 user group  630 Feb  9 13:16 testout.vertices.gz

helix$ # copy the file with R functions for plotting trees
helix$ cp /usr/local/apps/treemix/1.12/bin/plotting_funcs.R .
helix$ module load R
helix$ R
R version 3.2.3 (2015-12-10) -- "Wooden Christmas-Tree"      
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)                       
[...snip...]
> source("plotting_funcs.R")
> png(file="testout.png", width=1024, height=1024, res = 1024/7)
> plot_tree("testout", o="pop_order_test")
[...snip...]
> dev.off()
> q()
helix$ 

This creates the following tree:

[Figure: treemix example output]
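
The interactive R steps above can also be run non-interactively, which may be convenient inside a batch job. The following is a minimal sketch, assuming plotting_funcs.R, pop_order_test, and the testout.* files are in the current directory. The script name plot_testout.R is a hypothetical choice, and Rscript is only invoked if it is found on the PATH (for example after `module load R`):

```shell
#!/bin/bash
# Write the R plotting commands from the session above to a script file.
# plot_testout.R is an illustrative file name, not part of treemix.
cat > plot_testout.R <<'EOF'
source("plotting_funcs.R")
png(file="testout.png", width=1024, height=1024, res=1024/7)
plot_tree("testout", o="pop_order_test")
dev.off()
EOF

# Run the script only when Rscript and the required inputs are available.
if command -v Rscript >/dev/null 2>&1 && [ -f plotting_funcs.R ] && [ -f testout.treeout.gz ]; then
    Rscript plot_testout.R
else
    echo "Skipping plot: Rscript, plotting_funcs.R, or testout.treeout.gz not found" >&2
fi
```

If the inputs are present, this should write testout.png just as the interactive session does.
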
Batch job on Biowulf

Create a batch script similar to the following example:

#!/bin/bash
# this file is treemix_test.sh

module load treemix/1.12 || exit 1
treemix -i testin.gz -o testout \
    -global -bootstrap
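
Resource requests can also be written into the batch script itself with standard Slurm #SBATCH directives. The following is only a sketch: the partition, CPU, memory, and walltime values are illustrative assumptions rather than recommendations for treemix, and treemix_explicit.sh is a hypothetical file name.

```shell
#!/bin/bash
# Generate a variant of the batch script that spells out its resource requests.
# All #SBATCH values below are illustrative assumptions; adjust them to your data.
cat > treemix_explicit.sh <<'EOF'
#!/bin/bash
#SBATCH --partition=norm
#SBATCH --cpus-per-task=2
#SBATCH --mem-per-cpu=2G
#SBATCH --time=00:30:00

module load treemix/1.12 || exit 1
treemix -i testin.gz -o testout \
    -global -bootstrap
EOF

# Check the generated script for syntax errors without running it.
bash -n treemix_explicit.sh && echo "treemix_explicit.sh written"
```

The generated file would be submitted the same way as treemix_test.sh.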

Submit to the queue with sbatch:

b2$ sbatch treemix_test.sh
14257052
b2$ jobhist 14257052

JobId              : 14257052
User               : wresch
Submitted          : 20160209 14:13:50
Submission Path    : /spin1/users/wresch/test_data/treemix
Submission Command : sbatch treemix_test.sh


 Partition       State  Nodes  CPUs      Walltime       Runtime         MemReq  MemUsed  Nodelist
      norm   COMPLETED      1     2      00:30:00      00:00:10      2.0GB/cpu    0.0GB  cn0021
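
Once the job has completed, it can be useful to confirm that every expected output file was written. The following is a minimal sketch, assuming the output stem testout and the set of file suffixes shown in the Helix directory listing. The function name check_treemix_outputs is hypothetical and not part of treemix:

```shell
#!/bin/bash
# Report which of the expected treemix output files are missing or empty
# for a given -o stem. The suffix list is copied from the Helix listing above.
check_treemix_outputs() {
    local stem=$1 missing=0 suffix
    for suffix in cov.gz covse.gz edges.gz llik modelcov.gz treeout.gz vertices.gz; do
        if [ ! -s "${stem}.${suffix}" ]; then
            echo "missing or empty: ${stem}.${suffix}" >&2
            missing=$((missing + 1))
        fi
    done
    # Succeed only if nothing was missing.
    [ "$missing" -eq 0 ]
}

if check_treemix_outputs testout; then
    echo "all treemix outputs present"
else
    echo "treemix output incomplete"
fi
```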

Swarm of jobs on Biowulf

Create a swarm command file similar to the following example:

treemix -i input1.gz -o output1
treemix -i input2.gz -o output2
treemix -i input3.gz -o output3
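
For more than a handful of inputs, the command file can be generated with a loop instead of being written by hand. The following is a minimal sketch, assuming inputs named input1.gz through input3.gz as in the example above. The file name treemix_swarm.cmds is an illustrative choice:

```shell
#!/bin/bash
# Write one treemix command per line into the swarm command file.
# The indices 1..3 mirror the example above; extend the list as needed.
for i in 1 2 3; do
    echo "treemix -i input${i}.gz -o output${i}"
done > treemix_swarm.cmds

# Show the generated command file.
cat treemix_swarm.cmds
```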

Then submit it to the queue with swarm:

b2$ swarm -f treemix_swarm.cmds

Interactive job on Biowulf

Allocate an interactive session with sinteractive and run treemix as described above:

b2$ sinteractive
node$ module load treemix
node$ treemix -i input1.gz -o output1
[...snip...]
node$ exit
b2$