Biowulf High Performance Computing at the NIH
AncestryMap on Biowulf

AncestryMap is a software package for finding skews in ancestry that are potentially associated with disease genes in recently mixed populations.

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program.
Sample session (user input follows the $ prompt):

[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load AncestryMap
[+] Loading AncestryMap 6210 ...

[user@cn3144 examples]$ cp $AMT_DATA/* .
The ancestrymap executable is run by passing it a parameter file with the -p command-line option:
[user@cn3144 examples]$ ancestrymap -p param0
parameter file: param0
output: (null)
### THE INPUT PARAMETERS
PARAMETER NAME: VALUE
risk: 1.5
indivname: indiv.dat
snpname: snpcnts
genotypename: geno.dat
tlreest: YES
seed: 1011
splittau: YES
fancyxtheta: YES
checkit: YES
details: YES
numburn: 0
numiters: 0
emiter: 10
dotoysim: NO
cleaninit: YES
reestiter: 5
indoutfilename: indjunk
snpoutfilename: snpjunk
## ANCESTRYMAP version: 6210

###GENETIC DISTANCE FOR ALL CHROMOSOMES
##Chr_Num: chromosome num, First_SNP and Last_SNP: First and last markers, Gen_dist: Genetic distance
         Chr_Num    First_SNP     Last_SNP           Gen_dist
chrom:     1  first:     0  last:  478 distance:     2.834
chrom:     2  first:   479  last:  910 distance:     2.647
chrom:     3  first:   911  last:  1276 distance:     2.227
...
chrom:    20  first:  5330  last:  5492 distance:     1.067
chrom:    21  first:  5493  last:  5586 distance:     0.604
chrom:    22  first:  5587  last:  5699 distance:     0.711
chrom:    23  first:  5700  last:  5902 distance:     1.180
total distance:    36.242
calling setstatus

emiter: 10
reestiter: 5
trashdir: /var/tmp
###HETXCHECK RESULTS BEGIN
##                      SNP_ID  NUM_HET NUM_HOMOZY
hetxcheck            rs6530109      0      600
hetxcheck            rs2128516      0      600
...
hetxcheck           rs10127175      0      600
hetxcheck            rs5945413      0        0
hetxcheck             rs884840      0      600

###COUNTS
Num of fake Markers: 3622  Num of real Markers: 2281 Spacing between fake markers:     0.010
Num of Markers: 5903   Num of Samples:  1201
Num of Cases:  601 Num of Controls:  600   Num of Ignored Samples: 0
dup? toyindiv:539 toyindiv:1200
 match: 2260 mismatch: 0   2260 2260
dup.  toyindiv:539 ignored

### CHECKGENO RESULTS FOLLOW:
Num good genotypes: 2714260  Num bad genotypes:  0

###PHYSCHECK RESULTS FOLLOW:
##        SNP1_ID        SNP2_ID   SNP1_GEN_POS   SNP2_GEN_POS  SNP1_PHYS_POS  SNP2_PHYS_POS
###HWCHECK RESULTS FOLLOW:
##                   SNP_ID  SNP_INDEX    CHR_NUM   HW_SCORE
hwcheck             rs819980     0        1       -1.595
hwcheck           rs10907185     1        1        0.376
hwcheck             rs897634     4        1       -1.861
...
hwcheck            rs4824056  5694       22       -2.308
hwcheck             rs138823  5695       22       -2.112
hwcheck            rs6520141  5696       22       -1.517
hwcheck            rs8142477  5698       22       -2.327
hwcheck             rs140514  5699       22       -0.021

hwstats (chrom  1) ave:    -1.110 sigma:   -15.543
hwstats (chrom  2) ave:    -1.330 sigma:   -17.183
hwstats (chrom  3) ave:    -1.278 sigma:   -15.286
hwstats (chrom  4) ave:    -1.003 sigma:   -11.304
hwstats (chrom  5) ave:    -1.322 sigma:   -15.698
hwstats (chrom  6) ave:    -1.071 sigma:   -11.536
hwstats (chrom  7) ave:    -1.088 sigma:   -12.016
hwstats (chrom  8) ave:    -1.157 sigma:   -12.247
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
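
The sample run prints many hwcheck lines, so it can help to filter a saved copy of the output. The following is a minimal sketch, assuming the output was redirected to a file and that hwcheck lines keep the column layout shown in the session above (SNP_ID, SNP_INDEX, CHR_NUM, HW_SCORE). The function name hw_outliers and the threshold argument are illustrative and not part of AncestryMap.

```shell
#!/bin/bash
# Hypothetical helper: list SNPs whose HW_SCORE magnitude is at least a threshold.
# Usage: hw_outliers <output-file> <threshold>
hw_outliers() {
    awk -v t="$2" '
        # hwcheck data lines have 5 fields: tag, SNP_ID, SNP_INDEX, CHR_NUM, HW_SCORE
        $1 == "hwcheck" && NF == 5 {
            s = $5 + 0
            if (s < 0) s = -s          # absolute value of the score
            if (s >= t) print $2, $4, $5   # SNP_ID, CHR_NUM, HW_SCORE
        }
    ' "$1"
}

# Example: hw_outliers data0.out 2
```

The header and hwstats lines are skipped because their first field is not exactly "hwcheck".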

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. AncestryMap.sh). For example:

#!/bin/bash
# this file is called AncestryMap.sh

module load AncestryMap

cp $AMT_DATA/* .

ancestrymap -p param0 > data0.out
ancestrymap -p param1 > data1.out
ancestrymap -p param2 > data2.out

Submit this job using the Slurm sbatch command.

sbatch  [--mem=#] AncestryMap.sh
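
The three near-identical ancestrymap commands in the batch script could also be written as a loop. This is a minimal sketch, assuming the parameter files are named param0, param1, and so on, as in the example. The function name run_all and the ANCESTRYMAP_BIN variable are hypothetical, introduced only so the loop can be tried without the module loaded.

```shell
#!/bin/bash
# Hypothetical helper: run ancestrymap once per parameter file.
# ANCESTRYMAP_BIN is an illustrative override; by default the
# ancestrymap executable provided by the module is used.
run_all() {
    local bin="${ANCESTRYMAP_BIN:-ancestrymap}"
    local p
    for p in "$@"; do
        # param0 -> data0.out, param1 -> data1.out, ...
        "$bin" -p "$p" > "data${p#param}.out" || return 1
    done
}

# Example (inside the batch script, after "module load AncestryMap"):
#   run_all param0 param1 param2
```

The loop stops at the first failing run, so later output files are not written after an error.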

Swarm of Jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.

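Create a swarmfile (e.g. AncestryMap.swarm). Each line of a swarmfile runs as an independent subjob, so copy the input data into the working directory beforehand (cp $AMT_DATA/* . after loading the module) rather than from inside the swarmfile. For example:

ancestrymap -p param0 > data0.out
ancestrymap -p param1 > data1.out
ancestrymap -p param2 > data2.out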

Submit this job using the swarm command.

swarm -f AncestryMap.swarm [-g #] --module AncestryMap

where

-g #                  Number of gigabytes of memory required for each process (1 line in the swarm command file)
--module AncestryMap  Loads the AncestryMap module for each subjob in the swarm
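
With many parameter files, the swarmfile can be generated instead of written by hand. This is a minimal sketch, assuming the paramN / dataN.out naming used in the example. The function name make_swarmfile is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print one ancestrymap command per parameter file,
# in the same form as the ancestrymap lines of the example swarmfile.
make_swarmfile() {
    local p
    for p in "$@"; do
        # param0 -> "ancestrymap -p param0 > data0.out"
        printf 'ancestrymap -p %s > data%s.out\n' "$p" "${p#param}"
    done
}

# Example: make_swarmfile param0 param1 param2 > AncestryMap.swarm
```

The resulting file can then be submitted with the swarm command shown above.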