High-Performance Computing at the NIH
Athlates on Biowulf & Helix

ATHLATES is a software package for determining HLA genotypes of individuals from Illumina exome sequencing data. It has been tested with 101-base paired-end reads from standard whole-exome kits, but should be applicable to any paired-end sequencing data of similar or greater read length and accuracy that covers the HLA exons at sufficient depth (including whole-genome shotgun and RNA-Seq data).

ATHLATES was developed at the Broad Institute and MIT. [Athlates website]

Running Athlates on Helix

The following sample command uses the demo data provided with the program to run Athlates on exome-seq data from individual HG01756 of the 1000 Genomes Project.

helix% module load athlates

helix% typing -bam $ATHLATES_HOME/demo/HG01756/HG01756_a.sort.bam -exlbam $ATHLATES_HOME/demo/HG01756/HG01756_na.sort.bam -msa $ATHLATES_HOME/db/msa/A_nuc.txt -o ./output 

--------------------------------------------------------
Running cmd:
typing   -bam /usr/local/apps/athlates/Athlates_2014_04_26/demo/HG01756/HG01756_a.sort.bam -exlbam /usr/local/apps/athlates/Athlates_2014_04_26/demo/HG01756/HG01756_na.sort.bam -msa /usr/local/apps/athlates/Athlates_2014_04_26/db/msa/A_nuc.txt -hd 2 -o ./output
--------------------------------------------------------

        Obtain (un)paired-read names in /usr/local/apps/athlates/Athlates_2014_04_26/demo/HG01756/HG01756_na.sort.bam

                #s&p names: 17585, 71692

                create file: ./output.raw.fa

        Check BAM header of /usr/local/apps/athlates/Athlates_2014_04_26/demo/HG01756/HG01756_a.sort.bam

        [WARNING]: /usr/local/apps/athlates/Athlates_2014_04_26/demo/HG01756/HG01756_a.sort.bam sort order: unsorted

                Input BAM file should be sorted by queryname


                7623 ref sequence(s) found; avg len = 858

                0 platform(s) found: 

        Merge PE for /usr/local/apps/athlates/Athlates_2014_04_26/demo/HG01756/HG01756_a.sort.bam


                create file: ./output.unpair.fa


                create file: ./output.pair.fa

                Total, merged, unmerged, singleton:
                         116354, 3238(320 overlap), 0, 404

                target, non_target, ratio:
                7065    185     0.0261854

        Discard reads containing unique 50-mers
                        average freq = 33
        Assemble 3642 reads...

                iter: 0
                seed_k = 150    140     130     120     110     100     90      80      70      60     50
                iter: 1
                seed_k = 150    140     130     120     110     100     90      80      70      60     50
        10 contigs created 


                create file: ./output.contig.fa


                create file: ./output.contig.detail.txt

        Read in cDNA MSA: /usr/local/apps/athlates/Athlates_2014_04_26/db/msa/A_nuc.txt

                2010 refs in IMGT/HLA database;  MSA len = 1229

                Exon starts:    1       74      388     684     1027    1144    1177    1225

        Typing ...

                seed contigs...
                go through each cDNA...

                create file: ./output.typing.txt

Program finished successfully !
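
The log above includes a warning that the input BAM header reports its sort order as unsorted, while the program expects input sorted by queryname. As a minimal sketch, the helper below reads SAM header text on standard input and prints the value of the SO tag from the @HD line. The name sort_order is hypothetical and not part of Athlates, and the sketch assumes the header uses the usual tab-separated TAG:VALUE fields.

```shell
# Hypothetical helper, not part of Athlates: print the SO (sort order) value
# from the @HD line of SAM header text read on stdin.
# Prints "unknown" when there is no @HD line or no SO tag.
sort_order() {
    awk -F'\t' '
        /^@HD/ {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SO:/ && !found) {
                    print substr($i, 4)   # drop the leading "SO:"
                    found = 1
                }
            }
        }
        END { if (!found) print "unknown" }
    '
}
```

If samtools is available (an assumption; this page does not mention it), the header could be piped in with "samtools view -H input.bam | sort_order", and a name-sorted copy could be produced with "samtools sort -n".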

The output files, named with the prefix given to -o (here, output), will appear in the current working directory.

-rw-r--r-- 1 susanc staff  871586 Jan 20 09:08 output.raw.fa
-rw-r--r-- 1 susanc staff       0 Jan 20 09:08 output.pair.fa
-rw-r--r-- 1 susanc staff 1030972 Jan 20 09:08 output.unpair.fa
-rw-r--r-- 1 susanc staff   16482 Jan 20 09:08 output.contig.fa
-rw-r--r-- 1 susanc staff  580282 Jan 20 09:08 output.contig.detail.txt
-rw-r--r-- 1 susanc staff    5448 Jan 20 09:08 output.typing.txt
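
After a run, it can be useful to confirm that all six files in the listing above were written. The sketch below is one way to do that; check_outputs is a hypothetical helper name, and the suffix list is taken only from the listing above. Empty files produce a note rather than an error, because output.pair.fa is 0 bytes in the demo run.

```shell
# Hypothetical helper: confirm that each file from the listing above exists
# for a given -o prefix. Returns non-zero if any file is missing.
# Empty files only produce a note, since output.pair.fa is 0 bytes in the demo.
check_outputs() {
    prefix=$1
    missing=0
    for suffix in raw.fa pair.fa unpair.fa contig.fa contig.detail.txt typing.txt; do
        f="$prefix.$suffix"
        if [ ! -e "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        elif [ ! -s "$f" ]; then
            echo "note: $f is empty" >&2
        fi
    done
    return "$missing"
}
```

For the demo command above, this would be called as "check_outputs ./output".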

Running an Athlates job on Biowulf

Set up a batch script along the following lines.

#!/bin/bash
#
# this file is called myjob.bat
#
module load athlates
cd /data/$USER/mydir
typing -bam $ATHLATES_HOME/demo/HG01756/HG01756_a.sort.bam  \
	-exlbam $ATHLATES_HOME/demo/HG01756/HG01756_na.sort.bam \
	-msa $ATHLATES_HOME/db/msa/A_nuc.txt -o ./output 

Submit this job with:

qsub -l nodes=1 myjob.bat
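
A batch job that fails on a mistyped path is only noticed after it has waited in the queue and started, so a batch script like the one above may benefit from checking its inputs first. The following is a minimal sketch of such a guard; require_files is a hypothetical function name, not something provided by the athlates module.

```shell
# Hypothetical guard for a batch script: returns non-zero if any of the
# listed input files is missing or unreadable, reporting each one on stderr.
require_files() {
    status=0
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "cannot read input: $f" >&2
            status=1
        fi
    done
    return "$status"
}
```

In myjob.bat, the function could be defined after "module load athlates" and called before the typing command, for example as "require_files $ATHLATES_HOME/db/msa/A_nuc.txt || exit 1", listing the two BAM files in the same call.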

Running Athlates interactively on Biowulf

Allocate an interactive node, load the athlates module, and run the program. A sample session follows.

biowulf% qsub -I -l nodes=1
qsub: waiting for job 6395753.biobos to start
qsub: job 6395753.biobos ready

[susanc@p1465 ~]$ module load athlates

[susanc@p1465 ~]$ cd /data/$USER/mydir

[susanc@p1465 ~]$ typing -bam $ATHLATES_HOME/demo/HG01756/HG01756_a.sort.bam -exlbam $ATHLATES_HOME/demo/HG01756/HG01756_na.sort.bam -msa $ATHLATES_HOME/db/msa/A_nuc.txt -o ./output 

[...]
Program finished successfully !

[susanc@p1465 ~]$ exit
qsub: job 6395753.biobos completed

biowulf%

Documentation

Athlates User Manual