adVNTR on Biowulf

Quick Links

adVNTR is a tool for genotyping Variable Number Tandem Repeats (VNTR) from sequence data. It works with both NGS short reads (Illumina HiSeq) and SMRT reads (PacBio) and finds diploid repeating counts for VNTRs and identifies possible mutations in the VNTR sequences.

References:

Bakhtiari, M., Shleizer-Burko, S., Gymrek, M., Bansal, V. and Bafna, V. Targeted genotyping of variable number tandem repeats with adVNTR. Genome Research, 28(11), pp.1709-1719. Journal

Documentation

advntr on GitHub

Important Notes

Module Name: advntr (see the modules page for more information)
Example files in $ADVNTR_TEST_DATA
The warning messages at the top regarding cudart library when running advntr can be safely ignored.

Interactive job

Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program. Sample session:

[user@biowulf]$ sinteractive --mem=5g --gres=lscratch:20
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144]$ cd /lscratch/$SLURM_JOB_ID
[user@cn3144]$ module load advntr
[user@cn3144]$ cp ${ADVNTR_TEST_DATA} .
[user@cn3144]$ cd TEST_DATA
[user@cn3144]$ mkdir log_dir
[user@cn3144]$ advntr genotype --vntr_id 301645 --alignment_file CSTB_2_5_testdata.bam --working_directory log_dir/
    2021-04-30 18:46:02.676445: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
    301645
    2/2

Batch job

Most jobs should be run as batch jobs.

Create a batch script file (e.g. advntr.sh). For example:

#!/bin/bash
cd /lscratch/$SLURM_JOB_ID
module load advntr
cp $ADVNTR_TEST_DATA .
cd TEST_DATA
advntr genotype --vntr_id 301645 --alignment_file CSTB_2_5_testdata.bam --working_directory log_dir/
....
....

Submit this job using the Slurm sbatch command.

sbatch --mem=10g --gres=lscratch:20 advntr.sh