vcflib is a C++ library for parsing Variant Call Format (VCF) files and a set of command line tools based on that library.
The following tools are currently available:
There may be multiple versions of vcflib available. An easy way of selecting the version is to use modules. To see the modules available, type
module avail vcflib
To select a module use
module load vcflib/[version]
where [version] is the version of choice.
Loading the module modifies the following environment variables:
$PATH
$CPATH
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144 ~]$ vcf=/fdb/GATK_resource_bundle/hg19-2.8/CEUTrio.HiSeq.WGS.b37.bestPractices.hg19.vcf.gz
[user@cn3144 ~]$ vcfsamplenames $vcf
NA12878
NA12891
NA12892
[user@cn3144 ~]$ zcat $vcf | vcfcountalleles
12935193
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
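As a cross-check on the vcfsamplenames output, the sample names can also be read straight from the VCF header: the `#CHROM` line carries nine fixed columns (CHROM through FORMAT) followed by one column per sample. The following is a minimal sketch using only awk. The helper name `vcf_sample_names` and the toy VCF are invented for illustration and are not part of vcflib.

```shell
#!/bin/bash
# Hypothetical helper (not part of vcflib): print the sample names of an
# uncompressed VCF read from stdin, one per line.
vcf_sample_names() {
    awk -F'\t' '
        /^#CHROM/ {
            # Columns 1-9 are fixed (CHROM..FORMAT); samples start at column 10.
            for (i = 10; i <= NF; i++) print $i
            exit
        }'
}

# Toy three-sample VCF, made up for this demonstration.
demo_vcf=$(mktemp)
printf '##fileformat=VCFv4.2\n' > "$demo_vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNA12878\tNA12891\tNA12892\n' >> "$demo_vcf"
printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t0/0\t1/1\n' >> "$demo_vcf"

vcf_sample_names < "$demo_vcf"
```

For a compressed file such as the one in the session above, the helper would be fed through zcat, for example `zcat $vcf | vcf_sample_names`.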
Create a batch script (for example, vcf2tsv.sh) similar to the following:
#! /bin/bash

function fail() {
    echo "$@" >&2
    exit 1
}

rb=/fdb/GATK_resource_bundle/hg19-2.8
module load vcflib || fail "could not load vcflib module"
module load samtools/1.2 || fail "could not load samtools module"
tabix -h ${rb}/CEUTrio.HiSeq.WGS.b37.bestPractices.hg19.vcf.gz chr1:1-100000 \
  | vcf2tsv > CEUTrio.out

Submit to the queue with sbatch:
b2$ sbatch vcf2tsv.sh
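The fail helper in the batch script stops the job with a message as soon as a module cannot be loaded. The same guard can be applied to the tools themselves before the pipeline starts. The following is a minimal sketch; `require_cmd` is a hypothetical helper, not part of vcflib or of the cluster environment.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the named command is on $PATH,
# otherwise report to stderr and return non-zero.
require_cmd() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "required command not found: $1" >&2
        return 1
    }
}

# In a batch script this would follow the module load lines, e.g.:
#   require_cmd tabix   || exit 1
#   require_cmd vcf2tsv || exit 1

# Demonstration with a command that exists and one that should not.
if require_cmd sh; then
    echo "sh: found"
fi
if ! require_cmd no_such_tool_xyz 2>/dev/null; then
    echo "no_such_tool_xyz: missing"
fi
```

Checking up front keeps a missing tool from surfacing only as an empty or truncated output file at the end of the pipe.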
Create a swarm command file (for example, vcfannotate.swarm) with one command per line, similar to the following:
vcfannotate -b enhancers.bed -k enh sample1.vcf > sample1_anno.vcf
vcfannotate -b enhancers.bed -k enh sample2.vcf > sample2_anno.vcf
Then submit to the queue with swarm, here requesting 5 GB of memory per process (-g 5):
b2$ swarm -f vcfannotate.swarm -g 5
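With many samples, the swarm command file can be generated rather than written by hand. The following is a minimal sketch, assuming the inputs are `*.vcf` files in a single directory and use the same enhancers.bed annotation as in the example above. The helper name `make_swarm` and the demo files are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print one vcfannotate command per *.vcf file
# found in the directory given as $1.
make_swarm() {
    for vcf in "$1"/*.vcf; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$vcf" ] || continue
        # Skip outputs of a previous run so they are not annotated again.
        case "$vcf" in *_anno.vcf) continue ;; esac
        echo "vcfannotate -b enhancers.bed -k enh $vcf > ${vcf%.vcf}_anno.vcf"
    done
}

# Demonstration with two empty placeholder files in a temporary directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/sample1.vcf" "$demo_dir/sample2.vcf"
make_swarm "$demo_dir" > "$demo_dir/vcfannotate.swarm"
cat "$demo_dir/vcfannotate.swarm"
```

The generated file is then submitted exactly as shown above with `swarm -f`.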