svtk on Biowulf
SVTK: Utilities for consolidating, filtering, resolving, and annotating structural variants. It was written by the Talkowski lab.
References:
- Ryan L. Collins et al. A structural variation reference for medical and population genetics. Nature 2020. PubMed | Journal
Documentation
- Source code repository: on GitHub
Important Notes
- Module Name: svtk (see the modules page for more information)
- Example data in $SVTK_TEST_DATA
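As a minimal sketch, assuming that loading the module sets SVTK_TEST_DATA to a readable directory (its contents are not listed here), the example data could be copied into a working directory along these lines. The function name stage_svtk_test_data and the default destination svtk_test are hypothetical and used only for illustration.

```shell
#!/bin/bash
# Hypothetical helper: copy the svtk example data into a working directory.
# Assumes SVTK_TEST_DATA is set, normally by `module load svtk`.
stage_svtk_test_data() {
    local dest="${1:-svtk_test}"   # destination directory (illustrative default)
    if [ -z "${SVTK_TEST_DATA:-}" ] || [ ! -d "$SVTK_TEST_DATA" ]; then
        echo "SVTK_TEST_DATA is not set or is not a directory; run 'module load svtk' first" >&2
        return 1
    fi
    # Copy the directory contents (not the directory itself) into $dest
    mkdir -p "$dest" && cp -r "$SVTK_TEST_DATA"/. "$dest"/
}
```

After loading the module in an interactive session, this might be used as `stage_svtk_test_data ./svtk_test && ls ./svtk_test`.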
Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=4g
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load svtk
[+] Loading svtk 0.1
[user@cn3144 ~]$ svtk
usage: svtk [-h][options]

[ Preprocessing ]
    standardize    Convert SV calls to a standardized format.
    rdtest2vcf     Convert an RdTest-formatted bed to a standardized VCF.
    vcf2bed        Convert a standardized VCF to an RdTest-formatted bed.

[ Algorithm integration ]
    vcfcluster     Cluster SV calls from a list of VCFs. (Generally PE/SR.)
    bedcluster     Cluster SV calls from a BED. (Generally depth.)

[ Statistics ]
    count-svtypes  Count instances of each svtype in each sample in a VCF

[ Read-depth analysis ]
    bincov         Calculate normalized genome-wide depth of coverage.
    rdtest*        Calculate comparative coverage statistics at CNV sites.

[ PE/SR analysis ]
    collect-pesr   Count clipped reads and extract discordant pairs genomewide.
    sr-test        Calculate enrichment of clipped reads at SV breakpoints.
    pe-test        Calculate enrichment of discordant pairs at SV breakpoints.

[ Variant analysis ]
    resolve        Resolve complex variants from VCF of breakpoints.
    annotate       Annotate genic effects and ovelrap with noncoding elements.

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
Batch job
Most jobs should be run as batch jobs.
Create a batch input file (e.g. svtk.sh). For example:
#!/bin/bash
module load svtk || exit 1
svtk [options]
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=2 --mem=10g --gres=lscratch:20 svtk.sh
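Because the sbatch command requests node-local scratch space, the batch script can do its work there. The following is only a sketch under stated assumptions: the choice of the vcf2bed subcommand, the file names input.vcf and output.bed, and the argument order (input VCF, then output BED) are illustrative and should be checked against `svtk vcf2bed -h`. On Biowulf, allocated local scratch is available under /lscratch/$SLURM_JOB_ID.

```shell
# Write a hypothetical svtk.sh. The quoted 'EOF' keeps the variables
# unexpanded until the job actually runs.
cat > svtk.sh <<'EOF'
#!/bin/bash
set -e
module load svtk || exit 1

# Work in the node-local scratch allocated with --gres=lscratch:20
cd "/lscratch/${SLURM_JOB_ID}"
cp "${SLURM_SUBMIT_DIR}/input.vcf" .      # input.vcf is a placeholder name

# Assumed argument order (input VCF, output BED); confirm with: svtk vcf2bed -h
svtk vcf2bed input.vcf output.bed

# Copy results back before the job ends and local scratch is cleaned up
cp output.bed "${SLURM_SUBMIT_DIR}/"
EOF
```

The resulting svtk.sh would then be submitted with the same sbatch options as above.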
Swarm of Jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.
Create a swarmfile (e.g. svtk.swarm). For example:
svtk [options]
svtk [options]
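When there are many inputs, the swarmfile can be generated rather than written by hand. The sketch below writes one command line per VCF file in a directory. The helper name make_svtk_swarm is hypothetical, and the use of `svtk vcf2bed` with an input VCF followed by an output BED is an assumption to verify with `svtk vcf2bed -h`. Paths containing spaces are not handled.

```shell
#!/bin/bash
# Hypothetical helper: write one svtk command per input VCF into a swarmfile.
# Assumes `svtk vcf2bed` takes an input VCF followed by an output BED.
make_svtk_swarm() {
    local vcf_dir="$1"                  # directory containing *.vcf files
    local swarmfile="${2:-svtk.swarm}"  # output swarmfile name
    local vcf
    : > "$swarmfile"                    # start from an empty file
    for vcf in "$vcf_dir"/*.vcf; do
        [ -e "$vcf" ] || continue       # skip the literal pattern if nothing matches
        echo "svtk vcf2bed $vcf ${vcf%.vcf}.bed" >> "$swarmfile"
    done
}
```

For example, `make_svtk_swarm /path/to/vcfs svtk.swarm` would produce a file with one independent command per line, which is the layout swarm expects.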
Submit this job using the swarm command.
swarm -f svtk.swarm -g 10 --module svtk --gres=lscratch:10
where
-g # | Number of gigabytes of memory required for each process (1 line in the swarm command file) |
-t # | Number of threads/CPUs required for each process (1 line in the swarm command file) |
--module svtk | Loads the svtk module for each subjob in the swarm |