CONTRA is a tool for copy number variation (CNV) detection in targeted resequencing data, such as whole-exome capture data. CONTRA calls copy number gains and losses for each target region. Its key strategies include the use of base-level log-ratios to remove GC-content bias, correction for the effect of imbalanced library sizes on log-ratios, and the estimation of log-ratio variation via binning and interpolation. It takes standard alignment formats (BAM/SAM) as input and writes output in variant call format (VCF 4.0) for easy integration with other next-generation sequencing analysis packages.
CONTRA uses environment modules. Type
module load CONTRA
at the prompt.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=8GB --cpus-per-task=2
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144 ~]$ module load CONTRA
[user@cn3144 ~]$ contra.py \
    -t $CONTRAHOME/Test_Files/0247401_D_BED_20090724_hg19_MERGED.bed \
    -s $CONTRAHOME/Test_Files/P0667T_GATKrealigned_duplicates_marked.bam \
    -c $CONTRAHOME/Test_Files/P0667N_GATKrealigned_duplicates_marked.bam \
    -f /fdb/GATK_resource_bundle/b37/human_g1k_v37.fasta \
    -o P0667Test
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
Create a batch script (e.g. CONTRA.sh),
#!/bin/bash
# ---- this file is called CONTRA.sh ---------
module load CONTRA 2>&1
contra.py \
    -t /path/to/bed.file \
    -s /path/to/test.bam \
    -c /path/to/control.bam \
    -f /path/to/reference.fasta \
    -o /path/to/output.folder
modify the paths to point to your own files, and submit it like so:
sbatch --mem=8GB --cpus-per-task=2 CONTRA.sh
contra.py requires two cpus, one per input BAM file. This example assumes that only 8 GB of memory is required.
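As an alternative to editing the paths inside the script, the batch script can take them as positional arguments. The following is only a sketch under stated assumptions: the file name CONTRA_args.sh and the helper function check_inputs are hypothetical and not part of CONTRA, and the contra.py flags are the same five used in the script above.

```shell
#!/bin/bash
# ---- hypothetical variant, called CONTRA_args.sh here ----
# Takes the five paths as positional arguments instead of hard-coding them.

# Return non-zero if any of the given input files is missing.
# (check_inputs is an illustrative helper, not part of CONTRA.)
check_inputs() {
    local f status=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing input file: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Only run when all five arguments are supplied, in this order:
# bed file, test BAM, control BAM, reference FASTA, output folder.
if [ "$#" -eq 5 ]; then
    check_inputs "$1" "$2" "$3" "$4" || exit 1
    module load CONTRA
    contra.py -t "$1" -s "$2" -c "$3" -f "$4" -o "$5"
fi
```

Such a script could then be submitted with the paths appended after the script name, for example:
sbatch --mem=8GB --cpus-per-task=2 CONTRA_args.sh /path/to/bed.file /path/to/test.bam /path/to/control.bam /path/to/reference.fasta /path/to/output.folder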
If there are multiple sets of alignments, CONTRA can be run as a swarm. Create a swarmfile (e.g. CONTRA.swarm),
contra.py -t /path/to/bed1.file -s /path/to/test1.bam -c /path/to/control1.bam -f /path/to/reference.fasta -o /path/to/output1.folder
contra.py -t /path/to/bed2.file -s /path/to/test2.bam -c /path/to/control2.bam -f /path/to/reference.fasta -o /path/to/output2.folder
contra.py -t /path/to/bed3.file -s /path/to/test3.bam -c /path/to/control3.bam -f /path/to/reference.fasta -o /path/to/output3.folder
contra.py -t /path/to/bed4.file -s /path/to/test4.bam -c /path/to/control4.bam -f /path/to/reference.fasta -o /path/to/output4.folder
contra.py -t /path/to/bed5.file -s /path/to/test5.bam -c /path/to/control5.bam -f /path/to/reference.fasta -o /path/to/output5.folder
again after modifying the paths as with the batch job above, and submit it like so:
swarm --module CONTRA -g 8 -t 2 -f CONTRA.swarm
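When the input files follow a regular naming pattern, the swarmfile can be generated with a short loop rather than typed by hand. The sketch below assumes the numbered placeholder names used in the swarmfile above (bed1.file, test1.bam, and so on); the numbering scheme and all /path/to/... values are illustrative and must be replaced with real paths.

```shell
#!/bin/bash
# Sketch: write CONTRA.swarm with one contra.py command per line.
# All /path/to/... values are placeholders, and the 1..5 numbering
# is an assumption taken from the example swarmfile.

ref=/path/to/reference.fasta
swarmfile=CONTRA.swarm

: > "$swarmfile"    # create the swarmfile, or empty it if it already exists
for i in 1 2 3 4 5; do
    echo "contra.py -t /path/to/bed${i}.file -s /path/to/test${i}.bam -c /path/to/control${i}.bam -f ${ref} -o /path/to/output${i}.folder" >> "$swarmfile"
done
```

The resulting CONTRA.swarm contains one command per line and can be submitted with the swarm command shown above.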