CaVEMan on Biowulf

CaVEMan is a single nucleotide variant (SNV) mutation-calling algorithm based on expectation maximisation. It is aimed at detecting somatic mutations in paired (tumour/normal) cancer samples and supports both BAM and CRAM formats via htslib.


Interactive job
Allocate an interactive session and run the program.
Sample session (user input follows the $ prompt):

[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load caveman

[user@cn3144 ~]$ mkdir /data/$USER/caveman; cd /data/$USER/caveman
Copy sample data to the current folder:
[user@cn3144 ~]$ cp -r $CAVEMAN_DATA . 
This command will create a data folder test_data in the current directory. Processing of the data by caveman can be performed in five steps, described in the documentation and reproduced below.

Step 1:
[user@cn3144 ~]$ caveman setup -t test_data/test_mt.bam -n test_data/test_wt.bam -r test_data/genome.fa.fai -g test_data/ign.test
- files ./caveman.cfg.ini and ./alg_bean will be produced in the current directory.

Step 2:
[user@cn3144 ~]$ caveman split -i 1
- file ./splitList.1 will be produced.

For further processing, the name of the splitList file must match the name specified in the file ./caveman.cfg.ini, so either
- edit the file ./caveman.cfg.ini accordingly, or
- store the contents of the file(s) splitList.* in a file named splitList:
[user@cn3144 ~]$ cat ./splitList.* > splitList
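The concatenation above can be tried in a scratch directory without running caveman. This is only a sketch: the temporary directory and the file contents are made-up placeholders, since the real split lists are written by caveman split.

```shell
# Work in a throwaway directory so nothing in /data/$USER/caveman is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Placeholder split lists (hypothetical contents, not real caveman output).
printf 'placeholder_region_A\n' > splitList.1
printf 'placeholder_region_B\nplaceholder_region_C\n' > splitList.2

# Merge all per-index lists into the single file name used downstream.
# The pattern splitList.* requires the dot, so it does not match the
# output file splitList itself.
cat ./splitList.* > splitList

# Count the merged entries.
merged_lines=$(grep -c '' splitList)
echo "splitList has $merged_lines lines"
```

Every line of the per-index files ends up in splitList, so the merged line count should equal the sum of the individual counts.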

Step 3:
[user@cn3144 ~]$ caveman mstep -i 1
Folder ./results/1 will be created.

Step 4:
[user@cn3144 ~]$ caveman merge
- files probs_arr and covs_arr will be created in the current folder.

Step 5:
[user@cn3144 ~]$ caveman estep -i 1  -v 37/GRCh37  -w human
More files will be produced in the folder results/1. Note that specifying the options -v and -w is obligatory for this step.
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
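After the five steps, it can be useful to confirm that the outputs named above are present. The following is a minimal sketch; the function name check_caveman_outputs is hypothetical, and the list of paths is taken only from the step descriptions in this session, so it may not cover every file caveman writes.

```shell
# Report which of the outputs named in the five steps exist in a directory.
# Prints one "ok"/"missing" line per path; returns non-zero if any is missing.
check_caveman_outputs() {
    dir=${1:-.}
    rc=0
    for item in caveman.cfg.ini alg_bean splitList probs_arr covs_arr results/1; do
        if [ -e "$dir/$item" ]; then
            echo "ok       $item"
        else
            echo "missing  $item"
            rc=1
        fi
    done
    return "$rc"
}

# Check the current directory; the || keeps a script using set -e alive.
check_caveman_outputs . || echo "some expected outputs are missing"
```

For the session above, the function would be called as check_caveman_outputs /data/$USER/caveman once Step 5 has finished.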

Batch job
Create a batch input file, for example:

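#!/bin/bash
set -e
module load caveman
# Split step, one job index per iteration
for i in {1..2}; do
      caveman split -i $i
done
# Merge the per-index split lists into the file named in caveman.cfg.ini
cat splitList.? > splitList
# M-step, one job index per iteration
for i in {1..2}; do
      caveman mstep -i $i
done
caveman merge
# E-step; -v and -w are obligatory (values as in the interactive session)
for i in {1..2}; do
      caveman estep -i $i -v 37/GRCh37 -w human
done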
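The batch script hard-codes the index range 1..2. For other data, the ranges could be derived from file line counts, as sketched below. This rests on an assumption that should be checked against the CaVEMan documentation: that split takes one index per line of the .fai index, and that mstep and estep take one index per line of the merged splitList. The helper name count_lines is hypothetical.

```shell
# Count the lines in a file; prints 0 when the file does not exist.
count_lines() {
    if [ -f "$1" ]; then
        # grep -c exits non-zero on an empty file, so guard it for set -e.
        grep -c '' "$1" || true
    else
        echo 0
    fi
}

# Assumed mapping (see the note above): split indices follow the .fai lines,
# mstep/estep indices follow the splitList lines.
n_split=$(count_lines test_data/genome.fa.fai)
n_steps=$(count_lines splitList)
echo "split jobs: $n_split, mstep/estep jobs: $n_steps"

# The counts could then replace the fixed {1..2} ranges, for example:
#   for i in $(seq 1 "$n_split"); do caveman split -i "$i"; done
```

Using seq rather than brace expansion is deliberate here, because {1..$n} is not expanded by bash when the bound is a variable.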

