MToolBox is a highly automated bioinformatics pipeline to reconstruct and analyze human mitochondrial DNA from high-throughput sequencing data. It includes an updated computational strategy to assemble mitochondrial genomes from whole exome and/or genome sequencing data, and an improved fragment-classify tool for haplogroup assignment and for functional and prioritization analysis of mitochondrial variants. It also provides pathogenicity scores, profiles of genome variability and disease associations for mitochondrial variants, and a Variant Call Format (VCF) file featuring allele-specific heteroplasmy.
Allocate an interactive session and run the program.
Sample session (user input in bold):
[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load mtoolbox
[user@cn3144 ~]$ cd /path/to/mtoolbox/input/dir
[user@cn3144 ~]$ MToolBox.sh -i input file
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
Create a batch input file (e.g. mtoolbox.sh). For example:
#!/bin/bash
set -e
module load mtoolbox
cd MToolbox/input/files/dir
MToolBox.sh -i fastq -I -M -r RCRS
Submit this job using the Slurm sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] mtoolbox.sh
Create a swarmfile (e.g. mtoolbox.swarm). For example:
cd /MToolbox/input/dir1 && MToolBox.sh -i <input_format> -r <reference_sequence> -m "<mapExome_options>" -a "<assembleMTgenome_options>" -c "<mt-classifier_options>"
cd /MToolbox/input/dir2 && MToolBox.sh -i <input_format> -r <reference_sequence> -m "<mapExome_options>" -a "<assembleMTgenome_options>" -c "<mt-classifier_options>"
cd /MToolbox/input/dir3 && MToolBox.sh -i <input_format> -r <reference_sequence> -m "<mapExome_options>" -a "<assembleMTgenome_options>" -c "<mt-classifier_options>"
cd /MToolbox/input/dir4 && MToolBox.sh -i <input_format> -r <reference_sequence> -m "<mapExome_options>" -a "<assembleMTgenome_options>" -c "<mt-classifier_options>"
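When there are many input directories, writing the swarmfile by hand becomes tedious. The following is a minimal sketch of a helper that prints one swarm line per directory. The function name make_swarm_lines is hypothetical, the MToolBox.sh options are simply copied from the batch example above as placeholders, and the directory names are assumed to contain no whitespace.

```shell
#!/bin/bash
# Hypothetical helper (not part of MToolBox or swarm): print one swarm
# command line for each directory given as an argument.
# Assumption: the MToolBox.sh options below are placeholders taken from
# the batch example; replace them with the options your data needs.
# Assumption: directory names contain no whitespace.
make_swarm_lines() {
    local dir
    for dir in "$@"; do
        printf 'cd %s && MToolBox.sh -i fastq -I -M -r RCRS\n' "$dir"
    done
}
```

A possible use is to redirect the output into the swarmfile, for example make_swarm_lines /data/sample1 /data/sample2 > mtoolbox.swarm, where the two paths are illustrative only.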
Submit this job using the swarm command.
swarm -f mtoolbox.swarm [-g #] [-t #] --module mtoolbox
where
-g # | Number of Gigabytes of memory required for each process (1 line in the swarm command file) |
-t # | Number of threads/CPUs required for each process (1 line in the swarm command file). |
--module mtoolbox | Loads the MToolBox module for each subjob in the swarm |