MosaicForecast is a machine learning method that leverages read-based phasing and read-level features to accurately detect mosaic SNVs (SNPs, small indels) from NGS data. It builds on existing algorithms to achieve a multifold increase in specificity.


Interactive job
Allocate an interactive session and run the program.
Sample session:

[user@biowulf]$ sinteractive --cpus-per-task=2 --mem=2G
[user@cn3144 ~]$ module load mosaicforecast
[user@cn3144 ~]$
Usage: python bam_dir output_dir ref_fasta input_positions(file format:chr pos-1 pos ref alt sample, sep=\t) min_dp_inforSNPs(int) Umap_mappability(bigWig file,k=24) n_threads_parallel sequencing_file_format(bam/cram)

1. Name of bam files should be "sample.bam" under the bam_dir, and there should be corresponding index files.
2. There should be a fai file under the same dir of the fasta file (samtools faidx input.fa).
3. The "min_dp_inforSNPs" is the minimum depth of coverage of trustworthy neaby het SNPs.
4. Bam file is preferred than cram file, as the program would run much more slowly if using cram format.

[user@cn3144 ~]$ mkdir mosaicforecast_test && cd mosaicforecast_test
[user@cn3144 ~]$ cp -r ${MOSAIC_TESTDATA:-none}/* .
[user@cn3144 ~]$ ./demo/ test_out \
                     /fdb/GATK_resource_bundle/b37-2.8/human_g1k_v37_decoy.fasta \
                     ./demo/test.input 20 2 bam
[user@cn3144 ~]$ exit
[user@biowulf ~]$
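
The usage notes in the session above list several input requirements: a six-column tab-separated positions file, BAM files named after the sample with index files, and a .fai index next to the reference FASTA. A minimal sketch of a pre-flight check follows. The function names check_positions and check_files are hypothetical and not part of MosaicForecast, and the index file names (sample.bam.bai or sample.bai) are an assumption, since the usage text only says "corresponding index files".

```shell
# Sketch only: check_positions and check_files are hypothetical helpers,
# not part of MosaicForecast.

# Check the positions file: 6 tab-separated fields per line
# (chr, pos-1, pos, ref, alt, sample) with column 2 == column 3 - 1.
check_positions() {
    awk -F'\t' '
        BEGIN        { bad = 0 }
        NF != 6      { printf "line %d: expected 6 fields, found %d\n", NR, NF; bad = 1; next }
        $2 != $3 - 1 { printf "line %d: column 2 is not column 3 minus 1\n", NR; bad = 1 }
        END          { exit bad }
    ' "$1"
}

# Check that the reference has a .fai index and that every sample named in
# column 6 has a BAM file plus an index in bam_dir.
# Assumption: indexes are named sample.bam.bai or sample.bai.
check_files() {
    bam_dir=$1; ref=$2; positions=$3; status=0
    [ -f "$ref.fai" ] || { echo "missing $ref.fai (run: samtools faidx $ref)"; status=1; }
    for sample in $(cut -f6 "$positions" | sort -u); do
        bam="$bam_dir/$sample.bam"
        [ -f "$bam" ] || { echo "missing $bam"; status=1; }
        [ -f "$bam.bai" ] || [ -f "$bam_dir/$sample.bai" ] || { echo "missing index for $bam"; status=1; }
    done
    return "$status"
}
```

Both functions print one message per problem and return a non-zero status if anything is missing, so they can be chained with && ahead of the feature extraction step, for example: check_positions demo/test.input && check_files bam_dir ref.fasta demo/test.input (bam_dir and ref.fasta are placeholders).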

Batch job
Create a batch input file. For example:

#!/bin/bash
#SBATCH --cpus-per-task=2
#SBATCH --mem=2G
#SBATCH --time=2:00:00
#SBATCH --partition=norm

set -e
module load mosaicforecast
cp -r ${MOSAIC_TESTDATA:-none}/* .
cp -r ${MOSAIC_MODEL:-none}/* .
Prediction.R demo/test.SNP.features models_trained/250xRFmodel_addRMSK_Refine.rds Refine test.SNP.predictions

Submit the job using the sbatch command.

Swarm of Jobs
Create a swarmfile (e.g. job.swarm). For example:

Prediction.R demo/test.SNP.features models_trained/250xRFmodel_addRMSK_Refine.rds Refine SNP.predictions
Prediction.R demo/test.DEL.features models_trained/deletions_250x.RF.rds Phase DEL.predictions


Submit this job using the swarm command.

swarm -f job.swarm [-g #] --module mosaicforecast
-g #        Number of gigabytes of memory required for each process (1 line in the swarm command file)
--module    Loads the module for each subjob in the swarm
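
When there are many feature files, the swarmfile can be generated rather than typed by hand. The sketch below writes one Prediction.R line per SNP feature file, reusing the Refine model and argument order from the example above. The function name make_swarm and the per-sample naming scheme (sample.SNP.features in, sample.SNP.predictions out) are assumptions made for illustration.

```shell
# Sketch only: make_swarm is a hypothetical helper. It assumes feature files
# are named <sample>.SNP.features and names each output <sample>.SNP.predictions.
make_swarm() {
    features_dir=$1    # directory holding the *.SNP.features files
    model=$2           # path to a trained Refine-type model (.rds)
    for f in "$features_dir"/*.SNP.features; do
        [ -e "$f" ] || continue    # glob matched nothing: skip the literal pattern
        sample=$(basename "$f" .SNP.features)
        echo "Prediction.R $f $model Refine $sample.SNP.predictions"
    done
}
```

For example, make_swarm demo models_trained/250xRFmodel_addRMSK_Refine.rds > job.swarm writes the swarmfile, which can then be submitted with the swarm command shown above.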