MINTIE: identifying novel, rare transcriptional variants in cancer RNA-seq data

MINTIE is a tool for identifying novel, rare transcriptional variants in cancer RNA-seq data. MINTIE detects gene fusions, transcribed structural variants, novel splice variants and complex variants, and annotates all novel transcriptional variants.

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program. Sample session:

[user@biowulf]$ sinteractive --mem=16g -c 16 --gres=lscratch:20
[user@cn3335 ~]$ module load MINTIE
[+] Loading java 12.0.1  ...
[+] Loading MINTIE 0.3.7 ...
Set up references:
[user@cn3335 ~]$ mkdir -p /data/$USER/MINTIE && cd /data/$USER/MINTIE
[user@cn3335 ~]$ git clone https://github.com/Oshlack/MINTIE
[user@cn3335 ~]$ export MINTIE_HOME=$PWD/MINTIE
Add the line "useLegacyTorqueJobPolling=true" to the configuration file:
[user@cn3335 ~]$ sed -i '16i useLegacyTorqueJobPolling=true' MINTIE/bpipe.config
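The sed command above inserts the line every time it is run, so repeating the setup would duplicate the option. A minimal sketch of a guarded variant follows; add_polling_option is a hypothetical helper name, not part of MINTIE, and it assumes GNU sed (for -i and the one-line form of the i command).

```shell
# Hypothetical helper (not part of MINTIE): insert the polling option at
# line 16 of a bpipe config file, but only if it is not already present.
# Assumes GNU sed. Note that '16i' is a no-op on files shorter than 16 lines.
add_polling_option() {
    config="$1"
    if ! grep -qxF 'useLegacyTorqueJobPolling=true' "$config"; then
        sed -i '16i useLegacyTorqueJobPolling=true' "$config"
    fi
}

# Example call (path as used in the session above):
# add_polling_option MINTIE/bpipe.config
```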
Enter the singularity shell and set up references from there as described below (this will only need to be done once). Then exit the shell:
[user@cn3335 ~]$ shell
Singularity> MINTIE/setup_references_hg38.sh
...
gzip: chess3.0.gtf already exists; do you wish to overwrite (y or n)? y
...
Singularity> python MINTIE/util/make_tx2gene_lookup.py ref/chess3.0.gtf > ref/tx2gene.txt
Singularity> python MINTIE/util/make_exon_reference.py ref/chess3.0.gtf  
Singularity> echo -e "ref/chess3.0.info" >  ref/ann_info.success 
Singularity> ANN_INFO='ann_info="/data/'$USER'/MINTIE/ref/ann_info.success"' 
Singularity> sed -i 's|ann_info=""|'"$ANN_INFO"'|g' references.groovy 
Singularity> TX2GENE='tx2gene="/data/'$USER'/MINTIE/ref/tx2gene.txt"' 
Singularity> sed -i 's|tx2gene=""|'"$TX2GENE"'|g' references.groovy 
Singularity> mv references.groovy MINTIE
Singularity> exit
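The pattern used in the session above (build the assignment string in a variable, then substitute it for the empty field with sed) can be wrapped in a small function. This is only a sketch: set_groovy_field is a hypothetical helper, not part of MINTIE, and unlike the commands above it anchors the match at the start of the line.

```shell
# Hypothetical helper (not part of MINTIE): replace an empty field such as
#   tx2gene=""
# with
#   tx2gene="<value>"
# in a references file. '|' is used as the sed delimiter so that paths
# containing '/' need no escaping; values containing '|' or '&' are not handled.
set_groovy_field() {
    file="$1"
    key="$2"
    value="$3"
    sed -i "s|^${key}=\"\"|${key}=\"${value}\"|" "$file"
}

# Example call, mirroring the tx2gene step above:
# set_groovy_field references.groovy tx2gene "/data/$USER/MINTIE/ref/tx2gene.txt"
```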
This creates a folder ref, which contains the references, and a new file MINTIE/references.groovy, which holds the full paths to them. Make sure that every field in MINTIE/references.groovy is filled in:
[user@cn3335 ~]$ cat MINTIE/references.groovy 
// Path to references used by the MINTIE pipeline
gmap_refdir="/data/user/MINTIE/ref/"
genome_fasta="/data/user/MINTIE/ref/hg38.fa"
tx_annotation="/data/user/MINTIE/ref/chess3.0.gtf"
trans_fasta="/data/user/MINTIE/ref/chess3.0.fa"
ann_info="/data/user/MINTIE/ref/ann_info.success"
tx2gene="/data/user/MINTIE/ref/tx2gene.txt"
gmap_refdir="/data/user/MINTIE/ref"
gmap_genome="gmap_genome"
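Rather than inspecting the file by eye, the check for unfilled fields can be scripted. A minimal sketch, assuming every field has the form key="value" as in the listing above; empty_fields is a hypothetical helper name, not part of MINTIE.

```shell
# Hypothetical helper (not part of MINTIE): print the name of every field
# in a references file whose value is still the empty string, one per line.
# Prints nothing when all fields are filled.
empty_fields() {
    sed -n 's/^\([A-Za-z_0-9]*\)=""[[:space:]]*$/\1/p' "$1"
}

# Example call:
# empty_fields MINTIE/references.groovy
```

If the command prints any field names, those entries still need a path before the pipeline is run.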
Now you are ready to run mintie. Set up the test data:
[user@cn3335 ~]$ MINTIE/mintie -t
and run mintie on these data:
[user@cn3335 ~]$ MINTIE/mintie -w -p test_params.txt cases/*.fastq.gz controls/*.fastq.gz   
╒══════════════════════════════════════════════════════════════════════════════════════════════════╕
|                              Starting Pipeline at 2022-12-02 13:05                               |
╘══════════════════════════════════════════════════════════════════════════════════════════════════╛

================================ Stage fastq_dedupe (allvars-case) =================================
...
==================================== Stage trim (allvars-case) =====================================
...
================================== Stage assemble (allvars-case) ===================================
...
============================= Stage create_salmon_index (allvars-case) =============================
...
================================= Stage run_salmon (allvars-case) ==================================
...
================================ Stage run_salmon (allvars-control) ================================
...
=========================== Stage create_ec_count_matrix (allvars-case) ============================
...
=================================== Stage run_de (allvars-case) ====================================
...
========================== Stage filter_on_significant_ecs (allvars-case) ==========================
...
======================== Stage align_contigs_against_genome (allvars-case) =========================
...
============================= Stage sort_and_index_bam (allvars-case) ==============================
...
============================== Stage annotate_contigs (allvars-case) ===============================
...
=============================== Stage refine_contigs (allvars-case) ================================
...
================================ Stage calculate_VAF (allvars-case) ================================
...
================================ Stage post_process (allvars-case) =================================
index file allvars-case/novel_contigs.fasta.fai not found, generating...

======================================== Pipeline Succeeded ========================================
13:06:11 MSG:  Finished at Fri Dec 02 13:06:11 UTC 2022
13:06:11 MSG:  Outputs are:
                allvars-case/vaf_estimates.txt
                allvars-case/allvars-case_results.tsv
                allvars-case/novel_contigs.bam
                allvars-case/novel_contigs.vcf
                allvars-case/all_fasta_index/allvars-case.fasta (pre-existing)
                ... 4 more ...
[user@cn3335 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
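After a run, the main output files named in the log can be checked for presence before any downstream analysis. A minimal sketch; check_outputs is a hypothetical helper, not part of MINTIE, and it assumes that the results table is named after the output directory, as in allvars-case/allvars-case_results.tsv above.

```shell
# Hypothetical helper (not part of MINTIE): verify that the four main output
# files listed in the log exist and are non-empty in the given directory.
# Assumption: the results table is named <directory name>_results.tsv.
# Returns 0 if all files are present, 1 otherwise.
check_outputs() {
    dir="$1"
    sample=$(basename "$dir")
    status=0
    for f in vaf_estimates.txt "${sample}_results.tsv" novel_contigs.bam novel_contigs.vcf; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            status=1
        fi
    done
    return $status
}

# Example call for the test run above:
# check_outputs allvars-case
```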