toga: Tool to infer Orthologs from Genome Alignments
TOGA is a method that integrates gene annotation, orthology inference, and the classification of genes as intact or lost. It uses a machine-learning approach to infer orthologous genes between related species and to accurately distinguish orthologs from paralogs and processed pseudogenes. This tutorial explains how to get started with TOGA: how to install and run it, and how to handle issues that may occur.
References:
- Kirilenko BM, Munegowda C, Osipova E, Jebb D, Sharma V, Blumer M, Morales AE, Ahmed AW, Kontopoulos DG, Hilgers L, Lindblad-Toh K, Karlsson EK, Zoonomia Consortium, Hiller M. Integrating gene annotation with orthology inference at scale. Science, 2023.
Documentation
Important Notes
- Module Name: toga (see the modules page for more information)
- Unusual environment variables set
  - TOGA_GENOME: toga reference directory
  - TOGA_CONFIG: toga config directory
  - TOGA_SUPPLY: toga supplementary directory
  - TOGA_TEST_DATA: sample data for running toga
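After `module load toga`, the variables listed above can be checked from the shell before starting a run. The following is a minimal sketch, not part of TOGA or the module itself; the function name `check_toga_env` is hypothetical.

```shell
# Hypothetical helper (not part of TOGA): print each variable set by the
# toga module, and report any that are unset or empty.
check_toga_env() {
    local var value missing
    missing=0
    for var in TOGA_GENOME TOGA_CONFIG TOGA_SUPPLY TOGA_TEST_DATA; do
        # Look up the variable whose name is stored in $var.
        # eval is safe here because the names come from the fixed list above.
        eval "value=\${$var-}"
        if [ -z "$value" ]; then
            echo "missing: $var" >&2
            missing=1
        else
            echo "$var=$value"
        fi
    done
    # Non-zero exit status if at least one variable was missing
    return "$missing"
}
```

Calling `check_toga_env || echo "run 'module load toga' first"` in an interactive session gives a quick indication of whether the module has been loaded.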
Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=4g --gres=lscratch:10
[user@cn3144 ~]$ module load toga
[+] Loading java 17.0.3.1 ...
[+] Loading singularity 3.10.5 on cn4292
[+] Loading nextflow 22.10.2
[+] Loading toga 1.1.2
[user@cn3144 ]$ cp -r $TOGA_SUPPLY/* .

Run the test data:

[user@cn3144 ]$ toga.py \
    $TOGA_TEST_DATA/hg38.mm10.chr11.chain \
    $TOGA_TEST_DATA/hg38.genCode27.chr11.bed \
    $TOGA_GENOME/hg38.2bit \
    $TOGA_GENOME/mm10.2bit \
    --kt --pn test -i \
    $TOGA_SUPPLY/hg38.wgEncodeGencodeCompV34.isoforms.txt \
    --nc $TOGA_CONFIG \
    --cb 3,5 --cjn 500 --u12 supply/hg38.U12sites.tsv --ms \
    --nextflow_dir ./
Batch job
Most jobs should be run as batch jobs.
Create a batch input file (e.g. toga.sh). For example:
#!/bin/bash
module load toga || exit 1
cp -r $TOGA_SUPPLY/* .
toga.py \
    $TOGA_TEST_DATA/hg38.mm10.chr11.chain \
    $TOGA_TEST_DATA/hg38.genCode27.chr11.bed \
    $TOGA_GENOME/hg38.2bit \
    $TOGA_GENOME/mm10.2bit \
    --kt --pn test -i \
    $TOGA_SUPPLY/hg38.wgEncodeGencodeCompV34.isoforms.txt \
    --nc $TOGA_CONFIG \
    --cb 3,5 --cjn 500 --u12 supply/hg38.U12sites.tsv --ms \
    --nextflow_dir ./
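Because a batch job may wait in the queue before it starts, it can be useful to fail early when an input file is missing. The guard below is a sketch under that assumption and is not part of TOGA; `require_files` is a hypothetical helper name, and the paths in the commented usage simply mirror those in the example script.

```shell
# Hypothetical helper: succeed only if every argument is a regular,
# readable, non-empty file; report each offending path on stderr.
require_files() {
    local f status
    status=0
    for f in "$@"; do
        if [ ! -f "$f" ] || [ ! -r "$f" ] || [ ! -s "$f" ]; then
            echo "ERROR: missing, unreadable, or empty input: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Possible use in toga.sh, after `module load toga` and before toga.py:
# require_files \
#     "$TOGA_TEST_DATA/hg38.mm10.chr11.chain" \
#     "$TOGA_TEST_DATA/hg38.genCode27.chr11.bed" \
#     "$TOGA_GENOME/hg38.2bit" \
#     "$TOGA_GENOME/mm10.2bit" || exit 1
```

The helper checks all paths before returning, so a single run lists every problem file instead of stopping at the first one.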
Submit this job using the Slurm sbatch command.
sbatch toga.sh
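When the submission is scripted, it can help to capture the job ID for later monitoring. The sketch below relies on the `--parsable` option of `sbatch`, which prints only the job ID (optionally followed by `;` and a cluster name); the wrapper name `submit_job` is hypothetical.

```shell
# Hypothetical wrapper: submit a batch script and print only the numeric job ID.
submit_job() {
    local script out
    script=$1
    # --parsable makes sbatch print "jobid" or "jobid;cluster"
    out=$(sbatch --parsable "$script") || return 1
    # Drop the optional ";cluster" suffix
    echo "${out%%;*}"
}

# Possible use:
# jobid=$(submit_job toga.sh) && squeue -j "$jobid"
```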