OMA standalone is a software package that infers orthologs among custom genomes using the OMA algorithm.
It is also possible to export genomes and their homology relations directly from the OMA web-browser and combine them with custom genomes or proteomes.
OMA standalone computes pairwise orthologs and builds two types of groupings from them: OMA Groups and Hierarchical Orthologous Groups (HOGs).
Furthermore, OMA standalone can predict gene function annotations using Gene Ontology terms based on existing annotations from exported genomes, and produces phyletic profiles for OMA Groups and HOGs.
Allocate an interactive session and run the program.
Sample session (user input follows each shell prompt):
[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load OMA
[+] Loading OMA 2.6.0

[user@cn3144 ~]$ OMA -h
/usr/local/apps/OMA/2.6.0/OMA.2.6.0/OMA/bin/OMA - runs OMA standalone

/usr/local/apps/OMA/2.6.0/OMA.2.6.0/OMA/bin/OMA [options] [paramfile]

Runs the standalone version of the Orthologous MAtrix (OMA) pipeline
to infer orthologs among complete genomes. A highlevel description
of its algorithm is available here: http://omabrowser.org/oma/about

The all-against-all Smith-Waterman alignment step of OMA requires a
lot of CPU time. OMA standalone can therefore be run in parallel.
If you intend to use OMA standalone on a HPC cluster with a scheduler
such as LSF, PBS Pro, Slurm or SunGridEngine, you should use the
jobarray option of those systems, e.g.
  bsub -J "oma[1-500]" /usr/local/apps/OMA/2.6.0/OMA.2.6.0/OMA/bin/OMA (on LSF).
  qsub -t 1-500 /usr/local/apps/OMA/2.6.0/OMA.2.6.0/OMA/bin/OMA (on SunGridEngine)

In case you run OMA on a single computer with several cores, use
the -n option.

Options:
  -n <number>  number of parallel jobs to be started on this computer
  -v           version
  -d <level>   increase debug info to <level>. By default level is set to 1.
  -i           interactive session, do not quit in case of error and at the
               end of the run.
  -s           stop after the AllAll phase. This is the part which is
               parallelized. The option can be useful on big datasets that
               require lot of memory for the later phases of OMA. It allows
               to stop after the parallelized step and restart again a
               single process with more memory.
  -c           stop after database conversion. This option is useful if you
               work with a large dataset and/or the filesystem you use is slow.
  -W           maximum amount of Wall-clock time (in secs) that the job should
               run before terminating in a clean way. This option has only an
               effect in the all-against-all phase. If the job terminates
               because it reaches the time limit, it quits with the exit code 99.
  -p           copy the default parameter file to the current directory. This
               is useful if want to analyse a new dataset and previously
               installed OmaStandalone.
  -h/?         this help

  paramfile    path to the parameter file. it defaults to ./parameters.drw

EXIT
  0   normal exit
  1   a general error (i.e. configuration problem) occured
  99  reached timelimit (provided with -W flag)

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
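The help output lists three exit codes (0, 1 and 99). As a rough illustration of how a job script could react to them, the sketch below maps a code to a short status message. The function name classify_oma_exit and the messages are hypothetical and not part of OMA; any code not listed in the help text is treated here as an error, which is an assumption.

```shell
# Sketch only: translate an OMA exit code into a status message.
# classify_oma_exit is a hypothetical helper, not something OMA provides.
classify_oma_exit() {
    case "$1" in
        0)  echo "finished" ;;             # normal exit
        99) echo "time limit reached" ;;   # limit set with -W was hit
        *)  echo "error" ;;                # 1 = general error; others assumed errors
    esac
}

# Example call with a literal code; in a job script the value would be $?
# captured directly after the OMA command.
classify_oma_exit 99
```

In a script that uses set -e, the OMA command would need to be run in a way that does not abort the script before its exit code can be read, for example `OMA -s -W 3000 || rc=$?`, where 3000 is an arbitrary placeholder value.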
Create a batch input file (e.g. oma.sh). For example:
#!/bin/bash
set -e
module load OMA
OMA .....
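The OMA help text recommends the -n option when running on a single machine with several cores. A minimal sketch of how a batch script might tie -n to the CPUs granted by Slurm is shown below. SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is requested; the variable names oma_jobs and oma_cmd and the fallback value of 1 are illustrative choices, not taken from the OMA documentation.

```shell
#!/bin/bash
set -e

# Number of parallel OMA processes: use the CPUs Slurm allocated to this task,
# or fall back to 1 when the variable is unset (e.g. outside a Slurm job).
oma_jobs="${SLURM_CPUS_PER_TASK:-1}"

# Assemble the command line; -n is the option described in the OMA help text.
oma_cmd="OMA -n ${oma_jobs}"
echo "Would run: ${oma_cmd}"

# In a real job script, load the module and run the command instead of echoing:
#   module load OMA
#   OMA -n "${oma_jobs}"
```

Deriving the value from the allocation avoids starting more OMA processes than the job has CPUs for.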
Submit this job using the Slurm sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] oma.sh
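The OMA help text suggests using the scheduler's job array option for the all-against-all phase, and describes -s and -W for stopping after that phase and for exiting cleanly before a time limit. The following is a sketch of what that could look like with Slurm, not a tested recipe. The file name oma_array.sh, the array size of 20, the 4-hour limit and the -W value of 12600 seconds are arbitrary placeholders. The NR_PROCESSES export is an assumption based on OMA's own Slurm instructions and should be checked against the OMA standalone documentation.

```shell
# Write a hypothetical array job script; all sizes and limits are placeholders.
cat > oma_array.sh <<'EOF'
#!/bin/bash
#SBATCH --array=1-20
#SBATCH --time=04:00:00
set -e
module load OMA
# Assumption: tells each task how many array tasks exist; verify in the OMA docs.
export NR_PROCESSES=20
# -s: stop after the AllAll phase; -W: exit cleanly (code 99) after 12600 s,
# which leaves a margin before the 4-hour Slurm limit.
OMA -s -W 12600
EOF

echo "Wrote oma_array.sh; submit it with: sbatch oma_array.sh"
```

Once the array has finished the all-against-all phase, the later phases could be run as a single job with more memory, as the description of -s in the help text suggests. Tasks that exit with code 99 reached the -W limit, so the array would presumably need to be submitted again to finish the remaining alignments.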