OMA on Biowulf

OMA standalone is a package that infers orthologs among custom genomes using the OMA algorithm.

It is also possible to export genomes and their homology relations directly from the OMA web-browser and combine them with custom genomes or proteomes.

OMA standalone computes pairwise orthologs and, from those, constructs two types of groupings: OMA Groups and Hierarchical Orthologous Groups (HOGs).

Furthermore, OMA standalone can predict gene function annotations using Gene Ontology terms based on existing annotations from exported genomes, and produces phyletic profiles for OMA Groups and HOGs.

References:

Tonkin-Hill G, MacAlasdair N, Ruis C, Weimann A, Horesh G, Lees JA, Gladstone RA, Lo S, Beaudoin C, Floto RA, Frost SDW, Corander J, Bentley SD, Parkhill J. Producing polished prokaryotic pangenomes with the Panaroo pipeline. Genome Biol 21, 180 (2020). https://doi.org/10.1186/s13059-020-02090-4

Documentation

OMA standalone page

Important Notes

Module Name: OMA (see the modules page for more information)

Interactive job

Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program.
Sample session:

[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load OMA
[+] Loading OMA  2.6.0

[user@cn3144 ~]$ OMA -h
/usr/local/apps/OMA/2.6.0/OMA.2.6.0/OMA/bin/OMA - runs OMA standalone

/usr/local/apps/OMA/2.6.0/OMA.2.6.0/OMA/bin/OMA [options] [paramfile]

Runs the standalone version of the Orthologous MAtrix (OMA) pipeline
to infer orthologs among complete genomes. A highlevel description
of its algorithm is available here: http://omabrowser.org/oma/about

The all-against-all Smith-Waterman alignment step of OMA requires
a lot of CPU time. OMA standalone can therefore be run in parallel.
If you intend to use OMA standalone on a HPC cluster with a scheduler
such as LSF, PBS Pro, Slurm or SunGridEngine, you should use the
jobarray option of those systems,
e.g. bsub -J "oma[1-500]" /usr/local/apps/OMA/2.6.0/OMA.2.6.0/OMA/bin/OMA (on LSF).
     qsub -t 1-500 /usr/local/apps/OMA/2.6.0/OMA.2.6.0/OMA/bin/OMA (on SunGridEngine)
In case you run OMA on a single computer with several cores, use
the -n option.

Options:
  -n <number>   number of parallel jobs to be started on this computer
  -v            version
  -d <level>    increase debug info to <level>. By default level is set to 1.
  -i            interactive session, do not quit in case of error and at the end
                of the run.
  -s            stop after the AllAll phase. This is the part which is parallelized.
                The option can be useful on big datasets that require lot of
                memory for the later phases of OMA. It allows to stop after the
                parallelized step and restart again a single process with more
                memory.
  -c            stop after database conversion. This option is useful if you
                work with a large dataset and/or the filesystem you use is
                slow.
  -W <secs>     maximum amount of Wall-clock time (in secs) that the job should
                run before terminating in a clean way. This option has only an
                effect in the all-against-all phase. If the job terminates
                because it reaches the time limit, it quits with the exit
                code 99.
  -p            copy the default parameter file to the current directory. This
                is useful if want to analyse a new dataset and previously
                installed OmaStandalone.
  -h/?          this help

paramfile       path to the parameter file. it defaults to ./parameters.drw


EXIT
   0            normal exit
   1            a general error (i.e. configuration problem) occured
  99            reached timelimit (provided with -W flag)


[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
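
The exit codes listed in the help output above can be checked from a wrapper script, for example when the all-against-all phase is run with a wall-clock limit. The following is a minimal sketch and not part of OMA itself: the function names describe_oma_exit and run_allall are hypothetical, and it assumes that the OMA module is loaded and that a parameter file is present in the current directory.

```bash
#!/bin/bash
# Hypothetical helpers (not part of OMA): interpret the exit codes listed in `OMA -h`.

# Print a short description for an OMA exit status.
describe_oma_exit() {
    case "$1" in
        0)  echo "normal exit" ;;
        1)  echo "general error" ;;
        99) echo "time limit reached" ;;
        *)  echo "unexpected exit status" ;;
    esac
}

# Run only the all-against-all phase (-s) with a wall-clock limit in seconds (-W),
# report how OMA finished, and return OMA's own exit status.
run_allall() {
    local seconds="$1" status=0
    OMA -s -W "$seconds" || status=$?
    echo "OMA all-against-all: $(describe_oma_exit "$status")"
    return "$status"
}
```

With these functions defined, a call such as run_allall 3600 would stop the all-against-all phase after at most 3600 seconds; the value is only an illustration.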

Batch job

Most jobs should be run as batch jobs.

Create a batch input file (e.g. oma.sh). For example:

#!/bin/bash
set -e
module load OMA
OMA .....

Submit this job using the Slurm sbatch command.

sbatch [--cpus-per-task=#] [--mem=#] oma.sh
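
According to the help output, the -n option sets the number of parallel jobs started on one computer. A minimal sketch of an oma.sh that ties -n to the CPU count requested with --cpus-per-task might look like the following. It assumes a parameters.drw file in the submission directory. SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is given; the fallback to 1, the variable names, and the dry-run branch are additions for illustration only.

```bash
#!/bin/bash
set -e

# Number of parallel OMA jobs: the CPUs granted by Slurm, or 1 outside a job.
nproc_oma="${SLURM_CPUS_PER_TASK:-1}"
oma_cmd=(OMA -n "$nproc_oma")

# Load the module only where the module command exists (i.e. on the cluster).
if command -v module >/dev/null 2>&1; then
    module load OMA
fi

if command -v OMA >/dev/null 2>&1; then
    "${oma_cmd[@]}"
else
    # Illustration only: print the command when OMA is not available.
    echo "would run: ${oma_cmd[*]}"
fi
```

Submitted with, for example, sbatch --cpus-per-task=8 --mem=16g oma.sh, this sketch would run OMA -n 8; the values 8 and 16g are placeholders, not recommendations.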