Biowulf High Performance Computing at the NIH
MEGA on Biowulf

MEGA (Molecular Evolutionary Genetics Analysis) is a suite of tools for exploring and analyzing DNA and protein sequences in evolutionary and population genetics. It includes methods for multiple sequence alignment, tree inference, time trees, models of evolution, diversity metrics, selection metrics, ancestral state reconstruction, and evolutionary rates.

The application can be run either as a GUI (command mega) or through a command-line interface (command megacc). To run the GUI, we recommend connecting to Biowulf using NoMachine. Besides running analyses, the GUI can create analysis options files (.mao files) that are passed to the command-line application. This is helpful for running large analyses with sbatch or swarm.


Important Notes

Running the GUI version of this application requires a graphical connection using NX (NoMachine).

The BLAST and GenBank features in the MEGA GUI will not work on Biowulf, since compute nodes are behind a firewall and MEGA is not able to use proxies. We recommend installing the GUI on your desktop to access these features. You can save a MEGA data session file and transfer it to Biowulf to reopen in MEGA in an sinteractive session.

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program.
Sample session (user input follows each prompt):

[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load MEGA

[user@cn3144 ~]$ mega

Batch job
Most jobs should be run as batch jobs.

Create a batch input file. For example:

#!/bin/bash
set -e
module load MEGA
megacc -a analysis.mao -d data.meg

Submit this job using the Slurm sbatch command.

sbatch [--cpus-per-task=#] [--mem=#] <batch script>

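
The resource options shown on the sbatch command line can also be recorded inside the batch script as #SBATCH directives, which keeps the request together with the commands it applies to. The following is a minimal sketch, not the site's reference script: the file name megacc_job.sh and the values of 2 CPUs and 4 GB are placeholders chosen for illustration and should be adjusted to the analysis.

```shell
#!/bin/bash
# Write a batch script with the resource request embedded as #SBATCH lines.
# megacc_job.sh, 2 CPUs and 4g are illustrative placeholders (assumptions).
cat > megacc_job.sh <<'EOF'
#!/bin/bash
#SBATCH --cpus-per-task=2
#SBATCH --mem=4g
set -e
module load MEGA
megacc -a analysis.mao -d data.meg
EOF

# Submit afterwards with:
#   sbatch megacc_job.sh
```

Options given on the sbatch command line take precedence over the #SBATCH lines in the script, so the same file can be reused with a different request.
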
Swarm of Jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.

Create a swarmfile (e.g. MEGA.swarm). For example:

megacc -a analysis1.mao -d data1.meg
megacc -a analysis2.mao -d data1.meg
megacc -a analysis1.mao -d data2.meg
megacc -a analysis2.mao -d data2.meg

Submit this job using the swarm command.

swarm -f MEGA.swarm [-g #] [-t #] --module MEGA

-g #           Number of gigabytes of memory required for each process (1 line in the swarm command file)
-t #           Number of threads/CPUs required for each process (1 line in the swarm command file)
--module MEGA  Loads the MEGA module for each subjob in the swarm
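
When the same analyses are applied to many datasets, the swarmfile can be generated rather than typed by hand. The sketch below assumes the .mao and .meg files sit in one directory and that their paths contain no whitespace. make_swarmfile is a hypothetical helper name, not part of MEGA or swarm.

```shell
#!/bin/bash
# Sketch: write one megacc command per (.mao, .meg) pair into a swarmfile.
# make_swarmfile is a hypothetical helper, not part of MEGA or swarm.
# Usage: make_swarmfile <directory> <output swarmfile>
make_swarmfile() {
    local dir="$1" out="$2"
    local mao meg
    : > "$out"                          # start with an empty swarmfile
    for meg in "$dir"/*.meg; do
        [ -e "$meg" ] || continue       # glob matched nothing
        for mao in "$dir"/*.mao; do
            [ -e "$mao" ] || continue
            # One independent command per line, as swarm expects
            printf 'megacc -a %s -d %s\n' "$mao" "$meg" >> "$out"
        done
    done
}
```

For example, make_swarmfile . MEGA.swarm writes one line for every combination of analysis and data file in the current directory. The result can then be submitted with the swarm command shown above.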