Biowulf High Performance Computing at the NIH
GUIDANCE on Biowulf

GUIDANCE is meant to be used for weighting, filtering or masking unreliably aligned positions in sequence alignments before subsequent analysis. For example, align codon sequences (nucleotide sequences that code for proteins) using PAGAN, remove columns with low GUIDANCE scores, and use the remaining alignment to infer positive selection using the branch-site dN/dS test. Other analyses where GUIDANCE filtering could be useful include phylogeny reconstruction, reconstruction of the history of specific insertion and deletion events, and inference of recombination events. GUIDANCE2 also provides a set of alternative alignments which can be used when adopting a statistical point of view, i.e., when performing statistical analyses that rely on many possible alignments supported by the data.
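
As a rough illustration of the column-filtering step described above, the sketch below lists the alignment columns whose score meets a cutoff. It assumes a hypothetical two-column score table (column number, then score, with lines starting with # treated as comments). The file layout, the helper name keep_columns and the 0.93 cutoff in the example call are assumptions made for illustration, not GUIDANCE's documented output format.

```shell
#!/bin/bash
# Sketch only: keep_columns is a hypothetical helper, not part of GUIDANCE.
# Usage: keep_columns CUTOFF SCORE_FILE
# SCORE_FILE is assumed to hold "column_number score" pairs, one per line.
keep_columns() {
    local cutoff=$1 scores=$2
    # Skip comment lines and print the column number of every row whose
    # score is numerically greater than or equal to the cutoff.
    awk -v c="$cutoff" '!/^#/ && NF >= 2 && $2 + 0 >= c + 0 { print $1 }' "$scores"
}

# Example call (scores.txt is a placeholder file name):
# keep_columns 0.93 scores.txt
```

The printed column numbers could then be passed to whatever tool is used to subset the alignment before downstream analysis.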

References:

Documentation
Important Notes

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program.
Sample session (user input follows each $ prompt):

[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load guidance
[user@cn3144 ~]$ guidance.pl --seqFile $GUIDANCE_EXAMPLES/defensins.fas --msaProgram MAFFT --seqType aa --outDir $(pwd)/output
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. guidance.sh). For example:

#!/bin/bash
set -e
module load guidance
guidance.pl --seqFile defensins.fas --msaProgram MAFFT --seqType aa --outDir $(pwd)/output

Submit this job using the Slurm sbatch command.

sbatch guidance.sh

Swarm of Jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.

Create a swarmfile (e.g. guidance.swarm). For example:

guidance.pl --seqFile nucleotide1.fas --msaProgram PRANK --seqType nuc --outDir $(pwd)/output1 --bootstraps 30
guidance.pl --seqFile nucleotide2.fas --msaProgram PRANK --seqType nuc --outDir $(pwd)/output2 --bootstraps 30
guidance.pl --seqFile nucleotide3.fas --msaProgram PRANK --seqType nuc --outDir $(pwd)/output3 --bootstraps 30
guidance.pl --seqFile nucleotide4.fas --msaProgram PRANK --seqType nuc --outDir $(pwd)/output4 --bootstraps 30
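
For more than a handful of inputs, a swarmfile like the one above can be generated rather than typed by hand. The following is a minimal sketch: make_guidance_swarm is a hypothetical helper name, and the PRANK, nuc and 30-bootstrap settings are simply copied from the example lines above.

```shell
#!/bin/bash
# Sketch only: make_guidance_swarm is a hypothetical helper, not part of
# GUIDANCE or swarm. It writes one guidance.pl command per input FASTA file.
# Usage: make_guidance_swarm SWARMFILE FASTA [FASTA ...]
make_guidance_swarm() {
    local swarmfile=$1
    shift
    local n=0 fasta
    : > "$swarmfile"    # start with an empty swarmfile
    for fasta in "$@"; do
        n=$((n + 1))
        # $(pwd) is written literally (single quotes), as in the
        # hand-written swarmfile, so it is expanded when the subjob runs.
        printf 'guidance.pl --seqFile %s --msaProgram PRANK --seqType nuc --outDir $(pwd)/output%d --bootstraps 30\n' \
            "$fasta" "$n" >> "$swarmfile"
    done
}

# Example call (assumes the nucleotide*.fas files are in the current directory):
# make_guidance_swarm guidance.swarm nucleotide*.fas
```

Output directories are numbered in the order the input files are given, so each subjob writes to its own directory.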

Submit this job using the swarm command.

swarm -f guidance.swarm --module guidance

where

--module guidance    Loads the guidance module for each subjob in the swarm