High-Performance Computing at the NIH
Oases on Biowulf & Helix


Oases is a de novo transcriptome assembler designed to produce transcripts from short-read sequencing technologies, such as Illumina, SOLiD, or 454, in the absence of any genomic assembly. It was developed by Marcel Schulz (MPI for Molecular Genomics) and Daniel Zerbino (previously at the European Bioinformatics Institute (EMBL-EBI), now at UC Santa Cruz).

Oases loads a preliminary assembly produced by Velvet and clusters the contigs into small groups, called loci. It then exploits the paired-end read and long read information, when available, to construct transcript isoforms.
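Because Oases works on a Velvet output directory, a typical run has three steps: velveth, velvetg, then oases. The following is a minimal sketch of that sequence, not a site-specific recipe. The directory name asm_dir, the read file reads.fastq, the hash length of 31, and the run helper are all assumptions made for illustration. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/bash
# Sketch of the Velvet -> Oases workflow.
# All paths and the k-mer size below are illustrative assumptions.
set -euo pipefail

ASM_DIR="${ASM_DIR:-asm_dir}"      # hypothetical Velvet output directory
READS="${READS:-reads.fastq}"      # hypothetical interleaved paired-end reads
KMER="${KMER:-31}"                 # hash length (assumed value; must be odd)
DRY_RUN="${DRY_RUN:-1}"            # 1 = print commands only, 0 = run them

# Hypothetical helper: print each command, execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# 1. Hash the reads into the assembly directory.
run velveth "$ASM_DIR" "$KMER" -shortPaired -fastq "$READS"
# 2. Build the preliminary assembly with read tracking switched on.
run velvetg "$ASM_DIR" -read_trkg yes
# 3. Cluster contigs into loci and construct transcript isoforms.
run oases "$ASM_DIR" -ins_length 500
```

The -ins_length value should match the insert size of the paired-end library; 500 is simply the value used in the examples on this page.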

There may be multiple versions available on our systems. An easy way of selecting the version is to use modules. To see the modules available, type:

module avail oases 

To select a module, use:

module load oases/[version]

where [version] is the version of choice.

Environment variables set


Interactive Job on Biowulf

Allocate an interactive session with sinteractive and use it as described below:

biowulf$ sinteractive --mem=20g
salloc.exe: Pending job allocation 38978697
salloc.exe: Nodes cn2273 are ready for job
node$ module load oases
[+] Loading oases
node$ oases directory -ins_length 500
node$ exit


Batch job on Biowulf

Create a batch script similar to the following example:

#!/bin/bash
# this file is file.batch

module load oases || exit 1
cd /data/$USER
oases directory -ins_length 500

Submit to the queue with sbatch:

biowulf$ sbatch file.batch


Swarm of Jobs on Biowulf

Create a swarmfile (e.g. script.swarm). For example:

# this file is called script.swarm
cd dir1;oases command 1;oases command 2
cd dir2;oases command 1;oases command 2
cd dir3;oases command 1;oases command 2

Submit this job using the swarm command.

swarm -f script.swarm --module oases
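When there are many assembly directories, the swarmfile can be generated with a short loop rather than written by hand. The sketch below assumes three hypothetical Velvet output directories (dir1, dir2, dir3) under a base path that defaults to /data/$USER, and reuses the -ins_length 500 setting from the examples above. Adjust both to match the actual data.

```shell
#!/bin/bash
# Sketch: write one swarm line per Velvet output directory.
# Directory names and the base path are illustrative assumptions.
set -euo pipefail

SWARMFILE="script.swarm"
BASE="${BASE:-/data/${USER:-username}}"   # assumed location of the assemblies
DIRS=(dir1 dir2 dir3)                     # hypothetical Velvet output directories

: > "$SWARMFILE"                          # start from an empty swarmfile
for d in "${DIRS[@]}"; do
    # Each line of the swarmfile becomes one independent command.
    echo "oases $BASE/$d -ins_length 500" >> "$SWARMFILE"
done

cat "$SWARMFILE"                          # show the result for inspection
```

The resulting script.swarm can then be submitted with the swarm command shown above.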

For more information regarding swarm: https://hpc.nih.gov/apps/swarm.html#usage