High-Performance Computing at the NIH
Salmon on Biowulf and Helix

Salmon is a tool for quantifying the expression of transcripts using RNA-seq data. It uses new algorithms to provide accurate expression estimates very quickly while using very little memory. Salmon performs its inference using an expressive and realistic model of RNA-seq data that takes into account the attributes

On Helix

To use salmon on either system, you must load the module.

module load salmon
salmon -h

Running a Single Batch Job on Biowulf

Create a batch input file, run_salmon:

#!/bin/bash
# ----- this file is run_salmon -----

module load salmon
cd /data/$USER

The job can be submitted with

sbatch run_salmon

By default, the job is allocated 2 cores and 4 GB of memory. If you need more memory than the default 4 GB, use the following command, replacing # with the number of gigabytes required:

sbatch --mem=#g run_salmon
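
The batch file above loads the module and changes directory but does not show a salmon command. As a minimal sketch of what a complete run_salmon might look like, the snippet below writes one out with a single salmon quant call. The index directory (salmon_index), the FASTQ file names, the output directory (sample_quant), and the 8 GB / 4 CPU request are all hypothetical placeholders, not values from this page; adjust them to your own data.

```shell
#!/bin/bash
# Sketch: write a complete run_salmon batch script into the current directory.
# All paths and resource values below are illustrative assumptions.
cat > run_salmon <<'EOF'
#!/bin/bash
# ----- this file is run_salmon -----

module load salmon
cd /data/$USER

# Quantify one paired-end sample against a prebuilt index (placeholder names).
# -l A asks salmon to infer the library type automatically.
# SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is given;
# fall back to 2 threads if it is unset.
salmon quant -i salmon_index -l A \
    -1 sample_R1.fastq.gz -2 sample_R2.fastq.gz \
    -p "${SLURM_CPUS_PER_TASK:-2}" -o sample_quant
EOF

# Print (rather than run) a submit command with explicit memory and CPU requests.
echo "sbatch --mem=8g --cpus-per-task=4 run_salmon"
```

Passing the Slurm CPU count to -p keeps the number of salmon threads in line with the cores actually allocated to the job.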

Running a swarm of salmon jobs on Biowulf

The swarm program is designed to submit a group of commands to the Biowulf cluster. Each command is represented by a single line in the swarm command file that you create, and runs as a separate batch job. See the swarm page for more information.

Create a swarm command file, salmon_swarm. Example:

cd /data/$USER/salmon1; 
cd /data/$USER/salmon2; 
cd /data/$USER/salmon3; 
cd /data/$USER/salmon4; 

Submit this to the batch system with the command:

swarm -f salmon_swarm --module salmon

If each salmon job requires more than the default 4 GB of memory, use the following command, replacing # with the number of gigabytes each job needs:

swarm -g # -f salmon_swarm --module salmon

For information on how to monitor your job(s), see Monitoring Jobs.
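
With many samples, the swarm command file can be generated rather than typed by hand. The following is a rough sketch under stated assumptions: the helper name make_swarm_lines, the shared index location (../salmon_index), the read file name (reads.fastq.gz), and the output directory (quant) are all hypothetical. It also assumes that SLURM_CPUS_PER_TASK is available inside each swarm job, which is typically the case when threads are requested with swarm -t.

```shell
#!/bin/bash
# Sketch: generate salmon_swarm with one command line per sample directory.

make_swarm_lines() {
    local dir
    for dir in "$@"; do
        # The format string is single-quoted so that $USER and
        # $SLURM_CPUS_PER_TASK stay literal and are expanded when the job runs.
        # File and directory names here are placeholders.
        printf 'cd /data/$USER/%s; salmon quant -i ../salmon_index -l A -r reads.fastq.gz -p $SLURM_CPUS_PER_TASK -o quant\n' "$dir"
    done
}

# One line per sample directory, matching the layout used in the example above.
make_swarm_lines salmon1 salmon2 salmon3 salmon4 > salmon_swarm
cat salmon_swarm

# Print (rather than run) a submit command; 8 GB and 4 threads are example values.
echo "swarm -g 8 -t 4 -f salmon_swarm --module salmon"
```

Each generated line is an independent job, so a failure in one sample does not affect the others.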

Running salmon interactively

If you want to run your job interactively, you can allocate a node for interactive use. Once the node is allocated, you can type commands directly on the command-line. Example:

[user@biowulf ~]$ sinteractive
salloc.exe: Pending job allocation 15323416
salloc.exe: job 15323416 queued and waiting for resources
salloc.exe: job 15323416 has been allocated resources
salloc.exe: Granted job allocation 15323416
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn1640 are ready for job
[user@cn1640 ~]$ cd /data/$USER/salmon
[user@cn1640 ~]$ module load salmon
[user@cn1640 dir]$ 

If you need more memory than the default 4 GB, use sinteractive --mem=#g, replacing # with the number of gigabytes required.

Documentation