High-Performance Computing at the NIH
humann2 on Biowulf & Helix

Description

From the humann2 home page:

HUMAnN is a pipeline for efficiently and accurately profiling the presence/absence and abundance of microbial pathways in a community from metagenomic or metatranscriptomic sequencing data (typically millions of short DNA/RNA reads). This process, referred to as functional profiling, aims to describe the metabolic potential of a microbial community and its members. More generally, functional profiling answers the question "What are the microbes in my community-of-interest doing (or capable of doing)?"

There may be multiple versions of humann2 available. An easy way of selecting the version is to use modules. To see the modules available, type

module avail humann2 

To select a module use

module load humann2/[version]

where [version] is the version of choice.

Note that the humann2 module cannot be loaded on Helix or on the Biowulf login node.

humann2 is a multithreaded application. Make sure the number of threads passed with --threads matches the number of CPUs requested for the job.
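
One way to keep the two numbers in sync is to take the thread count from the Slurm allocation instead of hard-coding it. The following is a minimal sketch: humann2_threads is a hypothetical helper name, and falling back to a single thread outside a Slurm job is an assumption of this example, not humann2 behavior.

```shell
#!/bin/bash
# Sketch: derive the humann2 thread count from the Slurm allocation.
# Slurm sets SLURM_CPUS_PER_TASK when --cpus-per-task is given.
# Falling back to 1 thread when the variable is unset is an assumption of this sketch.
humann2_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

# Hypothetical usage (sample.fastq and sample_out are placeholder names):
#   humann2 --threads "$(humann2_threads)" --input sample.fastq --output sample_out
```

With this approach, changing --cpus-per-task at submission time is enough; the humann2 command line does not need to be edited.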

Environment variables set

References

Documentation

Batch job on Biowulf

Create a batch script similar to the following example:

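#!/bin/bash
# this file is humann.batch

module load humann2 || exit 1
# /lscratch/$SLURM_JOB_ID exists only if the job requested local scratch (--gres=lscratch:N)
cd /lscratch/$SLURM_JOB_ID || exit 1
cp /usr/local/apps/humann2/TEST_DATA/demo.fastq . || exit 1
mkdir out

humann2 --threads $SLURM_CPUS_PER_TASK \
  --input demo.fastq \
  --output out

# local scratch is removed when the job ends, so copy the results back
# (humann2_out is only an example directory name)
cp -r out "$SLURM_SUBMIT_DIR/humann2_out"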

Submit to the queue with sbatch:

biowulf$ sbatch --mem=10g --cpus-per-task=4 --gres=lscratch:10 humann.batch
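
After the job finishes, it can be useful to confirm that the run actually produced its result tables. The sketch below assumes the usual humann2 naming convention, in which the output directory receives <input basename>_genefamilies.tsv, _pathabundance.tsv and _pathcoverage.tsv; check_humann2_outputs is a hypothetical helper, not part of humann2.

```shell
#!/bin/bash
# Sketch: confirm that a humann2 run left its three main tables in the output directory.
# check_humann2_outputs is a hypothetical helper. The file names assume the
# <input basename>_{genefamilies,pathabundance,pathcoverage}.tsv convention.
check_humann2_outputs() {
    local outdir="$1" sample="$2" suffix missing=0
    for suffix in genefamilies pathabundance pathcoverage; do
        # -s is true only for files that exist and are not empty
        if [ ! -s "$outdir/${sample}_${suffix}.tsv" ]; then
            echo "missing or empty: $outdir/${sample}_${suffix}.tsv" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For the demo input, a call such as check_humann2_outputs <output directory> demo returns 0 when all three tables are present and non-empty, and 1 otherwise.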

Swarm of jobs on Biowulf

Create a swarm command file similar to the following example:

# this file is humann.swarm
humann2 --threads $SLURM_CPUS_PER_TASK --input sample1.sam --output sample1.out
humann2 --threads $SLURM_CPUS_PER_TASK --input sample2.sam --output sample2.out
humann2 --threads $SLURM_CPUS_PER_TASK --input sample3.sam --output sample3.out
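
For more than a handful of samples, the command file can be generated instead of typed by hand. This is a minimal sketch: make_swarm is a hypothetical helper, and the .out suffix for output directories simply mirrors the example above.

```shell
#!/bin/bash
# Sketch: write one humann2 command per input file into a swarm command file.
# make_swarm is a hypothetical helper; usage: make_swarm SWARMFILE INPUT...
make_swarm() {
    local swarmfile="$1"
    shift
    : > "$swarmfile"                      # start with an empty file
    local input base
    for input in "$@"; do
        base="${input%.*}"                # strip the final extension (.sam, .fastq, ...)
        # The single-quoted format string keeps $SLURM_CPUS_PER_TASK unexpanded,
        # so it is resolved when the swarm subjob runs, not when the file is written.
        printf 'humann2 --threads $SLURM_CPUS_PER_TASK --input %s --output %s.out\n' \
            "$input" "$base" >> "$swarmfile"
    done
}
```

A call such as make_swarm humann.swarm sample*.sam writes one line per matching file. File names containing spaces would need additional quoting in the generated commands.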

Submit to the queue with swarm:

biowulf$ swarm -f humann.swarm -t4 -g10 --module humann2

Interactive job on Biowulf

Allocate an interactive session with sinteractive and run humann2 as described above:

biowulf$ sinteractive --cpus-per-task=4 --mem=10g
node$ module load humann2
node$ cp -r /usr/local/apps/humann2/TEST_DATA .
node$ humann2 --threads $SLURM_CPUS_PER_TASK --input TEST_DATA/demo.m8 --output TEST_OUT
...
node$ exit
biowulf$