From the humann home page:
HUMAnN is a pipeline for efficiently and accurately profiling the presence/absence and abundance of microbial pathways in a community from metagenomic or metatranscriptomic sequencing data (typically millions of short DNA/RNA reads). This process, referred to as functional profiling, aims to describe the metabolic potential of a microbial community and its members. More generally, functional profiling answers the question "What are the microbes in my community-of-interest doing (or capable of doing)?"
$HUMANN_TEST_DATAAllocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive -c12 --mem=24g --gres=lscratch:100
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144 ~]$ module load humann
[user@cn3144 ~]$ cd /lscratch/$SLURM_JOB_ID
[user@cn3144 ~]$ cp -r ${HUMANN_TEST_DATA:-none} demo
[user@cn3144 ~]$ humann --threads $SLURM_CPUS_PER_TASK --input demo/demo.fastq --output demo.out
Creating output directory: /lscratch/46116226/demo.out
Output files will be written to: /lscratch/46116226/demo.out
Running metaphlan ........
Found g__Bacteroides.s__Bacteroides_dorei : 57.96% of mapped reads
Found g__Bacteroides.s__Bacteroides_vulgatus : 42.04% of mapped reads
Total species selected from prescreen: 2
Selected species explain 100.00% of predicted community composition
Creating custom ChocoPhlAn database ........
Running bowtie2-build ........
Running bowtie2 ........
Total bugs from nucleotide alignment: 2
g__Bacteroides.s__Bacteroides_vulgatus: 1274 hits
g__Bacteroides.s__Bacteroides_dorei: 1318 hits
Total gene families from nucleotide alignment: 548
Unaligned reads after nucleotide alignment: 87.6571428571 %
Running diamond ........
Aligning to reference database: uniref90_201901b_full.dmnd
Total bugs after translated alignment: 3
g__Bacteroides.s__Bacteroides_vulgatus: 1274 hits
g__Bacteroides.s__Bacteroides_dorei: 1318 hits
unclassified: 1599 hits
Total gene families after translated alignment: 815
Unaligned reads after translated alignment: 80.6190476190 %
Computing gene families ...
Computing pathways abundance and coverage ...
Output files created:
/lscratch/46116226/demo.out/demo_genefamilies.tsv
/lscratch/46116226/demo.out/demo_pathabundance.tsv
/lscratch/46116226/demo.out/demo_pathcoverage.tsv
[user@cn3144 ~]$ ls -lh demo.out
total 168K
-rw-r--r-- 1 user group 104K Dec 11 12:25 demo_genefamilies.tsv
drwxr-xr-x 2 user group 4.0K Dec 11 12:25 demo_humann_temp
-rw-r--r-- 1 user group 1.4K Dec 11 12:25 demo_pathabundance.tsv
-rw-r--r-- 1 user group 1.3K Dec 11 12:25 demo_pathcoverage.tsv
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
Create a batch input file (e.g. humann.sh). For example:
#! /bin/bash module load humann/3.9.0 || exit 1 cd /lscratch/$SLURM_JOB_ID || exit 1 cp $HUMANN_TEST_DATA/demo.fastq . mkdir out # for humann version 2 modules, this command would be humann2 humann --threads $SLURM_CPUS_PER_TASK \ --input demo.fastq \ --output out
Submit this job using the Slurm sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] humann.sh
Create a swarmfile (e.g. humann.swarm). For example:
humann --input sample1.bam --output sample1.out humann --input sample2.bam --output sample2.out humann --input sample3.bam --output sample3.out
Submit this job using the swarm command.
swarm -f humann.swarm -g 10 -t 4 --module humann/3.9.0where
| -g # | Number of Gigabytes of memory required for each process (1 line in the swarm command file) |
| -t # | Number of threads/CPUs required for each process (1 line in the swarm command file). |
| --module humann/XXXX | Loads the humann module for each subjob in the swarm |