From the HUMAnN home page:
HUMAnN is a pipeline for efficiently and accurately profiling the presence/absence and abundance of microbial pathways in a community from metagenomic or metatranscriptomic sequencing data (typically millions of short DNA/RNA reads). This process, referred to as functional profiling, aims to describe the metabolic potential of a microbial community and its members. More generally, functional profiling answers the question "What are the microbes in my community-of-interest doing (or capable of doing)?"
Example files are in $HUMANN_TEST_DATA
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive -c12 --mem=24g --gres=lscratch:100
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load humann
[user@cn3144 ~]$ cd /lscratch/$SLURM_JOB_ID
[user@cn3144 ~]$ cp -r ${HUMANN_TEST_DATA:-none} demo
[user@cn3144 ~]$ humann --threads $SLURM_CPUS_PER_TASK --input demo/demo.fastq --output demo.out
Creating output directory: /lscratch/46116226/demo.out
Output files will be written to: /lscratch/46116226/demo.out

Running metaphlan ........

Found g__Bacteroides.s__Bacteroides_dorei : 57.96% of mapped reads
Found g__Bacteroides.s__Bacteroides_vulgatus : 42.04% of mapped reads

Total species selected from prescreen: 2

Selected species explain 100.00% of predicted community composition

Creating custom ChocoPhlAn database ........

Running bowtie2-build ........

Running bowtie2 ........

Total bugs from nucleotide alignment: 2
g__Bacteroides.s__Bacteroides_vulgatus: 1274 hits
g__Bacteroides.s__Bacteroides_dorei: 1318 hits

Total gene families from nucleotide alignment: 548

Unaligned reads after nucleotide alignment: 87.6571428571 %

Running diamond ........

Aligning to reference database: uniref90_201901b_full.dmnd

Total bugs after translated alignment: 3
g__Bacteroides.s__Bacteroides_vulgatus: 1274 hits
g__Bacteroides.s__Bacteroides_dorei: 1318 hits
unclassified: 1599 hits

Total gene families after translated alignment: 815

Unaligned reads after translated alignment: 80.6190476190 %

Computing gene families ...

Computing pathways abundance and coverage ...

Output files created:

/lscratch/46116226/demo.out/demo_genefamilies.tsv
/lscratch/46116226/demo.out/demo_pathabundance.tsv
/lscratch/46116226/demo.out/demo_pathcoverage.tsv

[user@cn3144 ~]$ ls -lh demo.out
total 168K
-rw-r--r-- 1 user group 104K Dec 11 12:25 demo_genefamilies.tsv
drwxr-xr-x 2 user group 4.0K Dec 11 12:25 demo_humann_temp
-rw-r--r-- 1 user group 1.4K Dec 11 12:25 demo_pathabundance.tsv
-rw-r--r-- 1 user group 1.3K Dec 11 12:25 demo_pathcoverage.tsv

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
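The pathway abundance table written by the session above is a tab-separated file, so it can be inspected with standard shell tools. The following is a minimal sketch, not part of HUMAnN: top_pathways is a hypothetical helper name, and the sketch assumes the usual HUMAnN layout in which the header line starts with "#" and per-species (stratified) rows contain a "|" in the first column. Rows such as UNMAPPED and UNINTEGRATED, if present, are not filtered out.

```shell
# top_pathways FILE [N]
# Hypothetical helper: print the N (default 10) most abundant
# community-level rows of a HUMAnN pathabundance table.
top_pathways() {
    tab=$(printf '\t')
    # Drop the header line (starts with '#') and stratified rows
    # (first column contains '|'), then sort by the abundance column.
    awk -F'\t' '!/^#/ && index($1, "|") == 0' "$1" |
        sort -t "$tab" -k2,2gr |
        head -n "${2:-10}"
}
```

For the demo run this could be called as: top_pathways demo.out/demo_pathabundance.tsv 5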
Create a batch input file (e.g. humann.sh). For example:
#! /bin/bash

module load humann/3.9.0 || exit 1
cd /lscratch/$SLURM_JOB_ID || exit 1
cp $HUMANN_TEST_DATA/demo.fastq .
mkdir out
# for humann version 2 modules, this command would be humann2
humann --threads $SLURM_CPUS_PER_TASK \
    --input demo.fastq \
    --output out
Submit this job using the Slurm sbatch command.
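sbatch [--cpus-per-task=#] [--mem=#] --gres=lscratch:# humann.sh

The script works in /lscratch/$SLURM_JOB_ID, so local scratch space has to be requested with --gres=lscratch:#.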
Create a swarmfile (e.g. humann.swarm). For example:
humann --input sample1.bam --output sample1.out
humann --input sample2.bam --output sample2.out
humann --input sample3.bam --output sample3.out
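For more than a handful of samples, a swarmfile of this shape can be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: make_swarmfile is a hypothetical helper name, the inputs are assumed to sit in a single directory and share one file extension, and each output directory is named after the input file with ".out" appended, as in the example above.

```shell
# make_swarmfile DIR [EXT]
# Hypothetical helper: print one humann command per input file in DIR
# whose name ends in .EXT (default: fastq).
make_swarmfile() {
    for f in "$1"/*."${2:-fastq}"; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$f" ] || continue
        sample=$(basename "$f" ".${2:-fastq}")
        printf 'humann --input %s --output %s.out\n' "$f" "$sample"
    done
}
```

It could be used as: make_swarmfile /path/to/samples bam > humann.swarm (the directory is a placeholder).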
Submit this job using the swarm command.
swarm -f humann.swarm -g 10 -t 4 --module humann/3.9.0

where

-g #                   Number of gigabytes of memory required for each process (1 line in the swarm command file)
-t #                   Number of threads/CPUs required for each process (1 line in the swarm command file)
--module humann/XXXX   Loads the humann module for each subjob in the swarm