MetaWRAP is a modular pipeline for shotgun metagenomic data analysis. It deploys state-of-the-art software to handle metagenomic data processing starting from raw sequencing reads and ending in metagenomic bins and their analysis. It includes hybrid algorithms that leverage the strengths of a variety of software to extract and refine high-quality bins from metagenomic data through bin consolidation and reassembly.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=4g --gres=lscratch:20 -c8
[user@cn0911 ~]$ module load metawrap
[+] Loading blast 2.14.0+ ...
[+] Loading bwa 0.7.17 on cn4272
[+] Loading bowtie 2-2.3.5
[+] Loading kraken 1.1
[+] Loading kronatools 2.8.1 on cn4272
[+] Loading perl 5.34.0 on cn4272
[+] Loading samtools 1.9 ...
[+] Loading singularity 4.0.1 on cn4272
[+] Loading salmon 1.7.0
[+] Loading prokka 1.14.6
[+] Loading metabat 2.15 on cn4272
[+] Loading quast 5.2.0 on cn4272
[+] Loading cutadapt 4.4
[+] Loading eigen 3.4.0-1092574b ...
[+] Loading trimgalore 0.6.6 ...
[+] Loading fastqc 0.11.8
[+] Loading checkm2 1.0.2
[+] Loading megahit, version 1.2.9...
[+] Loading spades 3.15.5
[+] Loading gcc 11.3.0 ...
[+] Loading HDF5 1.12.2
[+] Loading netcdf 4.9.0
[+] Loading openmpi/4.1.3/gcc-11.3.0 ...
[+] Loading pandoc 2.18 on cn4272
[+] Loading pcre2 10.40
[+] Loading R 4.3.0
[+] Loading metawrap 1.3.2 ...
[user@cn0911 ~]$ mkdir /data/$USER/metawrap && cd /data/$USER/metawrap
[user@cn0911 ~]$ cp -r $MW_SRC/* .

Copy the sample data, decompress it, and move the FASTQ files into the folder RAW_READS:
[user@cn0911 ~]$ cp $MW_DATA/* .
[user@cn0911 ~]$ gunzip *.gz
[user@cn0911 ~]$ mkdir RAW_READS && mv *fastq RAW_READS
[user@cn0911 ~]$ ls RAW_READS
ERR011347_1.fastq ERR011347_2.fastq ERR011348_1.fastq ERR011348_2.fastq ERR011349_1.fastq ERR011349_2.fastq

Perform the analysis steps, which are described in more detail in the Usage Tutorial:
Step 1: Run quality control on the raw reads with the Read_qc module, then collect the cleaned reads in CLEAN_READS:
[user@cn0911 ~]$ mkdir READ_QC
[user@cn0911 ~]$ metawrap read_qc -1 RAW_READS/ERR011347_1.fastq -2 RAW_READS/ERR011347_2.fastq -t 24 -o READ_QC/ERR011347
[user@cn0911 ~]$ metawrap read_qc -1 RAW_READS/ERR011348_1.fastq -2 RAW_READS/ERR011348_2.fastq -t 24 -o READ_QC/ERR011348
[user@cn0911 ~]$ metawrap read_qc -1 RAW_READS/ERR011349_1.fastq -2 RAW_READS/ERR011349_2.fastq -t 24 -o READ_QC/ERR011349
[user@cn0911 ~]$ mkdir CLEAN_READS
[user@cn0911 ~]$ for i in READ_QC/*; do
    b=${i#*/}
    mv ${i}/final_pure_reads_1.fastq CLEAN_READS/${b}_1.fastq
    mv ${i}/final_pure_reads_2.fastq CLEAN_READS/${b}_2.fastq
done

Step 2: Assemble the metagenomes with the metaWRAP-Assembly module:
[user@cn0911 ~]$ cat CLEAN_READS/ERR*_1.fastq > CLEAN_READS/ALL_READS_1.fastq
[user@cn0911 ~]$ cat CLEAN_READS/ERR*_2.fastq > CLEAN_READS/ALL_READS_2.fastq
[user@cn0911 ~]$ metawrap assembly -1 CLEAN_READS/ALL_READS_1.fastq \
    -2 CLEAN_READS/ALL_READS_2.fastq \
    -m 200 -t 96 --use-metaspades -o ASSEMBLY

Step 3: Run Kraken module on both reads and the assembly:
[user@cn0911 ~]$ metawrap kraken -o KRAKEN -t 96 -s 1000000 CLEAN_READS/ERR*fastq ASSEMBLY/final_assembly.fasta

Step 4: Bin the co-assembly with three different algorithms with the Binning module:
[user@cn0911 ~]$ metawrap binning -o INITIAL_BINNING -t 96 -a ASSEMBLY/final_assembly.fasta \
    --metabat2 --maxbin2 --concoct CLEAN_READS/ERR*fastq

Step 5: Consolidate bin sets with the Bin_refinement module:
[user@cn0911 ~]$ metawrap bin_refinement -o BIN_REFINEMENT -t 96 \
    -A INITIAL_BINNING/metabat2_bins/ \
    -B INITIAL_BINNING/maxbin2_bins/ \
    -C INITIAL_BINNING/concoct_bins/ \
    -c 50 -x 10

Step 6: Visualize the community and the extracted bins with the Blobology module:
[user@cn0911 ~]$ metawrap blobology -a ASSEMBLY/final_assembly.fasta \
    -t 96 -o BLOBOLOGY \
    --bins BIN_REFINEMENT/metawrap_50_10_bins CLEAN_READS/ERR*fastq

Step 7: Find the abundances of the draft genomes (bins) across the samples:
[user@cn0911 ~]$ metawrap quant_bins -b BIN_REFINEMENT/metawrap_50_10_bins \
    -o QUANT_BINS \
    -a ASSEMBLY/final_assembly.fasta CLEAN_READS/ERR*fastq

Step 8: Re-assemble the consolidated bin set with the Reassemble_bins module:
[user@cn0911 ~]$ metawrap reassemble_bins -o BIN_REASSEMBLY \
    -1 CLEAN_READS/ALL_READS_1.fastq \
    -2 CLEAN_READS/ALL_READS_2.fastq \
    -t 96 -m 800 -c 50 -x 10 \
    -b BIN_REFINEMENT/metawrap_50_10_bins

Step 9: Determine the taxonomy of each bin with the Classify_bins module:
[user@cn0911 ~]$ metawrap classify_bins -b BIN_REASSEMBLY/reassembled_bins \
    -o BIN_CLASSIFICATION -t 48

Step 10: Functionally annotate bins with the Annotate_bins module:
[user@cn0911 ~]$ metawrap annotate_bins -o FUNCT_ANNOT -t 96 -b BIN_REASSEMBLY/reassembled_bins/

End the interactive session:
[user@cn0911 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
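
The read_qc step in the session above repeats the same call once per sample and hard-codes the thread count. As a rough sketch only, the helper functions below show one way such per-sample commands could be generated from the contents of RAW_READS. It assumes the paired files are named <ID>_1.fastq and <ID>_2.fastq, as in the sample data, and that Slurm exports SLURM_CPUS_PER_TASK for the allocation. The function names list_samples, thread_count, and print_read_qc_commands are hypothetical and are not part of metaWRAP.

```shell
#!/bin/bash
# Hypothetical helpers (not part of metaWRAP) for scripting per-sample steps.

# Print the unique sample IDs found in a directory of paired FASTQ files
# named <ID>_1.fastq / <ID>_2.fastq, one ID per line.
list_samples() {
    local dir=$1 f
    for f in "$dir"/*_1.fastq; do
        [ -e "$f" ] || continue        # unmatched glob stays literal: skip it
        f=${f##*/}                     # strip the directory part
        printf '%s\n' "${f%_1.fastq}"  # strip the mate suffix
    done
}

# Report the CPU count of the Slurm allocation if it is set, otherwise 1.
# Assumption: SLURM_CPUS_PER_TASK is exported inside the job.
thread_count() {
    printf '%s\n' "${SLURM_CPUS_PER_TASK:-1}"
}

# Print (do not run) one read_qc command per sample so it can be reviewed.
# Only flags that appear in the session above are used.
print_read_qc_commands() {
    local raw=$1 out=$2 s
    list_samples "$raw" | while read -r s; do
        printf 'metawrap read_qc -1 %s/%s_1.fastq -2 %s/%s_2.fastq -t %s -o %s/%s\n' \
            "$raw" "$s" "$raw" "$s" "$(thread_count)" "$out" "$s"
    done
}
```

For example, after sourcing these functions, `print_read_qc_commands RAW_READS READ_QC` would print one read_qc line per sample found in RAW_READS. Printing rather than executing keeps the sketch safe to try: the commands can be inspected first and then run by hand or saved to a file. Taking the value of -t from the allocation rather than a fixed number avoids requesting more threads than the session was granted.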