Bracken is a companion program to Kraken 1, KrakenUniq, or Kraken 2 While Kraken classifies reads to multiple levels in the taxonomic tree, Bracken allows estimation of abundance at a single level using those classifications (e.g. Bracken can estimate abundance of species within a sample).
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=64g --gres=gpu:p100:1,lscratch:300 -c16 [user@cn3104 ~]$ module load bracken [+] Loading kraken 2.1.2 [+] Loading bracken 2.8 [user@cn3104 ~]$ bracken -h /usr/local/apps/bracken/2.8/bin/bracken: illegal option -- h Usage: bracken -d MY_DB -i INPUT -o OUTPUT -w OUTREPORT -r READ_LEN -l LEVEL -t THRESHOLD MY_DB location of Kraken database INPUT Kraken REPORT file to use for abundance estimation OUTPUT file name for Bracken default output OUTREPORT New Kraken REPORT output file with Bracken read estimates READ_LEN read length to get all classifications for (default: 100) LEVEL level to estimate abundance at [options: D,P,C,O,F,G,S,S1,etc] (default: S) THRESHOLD number of reads required PRIOR to abundance estimation to perform reestimation (default: 0) [user@cn3104 ~]$ bracken-build -h /usr/local/apps/bracken/2.8/bin/bracken-build: illegal option -- h Usage: bracken_build -k KMER_LEN -l READ_LEN -d MY_DB -x K_INSTALLATION -t THREADS KMER_LEN kmer length used to build the kraken database (default: 35) THREADS the number of threads to use when running kraken classification and the bracken scripts READ_LEN read length to get all classifications for (default: 100) MY_DB location of Kraken database K_INSTALLATION location of the installed kraken/kraken-build scripts (default assumes scripts can be run from the user path) [user@cn3104 ~]$ cd /lscratch/$SLURM_JOB_ID [user@cn3104 ~]$ mkdir fa && ca fa [user@cn3104 ~]$ wget --quiet --input-file $KRAKEN_TEST_DATA/genomes_for_custom_db.urls [user@cn3104 ~]$ ls GCA_000006745.1_ASM674v1_genomic.fna.gz GCA_000006885.1_ASM688v1_genomic.fna.gz GCA_000007045.1_ASM704v1_genomic.fna.gz GCA_000007825.1_ASM782v1_genomic.fna.gz GCA_000008005.1_ASM800v1_genomic.fna.gz GCA_000009005.1_ASM900v1_genomic.fna.gz GCA_000009585.1_ASM958v1_genomic.fna.gz ... [user@cn3104 ~]$ gunzip *.gz [user@cn3104 ~]$ ls GCA_000006745.1_ASM674v1_genomic.fna GCA_000006885.1_ASM688v1_genomic.fna GCA_000007045.1_ASM704v1_genomic.fna GCA_000007825.1_ASM782v1_genomic.fna GCA_000008005.1_ASM800v1_genomic.fna GCA_000009005.1_ASM900v1_genomic.fna ... [user@cn3104 ~]$ kraken2-build --download-taxonomy --db custom_db -t 16 Downloading nucleotide gb accession to taxon map... done. Downloading nucleotide wgs accession to taxon map... done. Downloaded accession to taxon map(s) Downloading taxonomy tree data... done. Uncompressing taxonomy data...A folder custom_db with subfolder taxonomy has been be created.
[user@cn3104 ~]$ ls -t custom_db GCA_002000745.1_ASM200074v1_genomic.fna GCA_001641045.1_ASM164104v1_genomic.fna GCA_001518875.1_ASM151887v1_genomic.fna GCA_001518775.1_ASM151877v1_genomic.fna GCA_001975045.1_ASM197504v1_genomic.fna ... [user@cn3104 ~]$ cd .. [user@cn3104 ~]$ for f in fa/*.fna; do kraken2-build --add-to-library $f --db custom_db; done Masking low-complexity regions of new file... done. Added "fa/GCA_000006745.1_ASM674v1_genomic.fna" to library (custom_db) Masking low-complexity regions of new file... done. Added "fa/GCA_000006885.1_ASM688v1_genomic.fna" to library (custom_db) Masking low-complexity regions of new file... done. ...The folder custom_db with subfolder library has been be created..
[user@cn3104 ~]$ cd custom_db [user@cn3104 ~]$ ln -s ../fa/custom_db/taxonomy [user@cn3104 ~]$ cd .. [user@cn3104 ~]$ kraken2-build --download-library bacteria --db custom_db -t 16 Step 1/2: Performing rsync file transfer of requested files Rsync file transfer complete. Step 2/2: Assigning taxonomic IDs to sequences ... Processed 38595 projects (91046 sequences, 161.35 Gbp)... done. All files processed, cleaning up extra sequence files... done, library complete. Masking low-complexity regions of downloaded library... [user@cn3104 ~]$ bracken-build -d custom_db -t 10 -k 50 -l 500 -x ${KRAKEN_DB} -t 16Exit the application:
[user@cn3104 ~]$ exit salloc.exe: Relinquishing job allocation 46116226 [user@biowulf ~]$