Bakta is a tool for the rapid and standardized annotation of bacterial genomes and plasmids from both isolates and MAGs. It provides dbxref-rich, sORF-including and taxon-independent annotations in machine-readable JSON and bioinformatics standard file formats for automated downstream analysis.
Allocate an interactive session and run the program.
Sample session (user input follows the $ prompt):
[user@biowulf]$ sinteractive
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load bakta

[user@cn3144 ~]$ bakta
usage: bakta [--db DB] [--min-contig-length MIN_CONTIG_LENGTH] [--prefix PREFIX]
             [--output OUTPUT] [--force] [--genus GENUS] [--species SPECIES]
             [--strain STRAIN] [--plasmid PLASMID] [--complete]
             [--prodigal-tf PRODIGAL_TF] [--translation-table {11,4}]
             [--gram {+,-,?}] [--locus LOCUS] [--locus-tag LOCUS_TAG]
             [--keep-contig-headers] [--compliant] [--replicons REPLICONS]
             [--regions REGIONS] [--proteins PROTEINS] [--meta] [--skip-trna]
             [--skip-tmrna] [--skip-rrna] [--skip-ncrna] [--skip-ncrna-region]
             [--skip-crispr] [--skip-cds] [--skip-pseudo] [--skip-sorf]
             [--skip-gap] [--skip-ori] [--skip-plot] [--help] [--verbose]
             [--debug] [--threads THREADS] [--tmp-dir TMP_DIR] [--version]
bakta: error: the following arguments are required:

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
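The session above stops at the usage error because no input was given. Bakta takes the genome FASTA file as a positional argument, so an actual run inside the interactive session might look like the following sketch. The names genome.fasta and bakta_out are placeholders, not taken from this page, and the sketch assumes the module makes the annotation database location known to Bakta (for example through the BAKTA_DB environment variable); if it does not, pass the path with --db.

```shell
# Placeholder names (not from the original page): genome.fasta, bakta_out.
genome=genome.fasta
outdir=bakta_out
# Use the CPUs allocated to the session if Slurm reports them, else 2.
threads=${SLURM_CPUS_PER_TASK:-2}

# Only run when bakta is on PATH and the input file exists.
if command -v bakta >/dev/null 2>&1 && [ -f "$genome" ]; then
    bakta --threads "$threads" --output "$outdir" "$genome"
else
    echo "bakta or $genome not found; load the module and check the input path" >&2
fi
```

The --threads and --output options are the ones listed in the usage message; the guard simply avoids a confusing failure when the module has not been loaded.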
Create a batch input file (e.g. bakta.sh). For example:
#!/bin/bash
set -e
module load bakta
bakta < bakta.in > bakta.out
Submit this job using the Slurm sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] bakta.sh
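Bakta expects the genome FASTA file as a positional argument rather than on standard input, so a working batch script will usually name the input file explicitly. The following is a minimal sketch that writes such a script; genome.fasta and bakta_out are hypothetical names, and it assumes the bakta module exposes the database location to Bakta (otherwise add --db with the correct path).

```shell
# Write a batch script to bakta.sh. The quoted 'EOF' keeps the variable
# reference literal so it is expanded when the job runs, not now.
cat > bakta.sh <<'EOF'
#!/bin/bash
set -e
module load bakta
# genome.fasta and bakta_out are placeholder names, not from the original page.
# SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is given; fall back to 1.
bakta --threads "${SLURM_CPUS_PER_TASK:-1}" --output bakta_out genome.fasta
EOF

# Show the generated script for inspection before submitting it.
cat bakta.sh
```

The script could then be submitted with, for example, sbatch --cpus-per-task=8 --mem=16g bakta.sh. The CPU and memory values here are illustrative only and should be adjusted to the genome and database in use.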
Create a swarmfile (e.g. bakta.swarm). For example:
bakta < bakta.in > bakta.out
bakta < bakta.in > bakta.out
bakta < bakta.in > bakta.out
bakta < bakta.in > bakta.out
Submit this job using the swarm command.
swarm -f bakta.swarm [-g #] [-t #] --module bakta
where
-g #           | Number of gigabytes of memory required for each process (1 line in the swarm command file) |
-t #           | Number of threads/CPUs required for each process (1 line in the swarm command file) |
--module bakta | Loads the bakta module for each subjob in the swarm |
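In practice each line of a swarmfile usually annotates a different genome. One way to build such a file is a short loop, sketched below. The genomes/ directory, the sample names, and the out_ prefix are hypothetical, and the two empty placeholder files are created only so the sketch runs on its own. It also assumes that each subjob sees SLURM_CPUS_PER_TASK when -t is given.

```shell
# Hypothetical layout: one FASTA assembly per sample under genomes/.
# Two empty placeholder files are created here for demonstration only.
mkdir -p genomes
touch genomes/sampleA.fasta genomes/sampleB.fasta

: > bakta.swarm   # start with an empty swarmfile
for fasta in genomes/*.fasta; do
    sample=$(basename "$fasta" .fasta)
    # One line per genome. The escaped \$ keeps the variable literal so it is
    # expanded inside each subjob at run time, not while the file is written.
    echo "bakta --threads \$SLURM_CPUS_PER_TASK --output out_${sample} --prefix ${sample} ${fasta}" >> bakta.swarm
done

# Inspect the result before submitting.
cat bakta.swarm
```

The resulting file could be submitted with, for example, swarm -f bakta.swarm -g 16 -t 8 --module bakta, where the memory and thread values are illustrative. Writing a separate output directory per sample avoids subjobs overwriting each other's results.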