amr on Biowulf

AMRFinderPlus identifies antimicrobial resistance (AMR) genes and point mutations, as well as virulence and stress resistance genes, in assembled bacterial nucleotide and protein sequences.

References:

Documentation
Important Notes

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program.
Sample session (user input follows the $ prompt):

[user@biowulf ~]$ sinteractive -c 4 --mem=10G --gres=lscratch:10
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144 ~]$ module load amr
[user@cn3144 ~]$ export TMPDIR=/lscratch/$SLURM_JOB_ID
[user@cn3144 ~]$ amrfinder --help 
Identify AMR and virulence genes in proteins and/or contigs and print a report

DOCUMENTATION
    See https://github.com/ncbi/amr/wiki for full documentation

UPDATES
    Subscribe to the amrfinder-announce mailing list for database and software update notifications:
    https://www.ncbi.nlm.nih.gov/mailman/listinfo/amrfinder-announce

USAGE:   amrfinder [--update] [--force_update] [--protein PROT_FASTA] [--nucleotide NUC_FASTA] [--gff GFF_FILE] [--annotation_format ANNOTATION_FORMAT] [--database DATABASE_DIR] [--database_version] [--ident_min MIN_IDENT] [--coverage_min MIN_COV] [--organism ORGANISM] [--list_organisms] [--translation_table TRANSLATION_TABLE] [--plus] [--report_common] [--report_all_equal] [--name NAME] [--print_node] [--mutation_all MUT_ALL_FILE] [--output OUTPUT_FILE] [--protein_output PROT_FASTA_OUT] [--nucleotide_output NUC_FASTA_OUT] [--nucleotide_flank5_output NUC_FLANK5_FASTA_OUT] [--nucleotide_flank5_size NUC_FLANK5_SIZE] [--blast_bin BLAST_DIR] [--hmmer_bin HMMER_DIR] [--quiet] [--pgap] [--gpipe_org] [--parm PARM] [--threads THREADS] [--debug] [--log LOG]
HELP:    amrfinder --help or amrfinder -h
VERSION: amrfinder --version

[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
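
The session above points TMPDIR at the job's local scratch directory. A minimal sketch of the same setup that also behaves sensibly outside a Slurm job is shown here; the fallback to /tmp is an assumption for illustration, not a Biowulf recommendation.

```shell
#!/bin/bash
# Assign TMPDIR by name: the left-hand side of an assignment takes no "$".
# SLURM_JOB_ID is set by Slurm inside a job. The /lscratch/<jobid> directory
# exists only when the job was allocated with --gres=lscratch:N.
if [ -n "${SLURM_JOB_ID:-}" ] && [ -d "/lscratch/${SLURM_JOB_ID}" ]; then
    export TMPDIR="/lscratch/${SLURM_JOB_ID}"
else
    # Assumed fallback: keep an existing TMPDIR, otherwise use /tmp.
    export TMPDIR="${TMPDIR:-/tmp}"
fi
echo "TMPDIR is ${TMPDIR}"
```

Checking that the directory exists before exporting it avoids pointing temporary files at a path that was never created, for example when lscratch was not requested.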

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. amr.sh). For example:

#!/bin/bash
set -e
module load amr
export TMPDIR=/lscratch/$SLURM_JOB_ID
# Protein AMRFinder with no genomic coordinates
amrfinder -p test_prot.fa

# Translated nucleotide AMRFinder (will not use HMMs)
amrfinder -n test_dna.fa

# Protein AMRFinder using GFF to get genomic coordinates and 'plus' genes
amrfinder -p test_prot.fa -g test_prot.gff --plus

# Protein AMRFinder with Escherichia protein point mutations
amrfinder -p test_prot.fa -O Escherichia

# Full AMRFinderPlus search combining results
amrfinder -p test_prot.fa -g test_prot.gff -n test_dna.fa -O Escherichia --plus

Submit this job using the Slurm sbatch command.

sbatch --cpus-per-task=6 --mem=10G --gres=lscratch:30  amr.sh
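
The sbatch line above allocates several CPUs, and the usage text lists a --threads option. A hedged sketch of passing the allocation through to amrfinder follows; the output file name is hypothetical, and the default of one thread outside a job is an assumption.

```shell
#!/bin/bash
set -e
# SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is given.
# Default to 1 thread when it is unset (assumption for runs outside a job).
threads="${SLURM_CPUS_PER_TASK:-1}"

# Build the command as an array so each argument stays intact.
# test_dna.fa is the example input used above; the output name is hypothetical.
cmd=(amrfinder -n test_dna.fa --threads "$threads" --output test_dna.amrfinder.tsv)

# Run only if amrfinder is on PATH (that is, after "module load amr");
# otherwise print the command that would have been run.
if command -v amrfinder >/dev/null 2>&1; then
    "${cmd[@]}"
else
    echo "would run: ${cmd[*]}"
fi
```

Reading the thread count from the environment keeps the script in step with whatever --cpus-per-task value is given at submission time.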

Swarm of Jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.

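Create a swarmfile (e.g. amr.swarm) with one command per line. For example (the input and output file names are placeholders):

amrfinder -n sample1.fa --output sample1.tsv
amrfinder -n sample2.fa --output sample2.tsv
amrfinder -n sample3.fa --output sample3.tsv
amrfinder -n sample4.fa --output sample4.tsv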

Submit this job using the swarm command.

swarm -f amr.swarm [-g #] [-t #] --module amr
where
-g # Number of Gigabytes of memory required for each process (1 line in the swarm command file)
-t # Number of threads/CPUs required for each process (1 line in the swarm command file).
--module amr Loads the amr module for each subjob in the swarm
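
For more than a handful of inputs, the swarmfile can be generated rather than typed. The sketch below assumes a hypothetical layout with one nucleotide FASTA per sample in an assemblies directory; the directory names, sample names, and the temporary working directory are all illustrative.

```shell
#!/bin/bash
set -e
# Demonstration only: work in a throwaway directory with two empty
# placeholder inputs so the sketch can be run anywhere.
workdir="$(mktemp -d)"
cd "$workdir"
mkdir -p assemblies results
touch assemblies/sampleA.fa assemblies/sampleB.fa

# Write one amrfinder command per input file to the swarmfile.
# \$SLURM_CPUS_PER_TASK is escaped so it is expanded when each subjob runs,
# not when the swarmfile is generated.
: > amr.swarm
for fasta in assemblies/*.fa; do
    sample="$(basename "$fasta" .fa)"
    echo "amrfinder -n $fasta --threads \$SLURM_CPUS_PER_TASK --output results/$sample.tsv" >> amr.swarm
done

cat amr.swarm
```

The generated file can then be submitted with the swarm command shown above, choosing -g and -t values that suit the inputs.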