SRST2: Short Read Sequence Typing for Bacterial Pathogens
SRST2 is a read mapping-based tool for rapid molecular typing of bacterial pathogens. It detects genes, alleles and multi-locus sequence types (MLST) quickly and accurately from whole-genome sequencing (WGS) data. SRST2 is highly accurate and outperforms assembly-based methods in terms of both gene detection and allele assignment.
References:
- M. Inouye, H. Dashnow, L.-A. Raven, M.B. Schultz, B.J. Pope, T. Tomita, J. Zobel and K.E. Holt. SRST2: Rapid genomic surveillance for public health and hospital microbiology labs. Genome Medicine 2014, 6:90
Documentation
Important Notes
- Module Name: SRST2 (see the modules page for more information)
- Unusual environment variables set:
  - SRST2_HOME: installation directory
  - SRST2_BIN: executable directory
  - SRST2_SRC: source code directory
- SRST2 needs to be run with Python 2.7. You can prepare a Python 2.7 environment with mamba using the following commands:
mamba create -n py27 python=2.7
mamba activate py27
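The two requirements noted above (a Python 2.7 interpreter and the SRST2_* variables set by the module) can be checked before starting a run. The following is only a minimal sketch: the function name `preflight` and its messages are illustrative and not part of SRST2, and the code is written to run under both Python 2.7 and Python 3.

```python
import os
import sys

# Variable names come from the list above; everything else here is illustrative.
SRST2_VARS = ("SRST2_HOME", "SRST2_BIN", "SRST2_SRC")


def preflight(environ, version_info):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    # SRST2 is documented here as requiring Python 2.7.
    if tuple(version_info[:2]) != (2, 7):
        problems.append(
            "Python %d.%d is active, expected 2.7" % (version_info[0], version_info[1])
        )
    # Each variable should be non-empty once the srst2 module is loaded.
    for name in SRST2_VARS:
        if not environ.get(name):
            problems.append("%s is not set (is the srst2 module loaded?)" % name)
    return problems


if __name__ == "__main__":
    for line in preflight(os.environ, sys.version_info) or ["environment looks ready"]:
        print(line)
```

Passing the environment and version in as arguments, rather than reading them inside the function, keeps the check easy to try with made-up values.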
Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive
[user@cn3200 ~]$ module load srst2
[+] Loading bowtie 2-2.2.6
[+] Loading srst2 0.2.0 ...
Run a sample command:
[user@cn3200 ~]$ getmlst.py --species "Staphylococcus aureus"
For SRST2, remember to check what separator is being used in this allele database
Looks like --mlst_delimiter '_'
>arcC_1 --> --> ('arcC', '_', '1')
Suggested srst2 command for use with this MLST database:
srst2 --output test --input_pe *.fastq.gz --mlst_db Staphylococcus_aureus.fasta --mlst_definitions saureus.txt --mlst_delimiter '_'
The following files will be produced in the current folder:
[user@cn3200 ~]$ tree .
.
|-- Staphylococcus_aureus.fasta
|-- alleles_fasta
|-- profiles_csv
|-- mlst_data_download_Staphylococcus_epidermidis_None.log
0 directories, 4 files
Run another sample command:
[user@cn3200 ~]$ getmlst.py --species "Staphylococcus epidermidis" --repository_url http://pubmlst.org/data/dbases.xml
For SRST2, remember to check what separator is being used in this allele database
Looks like --mlst_delimiter '_'
>arcC_1 --> --> ('arcC', '_', '1')
Suggested srst2 command for use with this MLST database:
srst2 --output test --input_pe *.fastq.gz --mlst_db Staphylococcus_epidermidis.fasta --mlst_definitions sepidermidis.txt --mlst_delimiter '_'
The output files are as follows:
[user@cn3200 ~]$ tree .
.
|-- Staphylococcus_epidermidis.fasta
|-- alleles_fasta
|-- profiles_csv
|-- mlst_data_download_Staphylococcus_epidermidis_None.log
0 directories, 4 files
End the interactive session:
[user@cn3200 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
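The getmlst.py output in the session above shows the allele header `>arcC_1` being split into `('arcC', '_', '1')`, which is why it suggests `--mlst_delimiter '_'`. The sketch below shows one way to reproduce that split and check a candidate delimiter against a few headers from an allele database. The function `split_allele` is a hypothetical helper written for illustration and is not SRST2's own code. It assumes headers of the form gene name, delimiter, allele number.

```python
def split_allele(header, delimiter="_"):
    """Split a FASTA header such as '>arcC_1' into (gene, delimiter, allele number)."""
    name = header.lstrip(">").strip()
    # Split on the LAST delimiter so that gene names containing it stay intact.
    gene, sep, number = name.rpartition(delimiter)
    if not sep or not gene or not number.isdigit():
        raise ValueError("%r does not look like <gene>%s<number>" % (header, delimiter))
    return gene, sep, number


print(split_allele(">arcC_1"))  # ('arcC', '_', '1')
```

If the function raises `ValueError` for headers from a given database, the delimiter guess is probably wrong for that database, and a different value for `--mlst_delimiter` may be needed.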