StrainGE: Strain-level Genome Exploration
StrainGE is a set of tools to analyse the within-species strain diversity in bacterial populations.
It consists of two main components:
1) StrainGST (Strain Genome Search Tool), which finds close reference genomes for strains present in a sample, and
2) StrainGR (Strain Genome Recovery), which performs strain-aware variant calling at low coverage.
References:
- Lucas R. van Dijk, Bruce J. Walker, Timothy J. Straub, Colin J. Worby, Alexandra Grote,
Henry L. Schreiber IV, Christine Anyansi, Amy J. Pickering, Scott J. Hultgren, Abigail L. Manson,
Thomas Abeel and Ashlee M. Earl
StrainGE: a toolkit to track and characterize low abundance strains in complex microbial communities
Genome Biology, 23 (2022), article #74 https://doi.org/10.1186/s13059-022-02630-0
Documentation
Important Notes
- Module Name: strainge (see the modules page for more information)
- Unusual environment variables set:
  - SGE_HOME: installation directory
  - SGE_BIN: executable directory
  - SGE_DATA: sample data directory
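To confirm that the module has set the environment variables listed above, they can be printed from the shell after loading it. The following is a minimal sketch, assuming a POSIX-compatible shell; the helper name sge_report_env is hypothetical and is not part of StrainGE or the module system.

```shell
# Hypothetical helper: report whether each variable documented above
# is set in the current shell. Not part of StrainGE itself.
sge_report_env() {
    for name in SGE_HOME SGE_BIN SGE_DATA; do
        # Portable indirect lookup of the variable whose name is in $name.
        eval "value=\${$name-}"
        if [ -n "$value" ]; then
            printf '%s=%s\n' "$name" "$value"
        else
            printf '%s is not set (has the module been loaded?)\n' "$name"
        fi
    done
}

sge_report_env
```

Run it once after module load strainge; any variable reported as not set suggests the module was not loaded in that shell.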
Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive
[user@cn3144 ~]$ module load strainge
[+] Loading singularity 4.0.1 on cn3144
[+] Loading strainge 1.3.9
[user@cn3144 ~]$ strainge
2024-05-29 11:11:45,581 - WARNING:root:DEPRECATION WARNING - the `strainge` CLI program is deprecated, please use `straingst` or `straingr` instead.
usage: strainge [-h] [--version] [-v]
                {kmerize,kmersim,cluster,createdb,search,call,view,compare,tree,stats,plot}
                ...

================================
StrainGE: Strain Genome Explorer
================================

A set of tools for strain-level analysis in mixed metagenomic samples
---------------------------------------------------------------------

Version: 1.3.9

DEPRECATED: please use `straingst` or `straingr` instead.

optional arguments:
  -h, --help     show this help message and exit
  --version      show program's version number and exit
  -v, --verbose  Increase verbosity level, number of levels: 0, 1, 2

Subcommands:
  {kmerize,kmersim,cluster,createdb,search,call,view,compare,tree,stats,plot}
    kmerize      K-merize a given reference sequence or a sample read dataset.
    kmersim      Compare k-mer sets with each other. Both all-vs-all and
                 one-vs-all is supported.
    cluster      Group k-mer sets that are very similar to each other together.
    createdb     Create pan-genome database in HDF5 format from a list of
                 k-merized strains.
    search       StrainGST: strain genome search tool. Identify close
                 reference genomes to strains present in a sample.
    call         StrainGR: strain-aware variant caller for metagenomic samples
    view         View call statistics stored in a HDF5 file and output results
                 to different file formats
    compare      Compare strains and variant calls in two different samples.
                 Reads of both samples must be aligned to the same reference.
    tree         Build an approximate phylogenetic tree based on a given
                 distance matrix, using neighbour joining.
    stats        Obtain statistics about a given k-mer set.
    plot         Generate plots for a given k-mer set.

[user@cn3144 ~]$ straingr
usage: straingr [-h] [--version] [-v]
                {prepare-ref,call,view,compare,dist,tree} ...

================================
StrainGE: Strain Genome Explorer
================================

A set of tools for strain-level analysis in mixed metagenomic samples
---------------------------------------------------------------------

Version: 1.3.9

optional arguments:
  -h, --help     show this help message and exit
  --version      show program's version number and exit
  -v, --verbose  Increase verbosity level, number of levels: 0, 1, 2

Subcommands:
  {prepare-ref,call,view,compare,dist,tree}
    prepare-ref  Prepare a concatenated reference for StrainGR variant calling.
    call         StrainGR: strain-aware variant caller for metagenomic samples
    view         View call statistics stored in a HDF5 file and output results
                 to different file formats
    compare      Compare strains and variant calls in two different samples.
                 Reads of both samples must be aligned to the same reference.
    dist         For all strains across multiple samples close to the same
                 reference genome, calculate the pairwise genetic distance and
                 output it in matrix form.
    tree         Build an approximate phylogenetic tree based on a given
                 distance matrix, using neighbour joining.

[user@cn3144 ~]$ straingst
usage: straingst [-h] [--version] [-v]
                 {kmerize,kmersim,kmermerge,cluster,createdb,stats,plot,run}
                 ...

================================
StrainGE: Strain Genome Explorer
================================

A set of tools for strain-level analysis in mixed metagenomic samples
---------------------------------------------------------------------

Version: 1.3.9

optional arguments:
  -h, --help     show this help message and exit
  --version      show program's version number and exit
  -v, --verbose  Increase verbosity level, number of levels: 0, 1, 2

Subcommands:
  {kmerize,kmersim,kmermerge,cluster,createdb,stats,plot,run}
    kmerize      K-merize a given reference sequence or a sample read dataset.
    kmersim      Compare k-mer sets with each other. Both all-vs-all and
                 one-vs-all is supported.
    kmermerge    Merge k-mer set files.
    cluster      Group k-mer sets that are very similar to each other together.
    createdb     Create pan-genome database in HDF5 format from a list of
                 k-merized strains.
    stats        Obtain statistics about a given k-mer set.
    plot         Generate plots for a given k-mer set.
    run          StrainGST: strain genome search tool. Identify close
                 reference genomes to strains present in a sample.
End the interactive session:
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$