shmlast is a reimplementation of the Conditional Reciprocal Best Hits algorithm for finding potential orthologs between a transcriptome and a species-specific protein database. It uses the LAST aligner and the pydata stack to achieve much better performance while staying in the Python ecosystem.
Allocate an interactive session and run the program. Sample session:
[user@biowulf]$ sinteractive --mem=4g
[user@cn3200 ~]$ module load shmlast
[+] Loading singularity 3.10.0 on cn1113
[+] Loading shmlast 1.6
[user@cn3200 ~]$ shmlast
usage: shmlast [-h] [--version] {rbl,crbl} ...

shmlast is a reimplementation of the Conditional Reciprocal Best Hits
algorithm for finding potential orthologs between a transcriptome and a
species-specific protein database. It uses the LAST aligner and the pydata
stack to achieve much better performance while staying in the Python
ecosystem.

positional arguments:
  {rbl,crbl}

optional arguments:
  -h, --help  show this help message and exit
  --version   show program's version number and exit

Running the crbl subcommand on the bundled sample data (test-transcript.fa and test-protein.fa) prints a task log like the following:

shmlast 1.2.1 -- Camille Scott, 2016
------------------------------------
subcommand: Conditional Reciprocal Best LAST
doit action: run

--- Begin Task Execution ---
.  rename:/usr/local/apps/shmlast/1.2.1/sample_data/test-transcript.fa:
    * Python: rename_input
.  rename:/usr/local/apps/shmlast/1.2.1/sample_data/test-protein.fa:
    * Python: rename_input
.  translate:.test-transcript.fa:
    * Python: function translate_fastx
.  lastdb:.test-transcript.fa.pep:
    * Cmd: `/usr/local/apps/shmlast/1.2.1/bin/lastdb -p -w3 .test-transcript.fa.pep .test-transcript.fa.pep`
.  lastdb:.test-protein.fa:
    * Cmd: `/usr/local/apps/shmlast/1.2.1/bin/lastdb -p -w3 .test-protein.fa .test-protein.fa`
.  lastal:.test-protein.fa.x.test-transcript.fa.pep.maf:
    * Cmd: `cat .test-protein.fa | /usr/local/apps/parallel/20171222/bin/parallel --round-robin --pipe -L 2 -N 10000 --gnu -j 1 -a .test-protein.fa /usr/local/apps/shmlast/1.2.1/bin/lastal -D100000.0 .test-transcript.fa.pep > .test-protein.fa.x.test-transcript.fa.pep.maf`
.  lastal:.test-transcript.fa.pep.x.test-protein.fa.maf:
    * Cmd: `cat .test-transcript.fa.pep | /usr/local/apps/parallel/20171222/bin/parallel --round-robin --pipe -L 2 -N 10000 --gnu -j 1 -a .test-transcript.fa.pep /usr/local/apps/shmlast/1.2.1/bin/lastal -D100000.0 .test-protein.fa > .test-transcript.fa.pep.x.test-protein.fa.maf`
.  fit_and_filter_crbl_hits:
    * Python: do_crbl_fit_and_filter

End the interactive session:

[user@cn3200 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
Create a batch input file (e.g. shmlast.sh). For example:
#!/bin/bash
#SBATCH --mem=4g
module load shmlast
cd /data/$USER
shmlast crbl -q $SHMLAST_DATA/test-transcript.fa -d $SHMLAST_DATA/test-protein.fa
Submit this job using the Slurm sbatch command.
sbatch shmlast.sh
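A batch job can sit in the queue for a while before it fails on a bad input path, so it may be worth checking the inputs at the top of the script. The following is a minimal sketch, not part of shmlast: the helper name `check_fasta` is hypothetical, and it assumes uncompressed FASTA files whose first character is `>`.

```shell
#!/bin/bash
# check_fasta is a hypothetical helper, not part of shmlast.
# It assumes an uncompressed FASTA file that starts with '>'.
check_fasta() {
    local path="$1"
    # -s is true only if the file exists and is not empty.
    if [ ! -s "$path" ]; then
        echo "missing or empty file: $path" >&2
        return 1
    fi
    # Read the first byte and compare it with the FASTA header marker.
    if [ "$(head -c 1 "$path")" != ">" ]; then
        echo "does not look like FASTA (no leading '>'): $path" >&2
        return 1
    fi
}

# Possible use in the batch script, before the shmlast line:
# check_fasta "$SHMLAST_DATA/test-transcript.fa" || exit 1
# check_fasta "$SHMLAST_DATA/test-protein.fa" || exit 1
```

The check is deliberately shallow: it catches wrong paths and empty files, but it does not validate the sequences themselves.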
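Create a swarmfile (e.g. shmlast.swarm). Each line of a swarmfile is run as a separate job, so the commands for one run are chained on a single line. For example:
#!/bin/bash
module load shmlast; cd /data/$USER; shmlast crbl -q $SHMLAST_DATA/test-transcript.fa -d $SHMLAST_DATA/test-protein.fa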
Submit this job using the swarm command.
swarm -f shmlast.swarm -g 4
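Swarm is most useful when there are many inputs, since each line of the swarmfile becomes its own job. The sketch below writes one `shmlast crbl` line per transcriptome file. The function name `make_swarmfile`, its arguments, and the directory layout are hypothetical. Each job gets its own working directory because the sample session shows shmlast writing intermediate files named after its inputs into the current directory, so concurrent runs against the same database might otherwise collide. That is an inference from the log, not documented behavior.

```shell
#!/bin/bash
# Sketch only: make_swarmfile and its arguments are hypothetical names,
# not part of shmlast or swarm.
make_swarmfile() {
    local query_dir="$1"   # directory of transcriptome FASTA files (*.fa)
    local database="$2"    # protein FASTA file used as the database
    local swarmfile="$3"   # swarmfile to create
    local work_root="$4"   # parent directory for per-job working directories

    : > "$swarmfile"       # create or truncate the swarmfile
    local query name
    for query in "$query_dir"/*.fa; do
        [ -e "$query" ] || continue      # the glob matched nothing
        name=$(basename "$query" .fa)
        # One line per job; paths are assumed to contain no whitespace.
        printf 'mkdir -p %s/%s && cd %s/%s && module load shmlast && shmlast crbl -q %s -d %s\n' \
            "$work_root" "$name" "$work_root" "$name" "$query" "$database" >> "$swarmfile"
    done
}

# Example call (all paths are placeholders):
# make_swarmfile /data/$USER/transcriptomes /data/$USER/proteins.fa shmlast.swarm /data/$USER/shmlast_runs
```

The resulting file can be submitted with the same `swarm -f shmlast.swarm -g 4` command shown above.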