SonicParanoid on Biowulf
SonicParanoid is a stand-alone software tool for the identification of orthologous relationships among multiple species.
Important Notes
- Module Name: sonicparanoid (see the modules page for more information)
- This application is multi-threaded. Set the number of threads with "--threads N" (short form "-t N"), and keep N no larger than the number of CPUs allocated to the job.
- Example files can be obtained using the command sonicparanoid-get-test-data. (See examples below.)
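As a minimal sketch of matching the thread count to the allocation, the snippet below reads the CPU count from the Slurm environment and falls back to 1 outside of a job. The helper name `pick_threads` is hypothetical and not part of SonicParanoid or Slurm; `SLURM_CPUS_PER_TASK` is only set when CPUs per task were requested explicitly (for example with `-c` or `--cpus-per-task`).

```shell
#!/bin/bash
# Hypothetical helper: choose a thread count for "--threads".
# Prefers SLURM_CPUS_PER_TASK, then SLURM_CPUS_ON_NODE, then 1 (no Slurm job).
pick_threads() {
    echo "${SLURM_CPUS_PER_TASK:-${SLURM_CPUS_ON_NODE:-1}}"
}

threads=$(pick_threads)
echo "sonicparanoid would be run with --threads ${threads}"
```

The result can then be passed on the command line as `--threads "$(pick_threads)"`.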
Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.
Allocate an interactive session and run the program.
Sample session (user input follows each shell prompt):
[user@biowulf ~]$ sinteractive -c16 --mem=32g --gres=lscratch:10
salloc.exe: Pending job allocation 59415428
salloc.exe: job 59415428 queued and waiting for resources
salloc.exe: job 59415428 has been allocated resources
salloc.exe: Granted job allocation 59415428
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3201 are ready for job
srun: error: x11: no local DISPLAY defined, skipping

[user@cn3201 ~]$ module load sonicparanoid
[+] Loading sonicparanoid 1.3.2 on cn3201
[+] Loading singularity 3.5.3 on cn3201

[user@cn3201 ~]$ cd /lscratch/$SLURM_JOB_ID

[user@cn3201 59415428]$ sonicparanoid-get-test-data --output-directory .
/lscratch/59415428/sonicparanoid_test/
INFO: all test files were succesfully copied to
/lscratch/59415428/sonicparanoid_test/
Go inside the directory
/lscratch/59415428/sonicparanoid_test/
and type
sonicparanoid -i ./test_input -o ./test_output -m fast -t 4

[user@cn3201 59415428]$ cd sonicparanoid_test/

[user@cn3201 sonicparanoid_test]$ tree
.
|-- test_input
|   |-- chlamydia_trachomatis
|   |-- deinococcus_radiodurans
|   |-- gloeobacter_violaceus
|   `-- thermotoga_maritima
`-- test_output
    `-- README.txt

2 directories, 5 files

[user@cn3201 sonicparanoid_test]$ sonicparanoid \
    --input-directory ./test_input \
    --output-directory ./test_output \
    --mode fast \
    --threads $SLURM_CPUS_ON_NODE

Run START: Mon Jun 8 17:13:53 2020

SonicParanoid 1.3.2 will be executed with the following parameters:
Run ID: sonic_8620171353_fast_16cpus_ml05
Run directory: /lscratch/59415428/sonicparanoid_test/test_output/runs/sonic_8620171353_fast_16cpus_ml05
Input directory: /lscratch/59415428/sonicparanoid_test/test_input/
Input proteomes: 4
Output directory: /lscratch/59415428/sonicparanoid_test/test_output
Alignments directory: /lscratch/59415428/sonicparanoid_test/test_output/alignments/
Pairwise tables directory: /lscratch/59415428/sonicparanoid_test/test_output/runs/sonic_8620171353_fast_16cpus_ml05/pairwise_orthologs/
Directory with ortholog groups: /lscratch/59415428/sonicparanoid_test/test_output/runs/sonic_8620171353_fast_16cpus_ml05/ortholog_groups/
Pairwise tables database directory: /lscratch/59415428/sonicparanoid_test/test_output/orthologs_db/
Runs directory: /lscratch/59415428/sonicparanoid_test/test_output/runs/
Update run: False
Create pre-filter indexes: True
Complete overwrite: False
Re-create ortholog tables: False
Threads: 16
Memory per thread (Gigabytes): 15.73
Minimum memory per thread (Gigabytes): 1.75
Run mode: fast (MMseqs2 s=2.5)
MCL inflation: 1.50
/usr/local/lib/python3.6/dist-packages/sklearn/base.py:334: UserWarning: Trying to unpickle estimator DecisionTreeClassifier from version 0.22.2.post1 when using version 0.23.1. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
/usr/local/lib/python3.6/dist-packages/sklearn/base.py:334: UserWarning: Trying to unpickle estimator AdaBoostClassifier from version 0.22.2.post1 when using version 0.23.1. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)

For the 4 input species 6 combinations are possible.
16 MMseqs2 alignments will be performed...
Creating 4 MMseqs2 databases...
MMseqs2 databases creation elapsed time (seconds): 4.195
All-vs-all alignments elapsed time (seconds): 8.272
Predicting 6 ortholog tables...
Ortholog tables creation elapsed time (seconds): 0.26
Creating ortholog groups...
Creating orthology matrixes...
Ortholog matrixes creation elapsed time (seconds): 0.072
Merging inparalog matrixes...
Inparalogs merging elapsed time (seconds): 0.071
Creating input matrix for MCL...
MCL graph creation elapsed time (seconds): 0.091
Running MCL...
MCL execution elapsed time (seconds): 3.461
Generating final output files...
Elapsed time for the creation of final output (seconds): 0.673
Ortholog groups creation elapsed time (seconds): 4.408
Total elapsed time (seconds): 18.217

[user@cn3201 sonicparanoid_test]$ exit
exit
salloc.exe: Relinquishing job allocation 59415428
salloc.exe: Job allocation 59415428 has been revoked.

[user@biowulf ~]$
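The log in the sample session shows each run written to its own directory under `<output>/runs/`, with the ortholog groups in an `ortholog_groups/` subdirectory. As a rough sketch, the helper below finds the most recently modified run directory so its results can be inspected or copied out of `/lscratch` before the session ends. The function name `latest_run_dir` is hypothetical. The layout is assumed from the log shown for version 1.3.2 and may differ in other versions.

```shell
#!/bin/bash
# Hypothetical helper: print the most recently modified run directory
# under <output_dir>/runs/, or an empty line if there is none.
latest_run_dir() {
    local newest="" d
    for d in "$1"/runs/*/; do
        [ -d "$d" ] || continue            # skip the unexpanded glob when nothing matches
        if [ -z "$newest" ] || [ "$d" -nt "$newest" ]; then
            newest=$d
        fi
    done
    printf '%s\n' "$newest"
}

run_dir=$(latest_run_dir ./test_output)
if [ -n "$run_dir" ] && [ -d "${run_dir}ortholog_groups" ]; then
    ls "${run_dir}ortholog_groups"
else
    echo "no finished run found under ./test_output"
fi
```

The directory it prints can then be copied to permanent storage with `cp -r`, since local scratch space is not intended to outlive the job.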
Batch job
Most jobs should be run as batch jobs.
Create a batch input file (e.g. sonicparanoid.sh). For example:
#!/bin/bash
set -e
module load sonicparanoid
sonicparanoid \
    --input-directory ./test_input \
    --output-directory ./test_output \
    --mode fast \
    --threads $SLURM_CPUS_ON_NODE
Submit this job using the Slurm sbatch command.
sbatch [--cpus-per-task=#] [--mem=#] sonicparanoid.sh
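Resources can also be requested inside the script with `#SBATCH` directives, so that a plain `sbatch sonicparanoid.sh` is enough. The sketch below is one possible variant of the batch script, not a tuned recommendation: the resource values simply mirror the interactive example, and `INPUT_DIR` and `OUTPUT_DIR` are hypothetical variables introduced here for illustration. The guards around `module` and `sonicparanoid` only make the script print the command instead of failing when it is tried outside the cluster.

```shell
#!/bin/bash
#SBATCH --cpus-per-task=16
#SBATCH --mem=32g
#SBATCH --gres=lscratch:10
set -e

# Hypothetical variables; defaults match the test data layout.
input_dir="${INPUT_DIR:-./test_input}"
output_dir="${OUTPUT_DIR:-./test_output}"
threads="${SLURM_CPUS_PER_TASK:-1}"     # set by Slurm from --cpus-per-task

# Build the command as an array so paths with spaces stay intact.
cmd=(sonicparanoid
     --input-directory "${input_dir}"
     --output-directory "${output_dir}"
     --mode fast
     --threads "${threads}")

# Load the module only where the module system exists.
if type module >/dev/null 2>&1; then
    module load sonicparanoid
fi

if command -v sonicparanoid >/dev/null 2>&1; then
    "${cmd[@]}"
else
    echo "dry run: ${cmd[*]}"           # tool not available on this machine
fi
```

Options given on the `sbatch` command line override the `#SBATCH` directives in the script.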
Swarm of Jobs
A swarm of jobs is an easy way to submit a set of independent commands requiring identical resources.
Create a swarmfile (e.g. sonicparanoid.swarm). For example:
sonicparanoid -i ./test_input1 -o ./test_output1 -m fast -t $SLURM_CPUS_ON_NODE
sonicparanoid -i ./test_input2 -o ./test_output2 -m fast -t $SLURM_CPUS_ON_NODE
sonicparanoid -i ./test_input3 -o ./test_output3 -m fast -t $SLURM_CPUS_ON_NODE
sonicparanoid -i ./test_input4 -o ./test_output4 -m fast -t $SLURM_CPUS_ON_NODE
Submit this job using the swarm command.
swarm -f sonicparanoid.swarm [-g #] [-t #] --module sonicparanoid
where
-g # | Number of Gigabytes of memory required for each process (1 line in the swarm command file) |
-t # | Number of threads/CPUs required for each process (1 line in the swarm command file). |
--module sonicparanoid | Loads the sonicparanoid module for each subjob in the swarm |
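For larger sets of inputs, the swarmfile can be generated rather than written by hand. The sketch below assumes the numbered directory naming used in the swarmfile example (`./test_input1`, `./test_output1`, and so on). The function name `make_swarmfile` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: write one sonicparanoid command per numbered input
# directory (./test_input1 .. ./test_inputN) to the given swarmfile.
make_swarmfile() {
    local swarmfile=$1 count=$2 i
    : > "${swarmfile}"                  # create or truncate the file
    for ((i = 1; i <= count; i++)); do
        # Single quotes keep $SLURM_CPUS_ON_NODE literal, so it is expanded
        # inside each subjob rather than when the file is written.
        printf 'sonicparanoid -i ./test_input%d -o ./test_output%d -m fast -t $SLURM_CPUS_ON_NODE\n' \
            "$i" "$i" >> "${swarmfile}"
    done
}

make_swarmfile sonicparanoid.swarm 4
cat sonicparanoid.swarm
```

The resulting file is submitted with the same `swarm` command shown above.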