DeepLoc2 uses deep learning to predict subcellular localization of eukaryotic proteins.
DeepLoc 2.0 predicts the subcellular localization(s) of eukaryotic proteins. It is a multi-label predictor, which means that it can predict one or more localizations for any given protein. It differentiates between 10 localizations: Nucleus, Cytoplasm, Extracellular, Mitochondrion, Cell membrane, Endoplasmic reticulum, Chloroplast, Golgi apparatus, Lysosome/Vacuole and Peroxisome. Additionally, DeepLoc 2.0 can predict the presence of the sorting signal(s) that influenced the prediction of the subcellular localization(s).
Allocate an interactive session and run the program on the test data. Then use diff to compare the output with the reference result supplied with the test data.
Sample session (user input in bold):
[user@biowulf]$ sinteractive --mem=24G --gres=lscratch:5
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144 ~]$ module load deeploc
[user@cn3144 ~]$ mkdir -p /data/$USER/.cache/torch/hub
[user@cn3144 ~]$ cp -r $DEEPLOC_TRAIN_DATA/checkpoints /data/$USER/.cache/torch/hub/
[user@cn3144 ~]$ cd /lscratch/$SLURM_JOB_ID
[user@cn3144 ~]$ cp $DEEPLOC_TEST_DATA/test.fasta .
[user@cn3144 ~]$ deeploc2 -f test.fasta
[user@cn3144 ~]$ diff outputs/results_20230101-000000.csv $DEEPLOC_TEST_DATA/results_test.csv
[user@cn3144 ~]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf ~]$
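The results file name in the session above contains a timestamp, so the name passed to diff will generally differ from run to run. The following is a minimal sketch for locating the most recent results file, assuming the output files follow the `results_<timestamp>.csv` pattern in `outputs/` seen in the sample session. The helper name `latest_results` is hypothetical and not part of DeepLoc.

```shell
# Hypothetical helper: print the most recently modified results CSV in a directory.
# Assumes files are named results_<timestamp>.csv, as in the sample session.
latest_results() {
    local dir="${1:-outputs}" f latest=""
    for f in "$dir"/results_*.csv; do
        [ -e "$f" ] || continue          # the glob matched nothing
        # keep the file with the newest modification time
        if [ -z "$latest" ] || [ "$f" -nt "$latest" ]; then
            latest="$f"
        fi
    done
    printf '%s\n' "$latest"              # empty line if no results file exists
}
```

With this helper defined, the comparison step could be written as `diff "$(latest_results outputs)" "$DEEPLOC_TEST_DATA/results_test.csv"`.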
Create a batch input file (e.g. deeploc.sh). For example:
#!/bin/bash
set -e
module load deeploc
cd /data/$USER
deeploc2 -f input.fasta
Submit this job using the Slurm sbatch command.
sbatch [--mem=#] deeploc.sh
Create a swarmfile (e.g. deeploc.swarm). For example:
deeploc2 -f 01.fasta -o results_01
deeploc2 -f 02.fasta -o results_02
deeploc2 -f 03.fasta -o results_03
deeploc2 -f 04.fasta -o results_04
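For more than a handful of inputs, the swarmfile can be generated rather than typed by hand. The sketch below assumes every input is a `*.fasta` file in a single directory and uses only the `-f` and `-o` options shown in the example above. The helper name `make_swarm` is hypothetical, and paths containing spaces are not handled.

```shell
# Hypothetical helper: print one deeploc2 command per FASTA file in a directory.
# Uses only the -f and -o options that appear in the swarmfile example.
make_swarm() {
    local dir="${1:-.}" f name
    for f in "$dir"/*.fasta; do
        [ -e "$f" ] || continue          # skip if no FASTA files are present
        name=$(basename "$f" .fasta)     # e.g. 01.fasta -> 01
        printf 'deeploc2 -f %s -o results_%s\n' "$f" "$name"
    done
}
```

Running `make_swarm /data/$USER > deeploc.swarm` would then write one command line per input file.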
Submit this job using the swarm command.
swarm -f deeploc.swarm [-g #] --module deeploc
where
-g # | Number of Gigabytes of memory required for each process (1 line in the swarm command file) |
--module deeploc | Loads the deeploc module for each subjob in the swarm |