RoseTTAFold predicts protein structures and interactions using a three-track neural network, in which information at the 1D sequence level, the 2D distance map level, and the 3D coordinate level is successively transformed and integrated.
cp -r ${RFAA_CONF:-none} /data/$USER/
module load RoseTTAFold
cp -r ${ROSETTAFOLD_NETWORK:-none} ~/
cp -r ${ROSETTAFOLD_WEIGHTS:-none} ~/
cp -r ${ROSETTAFOLD_NETWORK_2TRACK:-none} ~/
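The `${VAR:-none}` form used in these commands expands to the literal word none when the variable is unset or empty (for example, when the module has not been loaded), so cp fails on a missing path instead of running with an empty argument. A minimal sketch of a more explicit guard follows; copy_module_dir is a hypothetical helper name and is not provided by the RoseTTAFold module.

```shell
# copy_module_dir is a hypothetical helper, not provided by the module.
# Usage: copy_module_dir SOURCE_DIR DEST_DIR
# Copies SOURCE_DIR into DEST_DIR only if SOURCE_DIR is a real directory.
copy_module_dir() {
    src=${1:-none}
    dest=${2:-.}
    if [ ! -d "$src" ]; then
        echo "error: '$src' is not a directory; was the module loaded?" >&2
        return 1
    fi
    mkdir -p "$dest" && cp -r "$src" "$dest"/
}

# Example call (assumes the module has set ROSETTAFOLD_NETWORK):
# copy_module_dir "${ROSETTAFOLD_NETWORK:-none}" ~
```

The check runs before any directory is created, so a failed call leaves the destination untouched.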
Allocate an interactive session and run the program.
Sample session (user input follows the $ prompt):
[user@biowulf]$ sinteractive --cpus-per-task=10 --mem=60G
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144]$ module load RoseTTAFold
[user@cn3144]$ mkdir /data/$USER/rosettafold_test/
[user@cn3144]$ cd /data/$USER/rosettafold_test/
[user@cn3144]$ cp -r ${ROSETTAFOLD_TEST_DATA:-none}/* .
[user@cn3144]$ run_e2e_ver_part1.sh input.fa e2e_out
Running HHblits
Running PSIPRED
Running hhsearch
Running end-to-end prediction
Done with part1, please run part2 on GPU node
[user@cn3144]$ run_pyrosetta_ver_part1.sh input.fa pyrosetta_out
Running HHblits
Running PSIPRED
Running hhsearch
Predicting distance and orientations
Running parallel RosettaTR.py
Done with part1, please run part2 at GPU node
[user@cn3144]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf]$ sinteractive --cpus-per-task=2 --mem=10g --gres=gpu:p100:1
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144]$ module load RoseTTAFold
[user@cn3144]$ cd /data/$USER/rosettafold_test/
[user@cn3144]$ run_e2e_ver_part2.sh input.fa e2e_out
run_e2e_ver_part2.sh input.fa e2e_out
Running end-to-end prediction
Done with part2 (prediction)
[user@cn3144]$ run_pyrosetta_ver_part2.sh input.fa pyrosetta_out
Picking final models
Final models saved in: pyrosetta_out/model
Done with part2 (pick final models)
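The session above splits the pipeline into a CPU stage (part1) and a GPU stage (part2). One way to run both without two interactive sessions is to chain two batch jobs with a Slurm dependency. The sketch below is an outline under stated assumptions, not a site-provided workflow: submit_stage and submit_e2e are hypothetical helpers, the script names e2e_part1.sh and e2e_part2.sh are made up for illustration, and the resource values are copied from the sinteractive calls above. Setting DRY_RUN=1 prints the sbatch commands instead of submitting them.

```shell
# submit_stage is a hypothetical helper: it submits one batch script with
# sbatch and prints the job ID. With DRY_RUN=1 it only echoes the command
# (to stderr) and prints a placeholder ID, so nothing is submitted.
submit_stage() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sbatch --parsable $*" >&2
        echo "0"
    else
        sbatch --parsable "$@"
    fi
}

# Write one batch script per stage in the current directory, then chain
# them so that part2 starts only after part1 finishes successfully.
# Usage: submit_e2e INPUT_FASTA OUTPUT_DIR
submit_e2e() {
    fasta=$1
    outdir=$2
    for part in part1 part2; do
        cat > "e2e_$part.sh" <<EOF
#!/bin/bash
set -e
module load RoseTTAFold
cd /data/\$USER/rosettafold_test/
run_e2e_ver_$part.sh $fasta $outdir
EOF
    done
    id1=$(submit_stage --cpus-per-task=10 --mem=60g e2e_part1.sh) || return 1
    id2=$(submit_stage --cpus-per-task=2 --mem=10g \
        --partition=gpu --gres=gpu:p100:1 \
        --dependency=afterok:"$id1" e2e_part2.sh) || return 1
    echo "part1=$id1 part2=$id2"
}

# Example call:
# DRY_RUN=1 submit_e2e input.fa e2e_out
```

The afterok dependency keeps the GPU job pending until the CPU job exits with status 0, so GPU time is not reserved while HHblits and hhsearch are still running.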
For protein-protein interaction (PPI) screening using the faster 2-track version:
[user@biowulf]$ sinteractive --cpus-per-task=2 --mem=10g --gres=gpu:p100:1
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144]$ module load RoseTTAFold
[user@cn3144]$ mkdir /data/$USER/rosettafold_test/
[user@cn3144]$ cd /data/$USER/rosettafold_test/
[user@cn3144]$ cp -r ${ROSETTAFOLD_TEST_DATA:-none}/* .
[user@cn3144]$ cd complex_2track
[user@cn3144]$ python ~/network_2track/predict_msa.py -msa input.a3m -npz complex_2track.npz -L1 218
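In the command above, -L1 218 appears to give the length of the first chain in the paired alignment. This is an assumption based on the matching -Ls 218 310 values in the complex modeling batch script on this page. A minimal sketch for computing that number from a FASTA file rather than typing it by hand; first_seq_length is a hypothetical helper and chainA.fa a hypothetical file name.

```shell
# first_seq_length is a hypothetical helper, not part of RoseTTAFold.
# Prints the number of residues in the first record of a FASTA file.
first_seq_length() {
    awk '
        /^>/ { if (seen) exit; seen = 1; next }        # stop at the second header
        seen { gsub(/[ \t\r]/, ""); n += length($0) }  # count residues only
        END  { print n + 0 }
    ' "$1"
}

# Example call (chainA.fa is a hypothetical file holding the first chain):
# python ~/network_2track/predict_msa.py -msa input.a3m -npz complex_2track.npz \
#     -L1 "$(first_seq_length chainA.fa)"
```

Sequence lines are summed across line breaks, so wrapped FASTA records are counted correctly.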
Create a batch input file (e.g. rosettafold.sh). For example:
#!/bin/bash
set -e
module load RoseTTAFold
cd /data/$USER/rosettafold_test/
cp -r ${ROSETTAFOLD_TEST_DATA:-none}/* .
cd complex_modeling
python ~/network/predict_complex.py -i paired.a3m -o complex3 -Ls 218 310
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=2 --mem=10g --partition=gpu --gres=gpu:v100x:1 rosettafold.sh
[user@biowulf]$ sinteractive --cpus-per-task=4 --mem=16g --gres=gpu:p100:1
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144]$ module load RoseTTAFold/allatom
[user@cn3144]$ cd /data/$USER/
[user@cn3144]$ cp -r ${ROSETTAFOLD_TEST_DATA:-none} .
[user@cn3144]$ python -m rf2aa.run_inference --config-name protein
[user@cn3144]$ cp -r ${RFAA_CONF:-none} . # copy the config and modify it to use customized input
[user@cn3144]$ python -m rf2aa.run_inference \
    --config-name protein \
    --config-path /data/$USER/config/inference
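The --config-name and --config-path options shown above follow the Hydra convention, in which the name is resolved to NAME.yaml inside the config directory. That is an assumption about how rf2aa loads its configs and is worth confirming against the copied files. A minimal sketch of a pre-flight check that catches a wrong path before the GPU job starts; config_exists is a hypothetical helper.

```shell
# config_exists is a hypothetical helper, not part of rf2aa.
# Usage: config_exists CONFIG_DIR CONFIG_NAME
# Succeeds only if CONFIG_DIR/CONFIG_NAME.yaml is a regular file.
config_exists() {
    if [ -f "$1/$2.yaml" ]; then
        return 0
    fi
    echo "error: $1/$2.yaml not found" >&2
    return 1
}

# Example call, using the paths from the session above:
# config_exists "/data/$USER/config/inference" protein &&
#     python -m rf2aa.run_inference --config-name protein \
#         --config-path "/data/$USER/config/inference"
```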
Create a batch input file (e.g. rosettafold.sh). For example:
#!/bin/bash
set -e
module load RoseTTAFold/allatom
cd /data/$USER/rosettafold_test/
cp -r ${ROSETTAFOLD_TEST_DATA:-none} .
python -m rf2aa.run_inference --config-name nucleic_acid
Submit this job using the Slurm sbatch command.
sbatch --cpus-per-task=2 --mem=10g --partition=gpu --gres=gpu:v100x:1 rosettafold.sh