rosettafoldna on Biowulf

RoseTTAFold2NA rapidly produces 3D structure models with confidence estimates for protein-DNA and protein-RNA complexes, and for RNA tertiary structures.

References:

Documentation
Important Notes

Interactive job
Interactive jobs should be used for debugging, graphics, or applications that cannot be run as batch jobs.

Allocate an interactive session and run the program.
Sample session (user input follows the $ prompt):

[user@biowulf]$ sinteractive --cpus-per-task=10 --mem=70G
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job

[user@cn3144]$ module load rosettafoldna
[user@cn3144]$ mkdir /data/$USER/rosettafoldna_test/
[user@cn3144]$ cd /data/$USER/rosettafoldna_test/
[user@cn3144]$ cp -r ${ROSETTAFOLDNA_TEST_DATA:-none}/* .
[user@cn3144]$ tree .
.
├── protein.fa
└── RNA.fa

0 directories, 2 files

[user@cn3144]$ run_RF2NA_part1.sh test_o protein.fa R:RNA.fa
Running HHblits
Running PSIPRED
Running hhsearch
Running rMSA (lite)
Done with part1, please run part2 on GPU node (>= V100)

[user@cn3144]$ exit
salloc.exe: Relinquishing job allocation 46116226
[user@biowulf]$ sinteractive --cpus-per-task=2 --mem=10g --gres=gpu:v100:1
salloc.exe: Pending job allocation 46116226
salloc.exe: job 46116226 queued and waiting for resources
salloc.exe: job 46116226 has been allocated resources
salloc.exe: Granted job allocation 46116226
salloc.exe: Waiting for resource configuration
salloc.exe: Nodes cn3144 are ready for job
[user@cn3144]$ module load rosettafoldna
[user@cn3144]$ cd /data/$USER/rosettafoldna_test/
[user@cn3144]$ run_RF2NA_part2.sh test_o protein.fa R:RNA.fa
Running RoseTTAFold2NA to predict structures
Running on GPU
  msa[msa == "U"] = 30
           plddt    best
RECYCLE  0   0.874  -1.000
RECYCLE  1   0.892   0.874
RECYCLE  2   0.898   0.892
RECYCLE  3   0.899   0.898
RECYCLE  4   0.901   0.899
RECYCLE  5   0.902   0.901
RECYCLE  6   0.901   0.902
RECYCLE  7   0.902   0.902
RECYCLE  8   0.901   0.902
RECYCLE  9   0.901   0.902
Done2 with part2 (prediction)
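
The RECYCLE table printed by part 2 reports a pLDDT value for each recycle. If that output has been saved to a file, a small helper can report which recycle scored highest. The following is a minimal sketch, not part of the rosettafoldna module: best_plddt is a hypothetical name, and it assumes the log lines have exactly the layout shown in the session above (RECYCLE, recycle number, plddt, best).

```shell
#!/bin/bash
# best_plddt (hypothetical helper): scan part 2 output for RECYCLE lines
# and print the first recycle that reached the highest pLDDT.
# Reads the named log file(s), or standard input if none is given.
best_plddt() {
    awk '
        # Columns assumed from the sample output: RECYCLE <n> <plddt> <best>
        $1 == "RECYCLE" && $3 + 0 > best { best = $3 + 0; cycle = $2 }
        END {
            if (cycle == "") exit 1    # no RECYCLE lines were found
            printf "recycle %d plddt %.3f\n", cycle, best
        }
    ' "$@"
}
```

For a batch run, the part 2 output lands in the Slurm output file (slurm-<jobid>.out by default), so a call might look like: best_plddt slurm-1002.out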

Batch job
Most jobs should be run as batch jobs.

Create a batch input file (e.g. rosettafoldna_1.sh) for the first step. For example:


#!/bin/bash
set -e
module load rosettafoldna
mkdir -p /data/$USER/rosettafoldna_test/
cd /data/$USER/rosettafoldna_test/
cp -r ${ROSETTAFOLDNA_TEST_DATA:-none}/* .
run_RF2NA_part1.sh test_o protein.fa R:RNA.fa
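
A missing or malformed input file is cheaper to catch before the searches in part 1 start. The following is a minimal sketch of a pre-flight check that could be placed ahead of the run_RF2NA_part1.sh call. The name check_rf2na_inputs is hypothetical, and treating a single letter followed by a colon (as in R:RNA.fa) as a type prefix is an assumption based on the example command above.

```shell
#!/bin/bash
# check_rf2na_inputs (hypothetical helper): verify that every input exists,
# is non-empty, and starts with a FASTA header line.
check_rf2na_inputs() {
    local arg path status=0
    for arg in "$@"; do
        # Assumption: an argument like "R:RNA.fa" is <type letter>:<path>
        case "$arg" in
            [A-Za-z]:*) path=${arg#?:} ;;
            *)          path=$arg ;;
        esac
        if [ ! -s "$path" ]; then
            echo "missing or empty: $path" >&2
            status=1
        elif ! head -n 1 "$path" | grep -q '^>'; then
            echo "no FASTA header ('>') on first line: $path" >&2
            status=1
        fi
    done
    return $status
}
```

In a script that uses set -e, calling check_rf2na_inputs protein.fa R:RNA.fa before part 1 stops the job as soon as an input fails the check.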

Then create a batch input file (e.g. rosettafoldna_2.sh) for the second step. For example:


#!/bin/bash
set -e
module load rosettafoldna
cd /data/$USER/rosettafoldna_test/
run_RF2NA_part2.sh test_o protein.fa R:RNA.fa
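
Part 2 expects a GPU, so the second script can fail early with a clear message if none is visible. The following is a minimal sketch of such a guard. The name require_gpu is hypothetical; the check uses the standard nvidia-smi query options and assumes nvidia-smi is on the PATH of GPU nodes.

```shell
#!/bin/bash
# require_gpu (hypothetical helper): succeed only if nvidia-smi reports
# at least one GPU, and print the name(s) it found.
require_gpu() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found; is this a GPU node?" >&2
        return 1
    fi
    local gpus
    # One GPU name per line, without the CSV header row
    gpus=$(nvidia-smi --query-gpu=name --format=csv,noheader) || return 1
    if [ -z "$gpus" ]; then
        echo "no GPU visible to this job" >&2
        return 1
    fi
    echo "GPU(s) visible: $gpus"
}
```

Calling require_gpu just before run_RF2NA_part2.sh in a script that uses set -e ends the job immediately when the allocation has no usable GPU.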

Submit both jobs with the Slurm sbatch command. The second job depends on the first, so part 2 starts only after part 1 has completed successfully.

[user@biowulf]$ sbatch --cpus-per-task=10 --mem=70g rosettafoldna_1.sh
1001
[user@biowulf]$ sbatch --dependency=afterok:1001 --cpus-per-task=2 \
                  --mem=10g --partition=gpu --gres=gpu:v100:1 rosettafoldna_2.sh
1002
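
Copying the first job ID into the second command by hand is easy to get wrong. The following is a minimal sketch of a helper that submits both scripts and chains them automatically. The name submit_rf2na_pair is hypothetical, and the resource requests simply repeat the ones shown above. It uses the sbatch --parsable option to obtain a clean job ID, and the afterok dependency type, which starts part 2 only if part 1 exited successfully.

```shell
#!/bin/bash
# submit_rf2na_pair (hypothetical helper): submit the CPU step, then the
# GPU step with a dependency on the first job's ID.
submit_rf2na_pair() {
    local part1_script=$1 part2_script=$2
    local jid1 jid2
    # --parsable prints "jobid" or "jobid;cluster"; keep only the job ID
    jid1=$(sbatch --parsable --cpus-per-task=10 --mem=70g \
                  "$part1_script") || return 1
    jid1=${jid1%%;*}
    jid2=$(sbatch --parsable --dependency=afterok:"$jid1" \
                  --cpus-per-task=2 --mem=10g --partition=gpu \
                  --gres=gpu:v100:1 "$part2_script") || return 1
    jid2=${jid2%%;*}
    echo "part1=$jid1 part2=$jid2"
}
```

Usage, run from the login node: submit_rf2na_pair rosettafoldna_1.sh rosettafoldna_2.sh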