High-Performance Computing at the NIH

The Rosetta++ software suite focuses on the prediction and design of protein structures, protein folding mechanisms, and protein-protein interactions. The Rosetta codes have been repeatedly successful in the Critical Assessment of Techniques for Protein Structure Prediction (CASP) competition as well as the CAPRI competition, and have been modified to address additional aspects of protein design, docking, and structure.

There are multiple versions of Rosetta available. An easy way of selecting the version is to use modules. To see the modules available, type

module avail rosetta

To select a module, type

module load rosetta/[ver]

where [ver] is the version of choice.

Environment variables set by this module:

Interactive use

To run a set of demos, do the following:

[biowulf]$ sinteractive -n 1
$ tar xzvf $ROSETTA3_HOME/../rosetta3_demos.tgz
$ ./run_demos.sh

The script run_demos.sh runs through common protocols, each taking no more than a few minutes to complete. The input and output give a good idea of how to use Rosetta 3.x.

The main Rosetta v3.x executables are:


Performs de novo protein structure prediction


Identifies low free energy sequences for target protein backbones


Predicts the structure of a protein-protein complex from the individual structures of the monomer components


Scores a structure with the Rosetta energy function


Relaxes a structure into a minimal energy state


Builds and scores internal loops for homology modelling

In addition, there are protocols for:

Fragment Files

Fragment files can be generated locally using the make_fragments.pl script. This will generate three secondary structure predictions using SAM, Psipred, and Porter.

Fragment files can also be generated at the Robetta Server Site.

Supporting Programs and Scripts

Here are some supporting programs and scripts for streamlining certain tasks:

File Manipulation

Manipulate input and output files


Evaluating Rosetta output


Cluster decoys and models


Create a batch input file, e.g. 'rosettaRun.sh':

#!/bin/bash
module load rosetta
relax @flags > relax.log

Submit this job like this:

sbatch rosettaRun.sh
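
As a hedged sketch, the batch file can also be written from the shell with a here-document and syntax-checked before it is submitted. The `#!/bin/bash` first line and `set -e` are additions that do not appear in the example above; sbatch expects a batch script to start with an interpreter line.

```shell
# Write rosettaRun.sh; the quoted 'EOF' stops the shell from expanding anything inside.
cat > rosettaRun.sh <<'EOF'
#!/bin/bash
# Stop at the first failing command (an addition, not part of the original example).
set -e
module load rosetta
relax @flags > relax.log
EOF

# Parse the script without running it, so typos surface before the job is queued.
bash -n rosettaRun.sh && echo "rosettaRun.sh: syntax OK"
```

The syntax check only validates the shell script itself; it does not verify that the flags file or the Rosetta inputs exist.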


Create a swarmfile, e.g. 'rosetta.swarm':

AbinitioRelax @flags -out:file:silent abinitio1.out > abinitio1.log
AbinitioRelax @flags -out:file:silent abinitio2.out > abinitio2.log
AbinitioRelax @flags -out:file:silent abinitio3.out > abinitio3.log
AbinitioRelax @flags -out:file:silent abinitio4.out > abinitio4.log
AbinitioRelax @flags -out:file:silent abinitio5.out > abinitio5.log
AbinitioRelax @flags -out:file:silent abinitio6.out > abinitio6.log
AbinitioRelax @flags -out:file:silent abinitio7.out > abinitio7.log
AbinitioRelax @flags -out:file:silent abinitio8.out > abinitio8.log

Submit this job using the 'swarm' command. Example:

swarm -f rosetta.swarm --module rosetta
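
For more than a handful of runs, a swarmfile of this shape can be generated with a short loop rather than typed by hand. This is a minimal sketch: the variable name `n_runs` and the output file naming are illustrative choices, not something swarm or Rosetta requires.

```shell
# Number of independent AbinitioRelax runs to queue (illustrative value).
n_runs=8

# Start from an empty swarmfile, then append one command line per run.
: > rosetta.swarm
for i in $(seq 1 "$n_runs"); do
    echo "AbinitioRelax @flags -out:file:silent abinitio${i}.out > abinitio${i}.log" >> rosetta.swarm
done

# Show how many commands were written.
wc -l < rosetta.swarm
```

Each line is an independent command, so every run writes to its own silent file and log file and no two runs overwrite each other.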

Run as an MPI batch job

The Rosetta executables have been compiled to utilize MPI. Not every executable can run in parallel, but the basic ab initio protocol works well. To use MPI, load an MPI module (*.mpi) instead of the default.

Create a flags file, for example:

[biowulf]$ cat flags
-in:file:native frags/1crn.fasta
-in:file:fasta frags/t001_.fasta
-in:file:frag3 frags/t001_.200.3mers
-in:file:frag9 frags/t001_.200.9mers
-psipred_ss2 frags/t001_.psipred_ss2
-out:nstruct 4
-out:file:silent silent.out 
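
Because a flags file mostly names input files, a quick pre-submission check can catch a missing fragment or FASTA file before the job reaches the queue. The sketch below is an illustration only: `check_flags` is a hypothetical helper, not part of Rosetta, and it assumes that only options starting with `-in:file:` and `-psipred_ss2` take a file path.

```shell
# check_flags FILE: report input paths named in a flags file that do not exist.
# Returns success only when every checked path is present.
check_flags() {
    local flags_file="$1" missing=0 opt value
    # Read "option value" pairs; the trailing test keeps a last line without a newline.
    while read -r opt value _ || [ -n "$opt" ]; do
        case "$opt" in
            -in:file:*|-psipred_ss2)
                if [ ! -e "$value" ]; then
                    echo "missing input: $value (from $opt)" >&2
                    missing=$((missing + 1))
                fi
                ;;
        esac
    done < "$flags_file"
    [ "$missing" -eq 0 ]
}

# Demonstration in a scratch directory with one present and one absent input.
workdir=$(mktemp -d)
mkdir "$workdir/frags"
touch "$workdir/frags/t001_.fasta"
printf '%s\n' \
    '-in:file:fasta frags/t001_.fasta' \
    '-in:file:frag3 frags/t001_.200.3mers' \
    '-out:nstruct 4' > "$workdir/flags"
(cd "$workdir" && check_flags flags) || echo "fix the flags file before submitting"
```

Paths are resolved relative to the current directory, so the check should be run from the directory the job will be submitted from.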

Create a batch input file, e.g. 'rosettaRunMPI.sh':

#!/bin/bash
module load rosetta/2016.37.mpi
mpirun -np $SLURM_NTASKS AbinitioRelax @flags

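Submit to the cluster using --ntasks=N, where N is at most the number of decoys plus one. The number of decoys is the value of -out:nstruct in the flags file, so -out:nstruct 4 allows up to --ntasks=5.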

[biowulf]$ ls
flags         rosettaRunMPI.sh
[biowulf]$ sbatch --ntasks=5 rosettaRunMPI.sh

If --ntasks is equal to decoys + 1, then exactly that number of silent files will be written. Otherwise, the silent files will contain multiple decoys.
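
The relationship between -out:nstruct and --ntasks can be worked out from the flags file itself. The following is a rough sketch, assuming the flags file holds a single `-out:nstruct` line; `flags.example` is a stand-in file created here so the snippet is self-contained.

```shell
# Stand-in flags file containing only the line this sketch reads.
printf '%s\n' '-out:nstruct 4' > flags.example

# Pull the decoy count from the -out:nstruct line.
nstruct=$(awk '$1 == "-out:nstruct" {print $2}' flags.example)

# Largest --ntasks value allowed by the rule above: number of decoys + 1.
ntasks=$((nstruct + 1))

# Print the submission command rather than running it.
echo "sbatch --ntasks=${ntasks} rosettaRunMPI.sh"
```

With four decoys this prints the same submission command as the example above; any smaller --ntasks value also satisfies the rule.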

NOTE: The number of decoys MUST be equal to or greater than the number of processors, as set by the -np option to mpirun. Otherwise the program will exit with errors such as:

mpirun noticed that process rank 6 with PID 0 on node cn0002 exited on signal 11 (Segmentation fault).

File Manipulation

cat_silent.pl: concatenate silentfiles
changeChain: change the chain id of a PDB
createLoop.pl: create a dummy structure from a sequence of amino acids
createTemplate.pl: create a homology model template from a FASTA file and a homologous structure


Evaluating Rosetta output

VMD: X-Windows molecular graphics viewer
getColumn.pl: display silentfile and scorefile columns
gnuplot: graphically display data
cluster_plot.pl: generate a gnuplot input file to plot the score versus another field
histogram.pl: generate a quick histogram from STDIN data


Cluster decoys and models

cluster_pdbs.pl: cluster a set of PDBs
cluster_variation.pl: find per-residue variation within a cluster