Biowulf High Performance Computing at the NIH
randfold on Biowulf


Randfold estimates, for a given RNA sequence, the probability that the minimum free energy (MFE) of its secondary structure differs from the distribution of MFE values computed for random sequences obtained by shuffling the input sequence.

Randfold is not a parallel program. Small numbers of randfold jobs, or interactive randfold runs, can be run on Helix or on Biowulf interactive nodes. If you have many randfold jobs to run, the swarm utility is recommended.


Running randfold on Helix

Example: running randfold on a single miRNA sequence

helix$ module load randfold
helix$ randfold
FATAL: Usage: randfold <method> <file name> <number of randomizations>

-s simple mononucleotide shuffling
-d dinucleotide shuffling
-m markov chain 1 shuffling

Example: randfold -d let7.tfa 999

helix$ cat > cel-let7.fa <<EOF
>cel-let-7 Caenorhabditis elegans let-7 precursor RNA
helix$ randfold -d cel-let7.fa 999
cel-let-7       -42.90  0.001000
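
The output line above appears to hold three whitespace-separated columns: sequence name, MFE, and p-value. Under that assumption, the following minimal sketch keeps only sequences at or below a chosen p-value cutoff. The file names results.randfold and significant.txt, the cutoff, and the second data row are hypothetical and only serve as illustration.

```shell
# Hypothetical results file in the three-column layout shown above:
# sequence name, MFE, p-value. The second row is made up for illustration.
printf 'cel-let-7\t-42.90\t0.001000\nexample-seq\t-10.20\t0.350000\n' > results.randfold

# Keep only rows whose p-value (column 3) is at or below the cutoff.
cutoff=0.01
awk -v cutoff="$cutoff" '$3 <= cutoff' results.randfold > significant.txt

cat significant.txt
```

With 999 randomizations, 0.001 is the p-value reported in the example above, so a cutoff much below that would need more randomizations.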

Running a single randfold batch job on Biowulf2

This example runs randfold on a set of mouse miRNA sequences in series; randfold processes one sequence at a time.

First, let's create an example input data set containing all putative mouse miRNA hairpins from miRBase:

biowulf2$ cd /data/$USER/test_data/randfold
biowulf2$ wget ""
biowulf2$ gunzip hairpin.fa.gz
biowulf2$ module load emboss
biowulf2$ seqret -outseq mouse_hairpins.fa 'hairpin.fa:mmu-*'

Then set up a batch file

#! /bin/bash
#SBATCH --job-name=randfold
set -e

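module load randfold
# Input is the FASTA file created above; the output file name is an assumption.
inf=/data/$USER/test_data/randfold/mouse_hairpins.fa
outf=$inf.randfold
randfold -d "$inf" 99 > "$outf"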

The batch script is submitted for processing with sbatch.

Running a swarm of randfold batch jobs on Biowulf2

Again, first create a set of input files. This time, write one hairpin per file so that swarm can parallelize over all hairpins:

biowulf2$ cd /data/$USER/test_data/randfold
biowulf2$ wget ""
biowulf2$ gunzip hairpin.fa.gz
biowulf2$ module load emboss
biowulf2$ mkdir mouse_hairpins
biowulf2$ seqret -osdirectory2 mouse_hairpins -ossingle2 -auto 'hairpin.fa:mmu-*'

Then set up a swarm file

randfold -d mouse_hairpins/mmu-let-7a-1.fasta 999 > mouse_hairpins/mmu-let-7a-1.fasta.randfold
randfold -d mouse_hairpins/mmu-let-7a-2.fasta 999 > mouse_hairpins/mmu-let-7a-2.fasta.randfold
randfold -d mouse_hairpins/mmu-let-7b.fasta 999 > mouse_hairpins/mmu-let-7b.fasta.randfold
randfold -d mouse_hairpins/mmu-let-7c-1.fasta 999 > mouse_hairpins/mmu-let-7c-1.fasta.randfold
randfold -d mouse_hairpins/mmu-let-7c-2.fasta 999 > mouse_hairpins/mmu-let-7c-2.fasta.randfold
randfold -d mouse_hairpins/mmu-let-7d.fasta 999 > mouse_hairpins/mmu-let-7d.fasta.randfold
randfold -d mouse_hairpins/mmu-let-7e.fasta 999 > mouse_hairpins/mmu-let-7e.fasta.randfold
randfold -d mouse_hairpins/mmu-let-7f-1.fasta 999 > mouse_hairpins/mmu-let-7f-1.fasta.randfold
randfold -d mouse_hairpins/mmu-let-7f-2.fasta 999 > mouse_hairpins/mmu-let-7f-2.fasta.randfold
randfold -d mouse_hairpins/mmu-let-7g.fasta 999 > mouse_hairpins/mmu-let-7g.fasta.randfold
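
Writing one line per hairpin by hand does not scale to a full directory. As a minimal sketch, a shell loop can generate lines in the same format as those listed above. The two empty stand-in files are created here only so the example runs anywhere; on Biowulf the mouse_hairpins directory would already contain the seqret output.

```shell
# Stand-in input files so the sketch is self-contained (assumption:
# the real directory holds one .fasta file per hairpin).
mkdir -p mouse_hairpins
touch mouse_hairpins/mmu-let-7a-1.fasta mouse_hairpins/mmu-let-7b.fasta

# Write one randfold command per FASTA file into the swarm file.
for f in mouse_hairpins/*.fasta; do
    echo "randfold -d $f 999 > $f.randfold"
done > swarmfile

cat swarmfile
```

Each generated line is an independent command, which is the unit swarm distributes across nodes.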

And run it with swarm's default settings

biowulf2$ swarm -f swarmfile --module randfold
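
Once the swarm has finished, each hairpin has its own .randfold file. The sketch below merges them into one table sorted by p-value, assuming each file holds a single whitespace-separated line of name, MFE, and p-value. The stand-in files, their values, and the output name mouse_hairpins_all.randfold are hypothetical.

```shell
# Stand-in per-hairpin results so the sketch runs anywhere; the MFE and
# p-values below are made up, not real randfold output.
mkdir -p mouse_hairpins
printf 'mmu-example-1\t-30.10\t0.020000\n' > mouse_hairpins/mmu-example-1.fasta.randfold
printf 'mmu-example-2\t-45.60\t0.001000\n' > mouse_hairpins/mmu-example-2.fasta.randfold

# Merge all per-hairpin results and sort numerically by p-value
# (column 3), smallest first.
cat mouse_hairpins/*.fasta.randfold | LC_ALL=C sort -k3,3n > mouse_hairpins_all.randfold

cat mouse_hairpins_all.randfold
```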

Running an interactive job on Biowulf2

After starting an interactive session on a compute node with sinteractive, randfold is used as described above. For example:

biowulf2$ sinteractive
salloc.exe: Granted job allocation nnnnnn
srun: error: x11: no local DISPLAY defined, skipping
cn0147$ module load randfold
cn0147$ randfold -d cel-let7.fa 999
cel-let-7       -42.90  0.001000
cn0147$ exit