High-Performance Computing at the NIH
NAMD on Biowulf & Helix

NAMD is a parallel molecular dynamics program for UNIX platforms designed for high-performance simulations in structural biology. It is developed by the Theoretical Biophysics Group at the Beckman Center, University of Illinois.

NAMD was developed to be compatible with existing molecular dynamics packages, especially the packages X-PLOR and CHARMM, so it will accept X-PLOR and CHARMM input files. The output files produced by NAMD are also compatible with X-PLOR and CHARMM.

NAMD is closely integrated with VMD for visualization and analysis.

Important: Please read the webpage Making efficient use of Biowulf's Multinode Partition before running large parallel jobs.

On Helix

NAMD is a compute-intensive program that is not suitable for Helix.

Batch job on Biowulf

The following example uses the ApoA1 benchmark example from the NAMD site. It is available on Biowulf in
/usr/local/apps/NAMD/apoa1.tar.gz.
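A minimal sketch for staging the benchmark into your data directory (the directory name namd_test is just an example):

cd /data/$USER
mkdir -p namd_test && cd namd_test
tar xzf /usr/local/apps/NAMD/apoa1.tar.gz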

Specifying a homogeneous set of nodes

The 'multinode' partition, to which all jobs requiring more than a single node must be submitted, is heterogeneous. For efficient parallel jobs, you need to ensure that you request nodes of a single CPU type. For example, at the time of writing, the 'freen' command displays:

biowulf% freen
...
multinode   65/466       3640/26096        28    56    248g   400g   cpu56,core28,g256,ssd400,x2695,ibfdr
multinode   4/190        128/6080          16    32     60g   800g   cpu32,core16,g64,ssd800,x2650,ibfdr
multinode   312/539      17646/30184       28    56    250g   800g   cpu56,core28,g256,ssd800,x2680,ibfdr
...
These lines indicate that there are 3 kinds of nodes in the multinode partition. You should submit your job exclusively to one kind of node by specifying --constraint=x2695, --constraint=x2650, or --constraint=x2680, as in the examples below.
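To check availability of just one node type before submitting, you can filter the freen output (a simple sketch; x2680 is only an example):

biowulf% freen | grep x2680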

Sample batch script for the ibverbs version:

#!/bin/bash

cd /data/$USER/mydir

module load NAMD/2.12-ibverbs
make-namd-nodelist
charmrun ++nodelist ~/namd.$SLURM_JOBID ++p $SLURM_NTASKS `which namd2` +setcpuaffinity  input.namd

# delete the NAMD-specific node list
rm ~/namd.$SLURM_JOBID
Note: The NAMD +setcpuaffinity flag should be used for the ibverbs version for performance improvement. This flag should not be used when running the OpenMPI/Intel compiled version, since OpenMPI enforces its own cpu affinity.

Sample batch script for the OpenMPI 2.0/Intel-compiler version compiled on Biowulf:
Note: in our benchmarks, this version was slightly slower than the ibverbs version, so most users will want to use the ibverbs version.

#!/bin/bash
#
cd /data/$USER/mydir
module load NAMD/2.12-openmpi

mpirun -np $SLURM_NTASKS `which namd2`    input.namd

Submit this job with:

sbatch --partition=multinode --constraint=x2650 --ntasks=# --ntasks-per-core=1 --time=168:00:00 --exclusive jobscript

where:

--partition=multinode
Submit to the multinode partition, whose nodes are connected by an FDR Infiniband (IB) network. Highly parallel programs like NAMD with lots of interprocess communication should be run on the IB network.
--constraint=x2650
Request only x2650 nodes in the multinode partition. A heterogeneous set of nodes is likely to lower performance.
--ntasks=#
Specifies the number of NAMD processes to run. This is passed into the script via the $SLURM_NTASKS variable.
The number of tasks should be (number of nodes) * (number of physical cores per node).
e.g. for the x2695 nodes, ntasks should be (number of nodes) * 28 (see the worked example after this list).
--ntasks-per-core=1
Specifies that each NAMD process should run on a physical core. Hyperthreading is ignored. This parameter is highly recommended for parallel jobs.
--time=168:00:00
Specifies a walltime limit of 168 hrs = 1 week. See the section on chaining jobs below.
--exclusive
Specifies that all nodes allocated to this job should be allocated exclusively. This is recommended for multi-node parallel jobs.
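As a worked example of the ntasks arithmetic above, a hypothetical 4-node job on x2695 nodes (28 physical cores each) would use --ntasks = 4 * 28 = 112:

sbatch --partition=multinode --constraint=x2695 --ntasks=112 --ntasks-per-core=1 --time=168:00:00 --exclusive jobscript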

Due to a technical complication, 'jobload' may report incorrect results for a NAMD parallel job. Here is a typical NAMD ibverbs run for which jobload shows a load of 0:

           JOBID            TIME            NODES  CPUS  THREADS   LOAD       MEMORY
                     Elapsed / Wall               Alloc   Active           Used /     Alloc
        32072214    00:03:29 /    08:00:00 cn1517    56        0     0%     0.0 /   56.0 GB
                    00:03:29 /    08:00:00 cn1518     0        0     0%     0.0 /    0.0 GB
                 Nodes:    2    CPUs:  112  Load Avg:   0%
However, 'rsh cn1517 ps -C namd2' will show that there are 28 namd2 processes on each node. The NAMD output file will also report details such as:
Charm++> Running on 2 unique compute nodes (56-way SMP).
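To confirm the process count yourself, a sketch like the following, run from the login node, counts namd2 processes on each node allocated to the job (standard Slurm squeue/scontrol calls; the job ID is taken from the example above, so substitute your own):

for node in $(scontrol show hostnames $(squeue -h -j 32072214 -o %N)); do
    echo -n "$node: "
    rsh $node ps -C namd2 --no-headers | wc -l
done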

On GPUs

Single-node performance on a CPU+GPU node is better than on a standard CPU-only node. Based on our benchmarks, NAMD jobs do not scale to more than 1 GPU node, and smaller molecular systems may only scale efficiently to 1 or 2 GPU devices on a GPU node. The performance of an individual job thus depends on the size of the system and may depend on other simulation factors, so it is vital to run your own benchmarks to determine how many GPU devices and nodes to request.
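One quick way to compare configurations when benchmarking is to pull the timing lines that NAMD prints early in each run (a sketch; the log file name depends on how you redirect output):

grep "Benchmark time" slurm-*.out    # reports s/step and days/ns for each benchmark cycle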

Single-node GPU job:
To run a single-node GPU job on a single K20x or K80 node, create a batch script along the following lines:

#!/bin/bash

cd /data/$USER/mydir

module load  NAMD/2.12-gpu

charmrun ++local  ++p $SLURM_NTASKS ++ppn $SLURM_NTASKS `which namd2` +setcpuaffinity  +idlepoll   input.namd 

To submit to both GPUs and all 16 physical cores on a k20x node:

sbatch --partition=gpu --gres=gpu:k20x:2 --ntasks=16 --ntasks-per-core=1 --exclusive jobscript

To submit to 2 GPU devices and half the CPUs on a K80 node:

sbatch --partition=gpu --gres=gpu:k80:2 --ntasks=14 --ntasks-per-core=1 --exclusive jobscript

To submit to all 4 GPU devices and all the CPUs on a K80 node:

sbatch --partition=gpu --gres=gpu:k80:4 --ntasks=28 --ntasks-per-core=1 --exclusive jobscript

As per the NAMD GPU documentation, multiple NAMD threads can utilize the same set of GPUs, and the tasks are equally distributed among the allocated GPUs on a node.
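If you want to restrict a run to particular GPU devices rather than all devices allocated to the job, the NAMD CUDA build accepts a +devices flag listing device indices; a sketch, assuming devices 0 and 1 are the ones visible to your job:

charmrun ++local ++p $SLURM_NTASKS ++ppn $SLURM_NTASKS `which namd2` +setcpuaffinity +idlepoll +devices 0,1 input.namd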

Multi-node GPU job

While it is possible to run a multinode GPU NAMD job, please be sure that your NAMD job scales to more than 1 GPU node before submitting multinode GPU jobs. (See our benchmarks for details). To submit a multinode job, you could use a script like the following:

#!/bin/bash

cd /data/$USER/mydir

module load  NAMD/2.12-gpu

# on a K20x node
charmrun ++local ++p $SLURM_NTASKS ++ppn 16 `which namd2` +setcpuaffinity  +idlepoll   input.namd 

# on a K80 node
charmrun ++local  ++p $SLURM_NTASKS ++ppn 28 `which namd2` +setcpuaffinity  +idlepoll   input.namd 
To submit to 2 k20x nodes:
sbatch --partition=gpu --gres=gpu:k20x:2 --ntasks=32 --ntasks-per-core=1 --nodes=2 --exclusive jobscript
To submit to 2 K80 nodes:
sbatch --partition=gpu --gres=gpu:k80:4 --ntasks=56 --ntasks-per-core=1 --nodes=2 --exclusive jobscript

Monitoring GPU jobs

To monitor your GPU jobs, use 'jobload' to see the CPU utilization (should be ~ 50%), and 'rsh nodename nvidia-smi' to see the GPU utilization. In the example below, a NAMD job is submitted to 4 GPUs (2 nodes) and 32 cores (all cores on the 2 nodes).

[biowulf]$  sbatch --partition=gpu --gres=gpu:k20x:2 --ntasks=32 --ntasks-per-core=1 --nodes=2 run.gpu
129566
Jobload shows that the job is utilizing all cores:
[biowulf]$  jobload -u susanc
     JOBID      RUNTIME     NODES   CPUS    AVG CPU%            MEMORY
                                                              Used/Alloc
    129566     00:00:26    cn0603     32       50.00    836.9 MB/62.5 GB
               00:00:26    cn0604     32       50.06    644.6 MB/62.5 GB
The 'nvidia-smi' command shows that there are 8 NAMD processes running on each GPU. Each NAMD MPI process runs one process on the GPU, so this is expected. The 'GPU-Util' value will bounce around, so it is not very meaningful.
[biowulf]$  rsh cn3084 nvidia-smi
Sun Feb 26 15:19:07 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.48                 Driver Version: 367.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           On   | 0000:83:00.0     Off |                  Off |
| N/A   48C    P0    58W / 149W |     91MiB / 12205MiB |     15%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           On   | 0000:84:00.0     Off |                  Off |
| N/A   35C    P0    75W / 149W |     91MiB / 12205MiB |     15%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K80           On   | 0000:8A:00.0     Off |                  Off |
| N/A   51C    P0    62W / 149W |     90MiB / 12205MiB |     12%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K80           On   | 0000:8B:00.0     Off |                  Off |
| N/A   39C    P0    76W / 149W |     91MiB / 12205MiB |     14%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     40816    C   ..._2.12_Linux-x86_64-ibverbs-smp-CUDA/namd2    87MiB |
|    1     40816    C   ..._2.12_Linux-x86_64-ibverbs-smp-CUDA/namd2    87MiB |
|    2     40816    C   ..._2.12_Linux-x86_64-ibverbs-smp-CUDA/namd2    86MiB |
|    3     40816    C   ..._2.12_Linux-x86_64-ibverbs-smp-CUDA/namd2    87MiB |
+-----------------------------------------------------------------------------+
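For utilization over time rather than a single snapshot, nvidia-smi can also stream periodic samples (a sketch; the node name is taken from the example above):

[biowulf]$ rsh cn3084 nvidia-smi dmon -c 10    # ten one-second samples of per-GPU utilization and memory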
Replica Exchange

Sample Replica Exchange job script:

#!/bin/bash

cd /data/$USER/mydir
module load NAMD/2.12-openmpi

mkdir output
(cd output; mkdir 0 1 2 3 4 5 6 7)
mpirun namd2 +replicas 8 job0.conf +stdout output/%d/job0.%d.log
The number of MPI ranks must be a multiple of the number of replicas. Thus, for the 8 replicas above, you could submit with:
sbatch --partition=multinode --ntasks=24 --ntasks-per-core=1 --nodes=1 --exclusive   jobscript
using 24 of the 28 physical cores on a single node.
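Similarly, a sketch of a two-node submission for the same 8 replicas could use 56 tasks (56 is a multiple of 8), e.g. on x2680 nodes with 28 physical cores each:

sbatch --partition=multinode --constraint=x2680 --ntasks=56 --ntasks-per-core=1 --nodes=2 --exclusive jobscript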

Walltimes and chaining jobs

There are walltime limits on most Biowulf partitions. Use 'batchlim' to see the current walltime limits.

An example namd config file for running a second simulation starting from the last timestep and the restart files of a previous simulation is available at http://www.ks.uiuc.edu/~timisgro/sample.conf.

If restarting a NAMD REMD job, be sure to comment out the 'bincoordinates' and 'extendedsystem' parameters in your NAMD configuration file, if applicable.
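A hedged sketch for commenting these out with sed on a copy of the configuration file (the file name job0.conf is an assumption; check the exact capitalization of the keywords in your own file):

sed -e 's/^bincoordinates/#&/' -e 's/^extendedSystem/#&/' job0.conf > job0.restart.conf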

After an initial run has produced a set of restart files, you would submit future runs using a batch script along these lines:

#!/bin/bash

module load NAMD/2.10

# Create host file (required)
make-namd-nodelist
mpirun -n $SLURM_NTASKS  `which namd2` myjob.restart.namd > out.log
rm -f ~/namd.$SLURM_JOBID

# this script resubmits itself to the batch queue. 
# The NAMD config file is set up to start the simulation from the last timestep 
#   in the previous simulation
sbatch --partition=multinode --constraint=x2650 --ntasks=$SLURM_NTASKS --ntasks-per-core=1 --time=168:00:00 --exclusive this_job_script
Submit this script, as usual, with a command like:
sbatch --partition=multinode --constraint=x2650 --ntasks=64 --ntasks-per-core=1 --time=168:00:00 --exclusive this_job_script
The NAMD 2.10 replica.namd file is at /usr/local/apps/NAMD/NAMD_2.10_Linux-x86_64-ibverbs/lib/replica/replica.namd.
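If your REMD setup references replica.namd, one option is to copy it into your working directory alongside your configuration files (a sketch, assuming the directory used in the scripts above):

cp /usr/local/apps/NAMD/NAMD_2.10_Linux-x86_64-ibverbs/lib/replica/replica.namd /data/$USER/mydir/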

Chaining Replica Exchange jobs

(Thanks to Christopher Siwy (CC) for this information).

The most important points here are:
- Ensure you do not delete your replica output folders when you run the restart (deleting them is usually only done when starting a brand-new REMD simulation)
- In your job0.conf file (or whatever you name it), include the following two lines after the line referencing the NAMD configuration:

source [format $output_root.job0.restart20.tcl ""]
set num_runs 10000
 
The restart file number ('restart20') and the total number of runs shown above will vary with your simulation. Note that the number of runs is the TOTAL number of runs for the simulation, not the number of runs remaining from that point forward. So in the example above, the restart will begin at the 20th run and continue until it reaches the 10,000th run.

This thread in the NAMD mailing list may help in debugging problems.

Benchmarks

NAMD benchmark results are available on the Benchmarks page.

Documentation

NAMD 2.10 user guide at the UIUC website.

Theoretical and Computational Biophysics group at UIUC, the NAMD/VMD developers.