Biowulf High Performance Computing at the NIH

Additional information for NIMH users of the NIH Biowulf Cluster

NIMH has funded 64 nodes (1024 physical cores, 2048 hyperthreaded cores) in the Biowulf cluster, and NIMH users with Biowulf accounts have priority access to these nodes. This priority status will last until September 3, 2018.

Also funded is a 60-core (120 hyperthreaded) interactive system Felix (not to be confused with Helix!), named after Robert H. Felix, the first Director of NIMH. This system is available to NIMH users with Helix accounts.


Compute nodes:

  • 2 x Intel Xeon E5-2650 v2, 8 cores each @ 2.60 GHz, hyperthreading enabled
  • 128 GB memory
  • 815 GB SSD (solid-state) disk
  • 1 Gb/s ethernet (will be upgraded to 10Gb/s when integrated into Biowulf2 Cluster, 2015)

Felix:

  • 4 x Intel Xeon E7-4880 v2, 15 cores each @ 2.50 GHz, hyperthreading enabled
  • 1 TB memory
  • 690 GB SSD (solid-state) disk
  • 1 Gb/s ethernet (will be upgraded to 10Gb/s when integrated into Biowulf2 Cluster, 2015)
  • 1 NVIDIA Quadro K6000, 12 GB GDDR5 (384-bit), 2880 CUDA Cores

    Hyperthreading

    Hyperthreading is a hardware feature of the Xeon processor that allows each physical core to run two simultaneous threads of execution, so the number of cores appears to double. Thus the 16-core NIMH nodes will appear to have 32 cores. In many cases this will increase the performance of applications that can multi-thread or otherwise take advantage of multiple cores. However, before running 32 threads of execution on a single node, the Biowulf staff recommends that you benchmark your application to determine whether it can take advantage of hyperthreading (or even whether it scales to 16 cores).
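One way to run such a benchmark is to time the same command at several thread counts. The sketch below is illustrative only: it assumes the application takes its thread count from the OMP_NUM_THREADS environment variable (many, but not all, multi-threaded programs do), and the function name bench_threads is made up for this example.

```shell
#!/bin/bash
# Time a command at several thread counts to see where scaling stops.
# Assumption: the application reads OMP_NUM_THREADS. If it uses its own
# thread-count option instead, pass that option in the command line.
# Example call (my_program is a hypothetical placeholder):
#   bench_threads "1 2 4 8 16 32" ./my_program input.dat

bench_threads() {
    local counts=$1   # space-separated list of thread counts
    shift             # the remaining arguments are the command to time
    local n start end
    for n in $counts; do
        start=$(date +%s.%N)
        OMP_NUM_THREADS=$n "$@" > /dev/null
        end=$(date +%s.%N)
        # Print the elapsed wall-clock time for this thread count.
        awk -v n="$n" -v s="$start" -v e="$end" \
            'BEGIN { printf "%s threads: %.2f s\n", n, e - s }'
    done
}
```

If the 32-thread run is no faster than the 16-thread run, the application is probably not benefiting from hyperthreading on these nodes.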

    Submitting jobs to the NIMH partition

    Jobs are submitted to the NIMH nodes by specifying the "nimh" partition. In the simplest case,

    sbatch --partition=nimh your_batch_script

    To submit a job requiring 128 GB of local scratch on a solid-state disk:

    sbatch --partition=nimh --constraint=ssd800 --gres=lscratch:128 your_batch_script
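The same options can also be placed inside the batch script as #SBATCH directives, so the job can be submitted with a plain "sbatch your_batch_script". The following is a minimal sketch of such a script, not a site-provided template. It assumes that the allocated local scratch appears as /lscratch/$SLURM_JOB_ID; check the Biowulf documentation for the actual path.

```shell
#!/bin/bash
#SBATCH --partition=nimh
#SBATCH --constraint=ssd800
#SBATCH --gres=lscratch:128

# Assumption: allocated local scratch is at /lscratch/$SLURM_JOB_ID.
# When run outside Slurm (for a dry run), use a temporary directory instead.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    workdir=/lscratch/$SLURM_JOB_ID
else
    workdir=$(mktemp -d)
fi

cd "$workdir" || exit 1
echo "working in $workdir"

# Copy inputs into $workdir, run the application, and copy results
# back to permanent storage before the job ends.
```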

    Allocating an interactive node:

    sinteractive --constraint=nimh

    To submit a set of swarm jobs,

    swarm -f command_file --partition nimh
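A swarm command file is, as far as we understand it, a plain text file with one independent command per line. The sketch below generates such a file from a list of inputs. The program name process_subject, its options, and the subject IDs are hypothetical placeholders.

```shell
# Write one command per line into a swarm command file.
# "process_subject" and its options are hypothetical; substitute your own command.
make_command_file() {
    local outfile=$1
    shift
    local subj
    : > "$outfile"    # create or truncate the command file
    for subj in "$@"; do
        echo "process_subject --input ${subj}.nii --output ${subj}_out" >> "$outfile"
    done
}

# Example:
#   make_command_file command_file s01 s02 s03
#   swarm -f command_file --partition nimh
```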

    Note that jobs submitted to the NIMH queue will not run on non-NIMH nodes. If there are no NIMH nodes available, the job will remain queued until NIMH nodes become free.

    Core Limits

    The current per-user core limit on the NIMH queue can be seen via the 'batchlim' command.

    biowulf% batchlim
    Partition        MaxCPUsPerUser     DefWalltime     MaxWalltime
    nimh                  1024          UNLIMITED         UNLIMITED
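For scripts that need the limit as a number, the column can be extracted with awk. This sketch assumes the output keeps the column layout shown above; the function name partition_cpu_limit is made up for this example.

```shell
# Print the MaxCPUsPerUser value for one partition.
# Reads batchlim-style text on stdin; assumes the column layout shown above.
partition_cpu_limit() {
    awk -v part="$1" '$1 == part { print $2 }'
}

# Example:
#   batchlim | partition_cpu_limit nimh
```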

    Node Availability

    While approved NIMH users have priority access to the NIMH nodes, other Biowulf users can also use them through the "quick" queue. Nodes not in use by NIMH users may be allocated to quick queue jobs for up to 2 hours. That is, no NIMH job will wait more than 2 hours for nodes allocated to quick queue jobs.

    To see how many nodes of each type are available use the freen command; there is now a separate section which reports the number of available NIMH nodes:

    biowulf% freen
                                               ........Per-Node Resources........
    Partition    FreeNds       FreeCPUs       Cores CPUs   Mem   Disk    Features
    nimh        62/64        2014/2048         16    32    125g   800g   cpu32,core16,g128,ssd800,x2650,10g,nimh
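The FreeNds column can be split into free and total node counts for use in a script. This is a sketch that assumes the output format shown above; the function name free_nodes is made up for this example.

```shell
# Print "<free> <total>" node counts for one partition.
# Reads freen-style text on stdin; assumes FreeNds is the second column
# and has the form free/total, as in the sample output above.
free_nodes() {
    awk -v part="$1" '$1 == part { split($2, n, "/"); print n[1], n[2] }'
}

# Example:
#   freen | free_nodes nimh
```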

    Using Felix

    This system should be used for interactive and relatively lightweight computations.

    To access Felix, ssh to it using your NIH credentials from anywhere on the NIH VPN. You can also connect to it via NX. To set up the connection:

    Protocol: NX
    Authentication: Use System login
    Proxy: none
    Create a link: up to you

    Additional notes about Felix:

  • It's running CentOS 6 (as opposed to CentOS 5 on the compute nodes).
  • /scratch is the global shared scratch, as it is on helix and biowulf.
  • /lscratch (local scratch) is the local SSD (690 GB usable). Since this is a shared system, the clearscratch command will have no effect.
  • Files in /lscratch that have not been modified for 2 weeks will be automatically deleted.
  • To reduce "wear" on the SSD, access time attributes have been disabled for the local filesystems (including /lscratch).
  • You may submit Biowulf Cluster jobs from Felix, with the exception of Interactive sessions.
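Because files in /lscratch are deleted after 2 weeks without modification, it can be useful to list your own files that are approaching that age. The sketch below uses find with -mtime (modification time, which is what the cleanup policy above refers to). The function name list_stale_files and the 10-day threshold are choices made for this example.

```shell
# List your regular files under a directory that were last modified
# more than N full days ago. With find, "-mtime +N" matches files whose
# age in whole 24-hour periods is greater than N.
list_stale_files() {
    local dir=$1 days=$2
    find "$dir" -type f -user "$(id -un)" -mtime +"$days" 2>/dev/null
}

# Example: files at risk of automatic deletion within a few days
#   list_stale_files /lscratch 10
```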

    Advanced graphics visualization on Felix

    Felix has graphics hardware for users who need increased visualization capabilities. Applications that previously ran sluggishly or not at all through software OpenGL (which uses your own computer's graphics hardware) can now take advantage of server-side hardware acceleration on an NVIDIA Quadro K6000. For example, MATLAB rendering will work more smoothly using this advanced graphics visualization.

    NoMachine is the easiest method for using server-side hardware accelerated graphics on Felix. You may also use TurboVNC. TurboVNC is a bit harder to set up, but if you have a fast network connection (greater than or equal to 1Gbps) you may find that it provides faster graphics performance than NoMachine. (You can learn about the way that graphics are displayed from a remote Linux server like Felix here. Both NoMachine and TurboVNC utilize VirtualGL, but NoMachine uses additional compression to reduce bandwidth on slow networks.)

    OpenGL applications must be started with the vglrun wrapper script to take advantage of server-side hardware acceleration. For instance, to run MATLAB with server-side hardware support for OpenGL rendering, you would start it like this:

    [user@felix ~]$ module load matlab
    [+] Loading Matlab 2016b on
    [+] Loading Zlib 1.2.8 ...
    [user@felix ~]$ vglrun matlab -nosoftwareopengl

    You can test to make sure that your MATLAB session recognizes the graphics hardware with the opengl info command like so:

    >> opengl info
                          Version: '4.5.0 NVIDIA 367.44'
                           Vendor: 'NVIDIA Corporation'
                         Renderer: 'Quadro K6000/PCIe/SSE2'
                   MaxTextureSize: 16384
                           Visual: 'Visual 0xaf, (RGBA 32 bits (8 8 8 8), Z depth 16 bits, Hardware acceleration, [...]
                         Software: 'false'
             HardwareSupportLevel: 'full'
        SupportsGraphicsSmoothing: 1
    SupportsDepthPeelTransparency: 1
       SupportsAlignVertexCenters: 1
                       Extensions: {330x1 cell}
               MaxFrameBufferSize: 16384

    If this command returns information about 'Mesa X11' instead, or if the Renderer field is empty, something is wrong and the MATLAB session has reverted to software OpenGL instead of using the hardware in Felix.
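The same check can be scripted outside MATLAB by classifying an OpenGL renderer string. The sketch below is an illustration, not a supported tool: the function name renderer_kind is made up, and the example call assumes the glxinfo utility is installed on Felix, which this page does not confirm.

```shell
# Classify an OpenGL renderer string as hardware or software rendering.
# An empty string, or one naming Mesa, llvmpipe, or a software rasterizer,
# is treated as software rendering.
renderer_kind() {
    local renderer=$1
    case "$renderer" in
        "" | *Mesa* | *llvmpipe* | *[Ss]oftware*) echo software ;;
        *) echo hardware ;;
    esac
}

# Example (assumes glxinfo is available):
#   renderer_kind "$(vglrun glxinfo | sed -n 's/^OpenGL renderer string: //p')"
```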

    To get an idea of how much hardware acceleration can speed up rendering, consider the examples below. These figures were generated in two MATLAB sessions running at the same time. The MATLAB code renders a topographical map of the earth, texture-mapped onto a sphere with lighting, then rotates the camera angle and renders again as fast as possible. The top figure did not use hardware acceleration, while the bottom figure did.

    [Utilization graphs: NIMH Partition and Felix Utilization, each for the last 24 hours, the last month, and the last year]

    Please send questions and comments to