NIH HPC Systems

The High Performance Computing (HPC) group at the National Institutes of Health provides computational resources and support for the NIH intramural research community.

The Biowulf cluster

The Biowulf cluster is a 95,000+ core/30+ PB Linux cluster. Biowulf is designed for the large numbers of simultaneous jobs common in the biosciences, as well as for large-scale distributed-memory tasks such as molecular dynamics. A wide variety of scientific software is installed and maintained on Biowulf, along with scientific databases. See our hardware page for more details. All scientific computation should be run on the cluster compute nodes, either as batch jobs or in sinteractive sessions. Compute nodes can access HTTP and FTP sites outside our network via a proxy, so some data transfer jobs can also be run on the cluster.

The login node

The login node (biowulf.nih.gov) is used to submit jobs to the cluster. Users connect to this system via ssh or NX. Do not run compute-intensive, data transfer, or large file manipulation processes on the login node; this system is for submitting jobs only.

Helix

Helix (helix.nih.gov) is the interactive data transfer and file management node for the NIH HPC Systems. Users should run all such processes (scp, sftp, Aspera transfers, rsync, wget/curl, large file compressions, etc.) on this system. Scientific applications are not available on Helix. Helix is a 48-core system (4 x 3.00 GHz 12-core Xeon™ Gold 6136) with 1.5 TB of main memory. It runs Red Hat Enterprise Linux 7 and has a direct connection to the internet.

Each user on Helix is limited to 6 CPUs and 32 GB of memory. File transfer and compression processes should therefore use no more than 6 threads in total, both to run efficiently and to avoid overloading the network and file systems. For example, an SRA download uses 6 threads by default, so a user should run only one such download at a time.
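The thread budget above comes down to simple integer arithmetic. The following is a minimal sketch, not an official tool: the function name max_concurrent and the constant HELIX_CPU_LIMIT are hypothetical, and only the 6-CPU limit and the 6-thread SRA default are taken from the text.

```python
# Per-user CPU limit on Helix, as stated in the text above.
HELIX_CPU_LIMIT = 6


def max_concurrent(threads_per_process: int, cpu_limit: int = HELIX_CPU_LIMIT) -> int:
    """Return how many identical processes fit within the CPU limit.

    Hypothetical helper: each process is assumed to keep
    `threads_per_process` threads busy for its whole run.
    """
    if threads_per_process < 1:
        raise ValueError("threads_per_process must be at least 1")
    return cpu_limit // threads_per_process


if __name__ == "__main__":
    # An SRA download uses 6 threads by default, so only one fits.
    print(max_concurrent(6))  # 1
    # A transfer tool limited to 2 threads could run three times in parallel.
    print(max_concurrent(2))  # 3
```

A result of 0 means that a single process of that width already exceeds the limit and should be run with fewer threads.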

HPCdrive

The hpcdrive service allows users on the NIH network to mount their home, data, and shared directories as mapped network drives on their local workstations.

Web tools and docs

https://hpc.nih.gov is the central hub for our web-based tools, dashboard monitor, and documentation for the applications and reference data available on Helix and the Biowulf cluster.

Globus

Globus is a file transfer service that makes it easy to move, sync and share large amounts of data within the NIH as well as with other sites.

Proxy

The http and ftp proxies allow users to fetch data from the internet on compute nodes with tools like wget, curl, and ftp.
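Scripts running on compute nodes can route their own requests through a proxy in the same way as wget and curl. The sketch below uses only the Python standard library. It assumes that the proxy address is exposed through the conventional http_proxy, https_proxy, and ftp_proxy environment variables; that convention, the helper names, and the example address are assumptions, not details confirmed by this page.

```python
import urllib.request
from typing import Dict, Mapping


def proxy_settings(env: Mapping[str, str]) -> Dict[str, str]:
    """Collect proxy URLs from conventional *_proxy variables (assumed names)."""
    proxies = {}
    for scheme in ("http", "https", "ftp"):
        value = env.get(f"{scheme}_proxy")
        if value:
            proxies[scheme] = value
    return proxies


def make_proxy_opener(env: Mapping[str, str]) -> urllib.request.OpenerDirector:
    """Build a urllib opener that sends requests through the configured proxies.

    An empty mapping yields an opener that uses no proxy at all.
    """
    handler = urllib.request.ProxyHandler(proxy_settings(env))
    return urllib.request.build_opener(handler)


if __name__ == "__main__":
    # Hypothetical proxy address, used only for illustration.
    example_env = {"http_proxy": "http://proxy.example:3128"}
    print(proxy_settings(example_env))
```

In a real job, passing os.environ to make_proxy_opener and calling the returned opener's open method on a URL would fetch the data through whatever proxy the environment defines.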

Differences between Helix, Biowulf Login and Cluster Compute Nodes

	Helix	Biowulf Login Node	Biowulf Cluster Compute Nodes
Purpose	Dedicated data transfer system. No scientific programs can be run.	Submission of jobs. No scientific programs.	Most computational processes, run via batch jobs or sinteractive sessions.
Network	Direct connection to the NIH network (and internet)		Connects to the NIH network (and internet) via proxy server
System	Single system shared by all users	Single system shared by all users	4000+ compute nodes with a total of 100,000+ compute cores. CPUs and memory for a job are dedicated to that job during its walltime and do not compete with other users.
/scratch vs /lscratch	/scratch is accessible	/scratch is accessible	/scratch is not accessible from the compute nodes. Each node has its own local disk (/lscratch), which can be allocated to a job.
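The comparison above can also be expressed as a small lookup, which may help when deciding where a given task belongs. This is only an illustrative sketch: the System class, the task-kind keys, and the where_to_run function are hypothetical names, and the values are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    name: str
    purpose: str
    scratch_accessible: bool  # whether the shared /scratch area is visible


# Hypothetical task kinds mapped to the systems described in the table.
SYSTEMS = {
    "transfer": System("Helix", "data transfer and file management", True),
    "submit": System("Biowulf login node", "submission of jobs", True),
    "compute": System("Biowulf compute nodes", "computational processes", False),
}


def where_to_run(task_kind: str) -> System:
    """Return the system on which a task of the given kind should run."""
    try:
        return SYSTEMS[task_kind]
    except KeyError:
        known = ", ".join(sorted(SYSTEMS))
        raise ValueError(f"unknown task kind {task_kind!r}; expected one of: {known}") from None


if __name__ == "__main__":
    for kind in ("transfer", "submit", "compute"):
        system = where_to_run(kind)
        print(f"{kind}: {system.name} (/scratch accessible: {system.scratch_accessible})")
```

Jobs on compute nodes that need temporary space would use the node-local /lscratch rather than /scratch, which is why the flag is False for that entry.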