The Faraldo-Gómez (NHLBI) and Forrest (NINDS) labs have funded 84 nodes in the Biowulf cluster (72 CPU nodes with 64 physical cores each, and 12 GPU nodes with 4 A100 GPUs each), and users from those labs have priority access to these nodes. This priority status will last until October 2028. The nodes are in the 'forgo' partition.
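Existing Biowulf users can check the current configuration of the partition (node list, limits, and priority settings) with a standard Slurm query; this is only an illustrative command, and the fields reported depend on the Slurm configuration:

biowulf% scontrol show partition forgo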
Faraldo-Gómez and Forrest lab members who do not already have an HPC account should fill out the account request form at
https://hpc.nih.gov/nih/accounts/account_request.php.
This form is only accessible from the NIH network or VPN, and requires logging in with your NIH username and password.
Once the PI has approved and the account is set up, all users whose PI is Dr Faraldo-Gómez or Dr Forrest will automatically get priority access to the 'forgo' buyin nodes.
Each of the 72 CPU nodes provides 64 physical AMD EPYC cores (which appear as 128 CPUs with symmetric multithreading; see below), and each of the 12 GPU nodes contains 4 NVIDIA A100 GPUs.
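The per-node hardware (CPU count, memory, GPUs, and feature tags) can also be queried from Slurm directly; the format string below is only a sketch, and the columns shown will depend on the Slurm version:

sinfo --partition=forgo --format="%N %c %m %G %f"
# nodelist, CPUs per node, memory, generic resources (GPUs), feature tags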
Jobs are submitted to the forgo nodes by specifying the "forgo" partition. In the simplest case,
sbatch --partition=forgo [other Slurm options] your_batch_script
Since the forgo partition is composed of both CPU and GPU nodes, please use --constraint=e7543 when requesting CPU-only nodes, to ensure that CPU-only jobs do not run on GPU nodes. GPU jobs should use the --gres flag to request GPUs, similar to Biowulf's GPU partition. For example, --gres=gpu:a100:2 will request 2 A100 GPUs.
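Putting the two cases together, submissions might look like the sketch below; the script names, task counts, and CPU values are placeholders rather than recommendations:

# CPU-only job: constrain the job to the CPU-only nodes
sbatch --partition=forgo --constraint=e7543 --ntasks=64 --ntasks-per-core=1 cpu_job.sh

# GPU job: request 2 A100 GPUs plus a set of CPUs for the host-side threads
sbatch --partition=forgo --gres=gpu:a100:2 --cpus-per-task=8 gpu_job.sh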
Most of the Slurm commands and customizations used in other Biowulf partitions (e.g. swarm or sinteractive) are supported on the forgo partition as well.
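For instance, an interactive session or a swarm of independent commands can be directed to the buy-in nodes simply by naming the partition, assuming sinteractive and swarm pass the partition flag through to Slurm as sbatch does; the resource values and the swarm file name below are placeholders:

# interactive session on a forgo GPU node
sinteractive --partition=forgo --gres=gpu:a100:1 --mem=32g

# swarm of independent commands on the forgo CPU nodes
swarm -f commands.swarm --partition=forgo -t 4 -g 8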
Symmetric multithreading (SMT) is a hardware feature of the AMD EPYC processor that allows each physical core to run two simultaneous threads of execution, thereby appearing to double the number of real cores. It is analogous to "hyperthreading" on Xeon processors. Thus, the 64-core forgo nodes will appear to have 128 CPUs. In many cases this arrangement will increase performance, due to better scheduling of CPU instructions (e.g. compute vs. memory access).
Unfortunately, the frequent workload-switching in each core also introduces significant communication delays between cores. Only applications that need little communication (e.g. processing of big data sets in well-separated chunks), or algorithms that have not been heavily optimized yet will benefit from hyperthreading.
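To see the effect of SMT for yourself, a quick check can be run inside an interactive job on a forgo CPU node; the commands below are illustrative, and the exact lscpu field names may vary between versions:

sinteractive --partition=forgo --constraint=e7543
lscpu | grep -E 'Thread|Core|Socket'
# expect 2 threads per core on a 64-core node, i.e. 128 CPUs in total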
Modern molecular dynamics (MD) applications are significantly slowed down by hyperthreading; it is therefore recommended that MD jobs be submitted with --ntasks-per-core=1, so as to run only 1 process per physical core.
The forgo partition can be used by multi-node, single-node and single-core jobs concurrently. To allocate resources efficiently for jobs that need less than one node, it is possible to request a number of tasks that is not a multiple of 64 cores (or 128 threads).
However, many types of multi-node jobs, and MD simulations in particular, will benefit from a homogeneous allocation of whole nodes:
sbatch --partition=forgo \
       --ntasks=<multiple of 64> \
       --ntasks-per-core=1 \
       --exclusive \
       --time=DD-HH:MM:SS \
       jobscript

where:
--partition=forgo     run this job on the forgo partition only
--ntasks=             number of MPI tasks
--ntasks-per-core=1   run 1 MPI task per physical core (ignore hyperthreading)
--time=DD-HH:MM:SS    set the walltime for this job to DD days, HH hours, MM minutes, SS seconds
--exclusive           do not run any other jobs on the allocated nodes
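As a concrete sketch, a whole-node MD batch script might look like the following; the module name, binary, and input prefix are placeholders for whatever MD engine you actually use, and the task count assumes two whole 64-core nodes:

#!/bin/bash
#SBATCH --partition=forgo
#SBATCH --ntasks=128            # 2 nodes x 64 physical cores
#SBATCH --ntasks-per-core=1     # one MPI task per physical core; ignore SMT
#SBATCH --exclusive
#SBATCH --time=1-00:00:00       # 1 day of walltime

module load gromacs             # placeholder: load your MD engine of choice
srun gmx_mpi mdrun -deffnm md   # placeholder command and input prefix

Submit it with 'sbatch jobscript'; Slurm reads the partition and task layout from the #SBATCH directives.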
Note: It is possible to submit to two partitions with --partition=forgo,multinode and the batch system will run the job on whichever partition has free nodes. However, for a parallel multinode job, this can cause complications.
Firstly, the multinode partition has nodes of 16 or 28 cores, while the forgo nodes have 64 cores, so it is not possible to specify an --ntasks value that fully utilizes any allocated node. Secondly, if the job runs on the multinode partition, it could end up on a heterogeneous set of nodes (x2680, x2650, x2695), and the job would run at the speed of the slowest processor. Normally, when submitting to the multinode partition, it is best to specify a node type with --constraint=x#### to avoid this problem; but if submitting to both forgo and multinode, the job would then be able to run only on the multinode partition, since there is no overlap of node types between forgo and multinode.
Thus, it is best to submit parallel multinode jobs to only one partition, either forgo or multinode.
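In practice this means picking one partition per job and sizing the task count to its nodes; both commands below are illustrative sketches, and the task counts, node type, and walltime should be adjusted to your own job:

# whole nodes on forgo (64 cores each)
sbatch --partition=forgo --ntasks=128 --ntasks-per-core=1 --exclusive --time=1-00:00:00 jobscript

# whole nodes on multinode, pinned to a single node type (e.g. 28-core x2680 nodes)
sbatch --partition=multinode --constraint=x2680 --ntasks=112 --ntasks-per-core=1 --exclusive --time=1-00:00:00 jobscript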
The current per-user core limit on the forgo queue can be seen via the 'batchlim' command.
biowulf% batchlim

Partition          MaxCPUsPerUser     DefWalltime     MaxWalltime
---------------------------------------------------------------
forgo                       11520      1-00:00:00      3-00:00:00
While approved users have priority access to the forgo nodes, the nodes remain accessible to other Biowulf users through the "quick" queue. Nodes not in use by forgo users may be allocated to quick-queue jobs for up to 4 hours; that is, no forgo job will be queued for more than 4 hours waiting for nodes allocated to quick-queue jobs.
To see how many nodes of each type are available, use the freen command; a separate section now reports the number of available forgo nodes:
$ freen
                                          ........Per-Node Resources........
Partition   FreeNds     FreeCPUs    Cores  CPUs   Mem    Disk   Features
--------------------------------------------------------------------------------
forgo       56/288    2240/11520      20    40    125g   800g   cpu40,core20,g125,ssd800,x2630,ibfdr,forgo
Please send questions and comments to staff@hpc.nih.gov