The Faraldo-Gómez (NHLBI) and Forrest (NINDS) labs have funded 288 nodes (each with 20 physical cores, or 40 CPUs with hyperthreading) in the Biowulf cluster, and users from those labs have priority access to these nodes. This priority status will last until February 2023. The nodes are in the 'forgo' partition.
Faraldo-Gómez and Forrest lab members who do not already have an HPC account should fill out the account request form at
This form is only accessible from the NIH network or VPN, and requires logging in with your NIH username and password.
Once the PI has approved the request and the account is set up, all users whose PI is Dr. Faraldo-Gómez or Dr. Forrest will automatically receive priority access to the 'forgo' buy-in nodes.
Jobs are submitted to the forgo nodes by specifying the "forgo" partition. In the simplest case,
sbatch --partition=forgo [other Slurm options] your_batch_script
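For example, a minimal batch script (the program and file names here are hypothetical) might look like:

```shell
#!/bin/bash
# your_batch_script -- hypothetical minimal example
# Submit with: sbatch --partition=forgo your_batch_script

echo "Job $SLURM_JOB_ID running on $SLURM_JOB_NODELIST"
./my_program input.dat > output.log
```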
Unlike the multinode partition, the forgo partition is composed of identical nodes, so there is no need to provide a --constraint option.
Most of the Slurm commands and customizations used in other Biowulf partitions (e.g. swarm or sinteractive) are supported on the forgo partition as well.
Note: the forgo nodes run at a lower clock rate and have fewer cores than most of the other Biowulf nodes. Single-core and single-node jobs will not achieve their highest speed on the forgo nodes.
Hyperthreading is a hardware feature of the Xeon processor that allows each physical core to run two simultaneous threads of execution thereby appearing to double the number of real cores. Thus the 20-core forgo nodes will appear to have 40 cores. In many cases this arrangement will increase performance, due to a better scheduling of CPU instructions (e.g. compute vs. memory access).
Unfortunately, the frequent workload-switching in each core also introduces significant communication delays between cores. Only applications that need little communication (e.g. processing of big data sets in well-separated chunks), or algorithms that have not been heavily optimized yet will benefit from hyperthreading.
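Hyperthreading is visible in the hardware topology reported by the standard lscpu utility; on a forgo node (the figures below are illustrative, not a guaranteed result), it reports 2 threads per physical core:

```shell
# Inspect the CPU topology of the current node. On a 20-core, hyperthreaded
# forgo node this would show (illustrative):
#   CPU(s):               40
#   Thread(s) per core:   2
#   Core(s) per socket:   10
#   Socket(s):            2
lscpu | grep -E '^(CPU\(s\)|Thread\(s\) per core|Core\(s\) per socket|Socket\(s\))'
```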
Modern molecular dynamics (MD) applications are significantly slowed down by hyperthreading; it is therefore recommended that MD jobs be submitted with:
--ntasks-per-core=1
so as to run only one process per physical core.
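For example (the script name md_job.sh is hypothetical):

```shell
# Run one MPI task per physical core, leaving the hyperthreads idle
sbatch --partition=forgo --ntasks-per-core=1 [other Slurm options] md_job.sh
```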
The forgo partition can be used by multi-node, single-node and single-core jobs concurrently. To allocate resources efficiently for jobs that need less than one node, it is possible to request a number of tasks that is not a multiple of 20 cores (or 40 threads).
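For instance, a job that needs only 10 cores (the script name small_job.sh is hypothetical) can request them directly, leaving the rest of the node available to other jobs:

```shell
# Hypothetical sub-node job: requests 10 of a node's 20 physical cores;
# without --exclusive, the remaining cores stay available to other jobs.
sbatch --partition=forgo --ntasks=10 --ntasks-per-core=1 small_job.sh
```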
However, many types of multi-node jobs, and MD simulations in particular, will benefit from a homogeneous allocation of whole nodes:
sbatch --partition=forgo \
       --ntasks=<multiple of 20> \
       --ntasks-per-core=1 \
       --exclusive \
       --time=DD-HH:MM:SS \
       jobscript

where:
--partition=forgo       Run this job on the forgo partition only.
--ntasks=<n>            Number of MPI tasks.
--ntasks-per-core=1     Run 1 MPI task per physical core (ignore hyperthreading).
--time=DD-HH:MM:SS      Set the walltime for this job to DD days, HH hours, MM minutes, SS seconds.
--exclusive             Do not run any other jobs on the allocated nodes.
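Since --ntasks must be a multiple of 20 for whole-node allocations on forgo, a quick shell check (the NTASKS value here is illustrative) can catch sizing mistakes before submitting:

```shell
# Check that a requested task count fills whole forgo nodes
# (20 physical cores per node; NTASKS here is illustrative).
NTASKS=120
if (( NTASKS % 20 == 0 )); then
    echo "OK: $NTASKS tasks = $((NTASKS / 20)) whole forgo nodes"
else
    echo "WARNING: $NTASKS tasks is not a multiple of 20"
fi
```

With NTASKS=120 this prints "OK: 120 tasks = 6 whole forgo nodes".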
Note: It is possible to submit to two partitions with --partition=forgo,multinode and the batch system will run the job on whichever partition has free nodes. However, for a parallel multinode job, this can cause complications.
Firstly, the multinode partition has nodes with 16 or 28 cores, while the forgo nodes have 20 cores, so it is not possible to specify an --ntasks value that fully utilizes any allocated node. Secondly, if the job runs on the multinode partition, it could end up on a heterogeneous set of nodes (x2680, x2650, x2695), and the job would run at the speed of the slowest processor. Normally, when submitting to the multinode partition, it is best to specify a node type with --constraint=x#### to avoid this problem; but if submitting to both forgo and multinode, the job could then run only on the multinode partition, since there is no overlap of node types between forgo and multinode.
Thus, it is best to submit parallel multinode jobs to only one partition, either forgo or multinode.
The current per-user core limit on the forgo queue can be seen via the 'batchlim' command.
biowulf% batchlim
Partition        MaxCPUsPerUser    DefWalltime    MaxWalltime
---------------------------------------------------------------
forgo                 11520         1-00:00:00     3-00:00:00
While approved users have priority access to the forgo nodes, other Biowulf users can also reach them through the "quick" queue. Nodes not in use by forgo users may be allocated to quick-queue jobs for up to 4 hours; consequently, no forgo job will wait more than 4 hours for nodes held by quick-queue jobs.
To see how many nodes of each type are available use the freen command; there is now a separate section which reports the number of available forgo nodes:
$ freen
                                      ........Per-Node Resources........
Partition  FreeNds   FreeCPUs     Cores  CPUs  Mem   Disk  Features
--------------------------------------------------------------------------------
forgo      56/288    2240/11520   20     40    125g  800g  cpu40,core20,g125,ssd800,x2630,ibfdr,forgo
Please send questions and comments to firstname.lastname@example.org