Additional information for Faraldo-Gómez and Forrest lab users

The Faraldo-Gómez (NHLBI) and Forrest (NINDS) labs have funded 84 nodes (72 CPU nodes with 64 physical cores each, and 12 GPU nodes with 4 A100 GPUs each) in the Biowulf cluster, and users from those labs have priority access to these nodes. This priority status will last until October 2028. The nodes are in the 'forgo' partition.

How to get access to the 'forgo' nodes

Faraldo-Gómez and Forrest lab members who do not already have an HPC account should fill out the account request form at
https://hpc.nih.gov/nih/accounts/account_request.php.
This form is only accessible from the NIH network or VPN, and requires logging in with your NIH username and password.

Once the PI has approved the request and the account has been set up, all users whose PI is Dr. Faraldo-Gómez or Dr. Forrest will automatically get priority access to the 'forgo' buy-in nodes.

Hardware
The hardware characteristics of each CPU compute node are as follows:
  • 2 x AMD(R) Epyc(R) CPU 7543, 32 cores @ 2.80GHz, SMT enabled (total 64 cores, 128 CPUs)
  • 256 GB memory
  • 100 Gb/s Infiniband (HDR100)

The hardware characteristics of each GPU compute node are as follows:
  • 1 x AMD(R) Epyc(R) CPU 7543, 32 cores @ 2.80GHz, SMT enabled (total 64 CPUs)
  • 256 GB memory
  • 200 Gb/s Infiniband (HDR200)
  • 4 x Nvidia(R) A100 GPU

Submitting jobs to the batch system

    Jobs are submitted to the forgo nodes by specifying the "forgo" partition. In the simplest case,

    sbatch    --partition=forgo    [other Slurm options]   your_batch_script
    

    Since the forgo partition is composed of both CPU and GPU nodes, please use

    --constraint=e7543
    when requesting CPU-only nodes to ensure that CPU-only jobs do not run on GPU nodes. GPU jobs should use the
    --gres
    flag to request GPUs, similar to Biowulf's GPU partition. For example,
    --gres=gpu:a100:2
    will request 2 A100 GPUs.
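
    For example, the two submission lines below simply combine the flags above with the generic command shown earlier:

    # CPU-only job: run only on the 64-core CPU nodes
    sbatch --partition=forgo --constraint=e7543 [other Slurm options] your_batch_script

    # GPU job: request 2 A100 GPUs on one of the GPU nodes
    sbatch --partition=forgo --gres=gpu:a100:2 [other Slurm options] your_batch_script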

    Most of the Slurm commands and customizations used in other Biowulf partitions (e.g. swarm or sinteractive) are supported on the forgo partition as well.
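
    For instance, an interactive session or a swarm can be directed to these nodes in the usual way. The resource values below are only placeholders, and the exact options accepted by these wrappers should be confirmed with 'sinteractive --help' and 'swarm --help':

    # interactive session on a forgo CPU node (placeholder resource values)
    sinteractive --partition=forgo --constraint=e7543 --cpus-per-task=8 --mem=16g

    # swarm of single-threaded commands, 4 GB of memory each, on the forgo nodes
    swarm -f commands.swarm --partition=forgo -t 1 -g 4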

    Hyperthreading

    Simultaneous multithreading (SMT) is a hardware feature of the AMD Epyc processor that allows each physical core to run two simultaneous threads of execution, thereby appearing to double the number of real cores. It is analogous to "hyperthreading" on Xeon processors. Thus, the 64-core forgo nodes will appear to have 128 CPUs. In many cases this arrangement will increase performance, due to better scheduling of CPU instructions (e.g. compute vs. memory access).

    Unfortunately, the frequent workload-switching in each core also introduces significant communication delays between cores. Only applications that need little communication (e.g. processing of large data sets in well-separated chunks), or algorithms that have not yet been heavily optimized, will benefit from hyperthreading.

    Modern molecular dynamics (MD) applications are significantly slowed down by hyperthreading, and therefore it is recommended that MD jobs be submitted with:

    --ntasks-per-core=1
    
    so as to run only 1 process per physical core.
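
    For example, a single-node MD run that uses all 64 physical cores of one CPU node might be submitted as follows (md_job.sh is a placeholder for your own batch script):

    # one MPI task per physical core; the SMT sibling threads are left idle
    sbatch --partition=forgo --constraint=e7543 \
           --ntasks=64 --ntasks-per-core=1 \
           md_job.sh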

    Allocating cores for multi-node jobs

    The forgo partition can be used by multi-node, single-node, and single-core jobs concurrently. To allocate resources efficiently for jobs that need less than one node, it is possible to request a number of tasks that is not a multiple of 64 (the physical cores per node) or 128 (the hyperthreaded CPUs per node).

    However, many types of multi-node jobs, and MD simulations in particular, will benefit from a homogeneous allocation of whole nodes:

    sbatch --partition=forgo \
          --ntasks=<multiple of 64> \
          --ntasks-per-core=1 \
          --exclusive  \
          --time=DD-HH:MM:SS \
          jobscript
    
    where:
    --partition=forgo    Run this job on the forgo partition only.
    --ntasks=            Number of MPI tasks.
    --ntasks-per-core=1  Run 1 MPI task per physical core, ignoring hyperthreading.
    --time=DD-HH:MM:SS   Set the walltime for this job to DD days, HH hours, MM minutes, SS seconds.
    --exclusive          Do not run any other jobs on the allocated nodes.
    Without the --exclusive flag, some cores may be allocated on nodes where the network interface and the file system are being heavily used by other jobs, affecting the performance of the entire multi-node job. A filled-in example is shown below.
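
    As a concrete illustration, a 2-node run using all 128 physical cores, with a 1-day walltime (md_job.sh again stands in for your own batch script):

    sbatch --partition=forgo --constraint=e7543 \
           --ntasks=128 --ntasks-per-core=1 \
           --exclusive --time=1-00:00:00 \
           md_job.sh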

    Note: It is possible to submit to two partitions with --partition=forgo,multinode and the batch system will run the job on whichever partition has free nodes. However, for a parallel multinode job, this can cause complications.
    Firstly, the multinode partition has nodes of 16 or 28 cores, while the forgo CPU nodes have 64 cores, so it is not possible to specify an --ntasks value that fully utilizes any allocated node on both partitions. Secondly, if the job runs on the multinode partition, it could end up on a heterogeneous set of nodes (x2680, x2650, x2695), and the job would run at the speed of the slowest processor. Normally, when submitting to the multinode partition, it is best to specify a node type with --constraint=x#### to avoid this problem, but if submitting to both forgo and multinode, the job would then be able to run only on the multinode partition, since there is no overlap of node types between forgo and multinode.
    Thus, it is best to submit parallel multinode jobs to only one partition, either forgo or multinode.

    Core Limits

    The current per-user core limit on the forgo queue can be seen via the 'batchlim' command.

    biowulf% batchlim
    
    Partition        MaxCPUsPerUser     DefWalltime     MaxWalltime
    ---------------------------------------------------------------
    forgo                   11520       1-00:00:00      3-00:00:00
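
    To see how much of this limit your own running jobs are currently using, one simple approach (a sketch built from standard Slurm options) is to sum the CPU counts that squeue reports for your jobs in the partition:

    # total CPUs allocated to your running jobs in the forgo partition
    squeue -u $USER -p forgo -t RUNNING -h -o "%C" | awk '{sum+=$1} END {print sum+0}'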
    
    

    Node Availability

    While approved users have priority access to the forgo nodes, the nodes are also accessible to other Biowulf users via the "quick" queue. Nodes not in use by forgo users may be allocated to quick-queue jobs for up to 4 hours. That is, no forgo job will wait more than 4 hours for nodes allocated to quick-queue jobs.

    To see how many nodes of each type are available, use the freen command; a separate section reports the number of available forgo nodes:

    $ freen
                                               ........Per-Node Resources........  
    Partition    FreeNds       FreeCPUs        Cores CPUs   Mem   Disk    Features
    --------------------------------------------------------------------------------
    forgo         56/288         2240/11520        20    40    125g   800g   cpu40,core20,g125,ssd800,x2630,ibfdr,forgo
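
    Standard Slurm commands can also be used to check the state of the forgo nodes; for example (a simple sketch, with output columns chosen for brevity):

    # partition, availability, node count, and node state
    sinfo -p forgo -o "%P %a %D %t"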
    
    

    [Usage plots: FORGO partition usage over the last 24 hours, the last month, and the last year]

    Please send questions and comments to staff@hpc.nih.gov