Additional information for NCI-CCR users of the NIH Biowulf Cluster

NCI-CCR has funded 48 nodes (1,728 physical cores; 3,456 CPUs with hyperthreading) in the Biowulf cluster, and CCR users have priority access to these nodes. This priority status will last until March 31, 2025 (FY2021-funded nodes).

Note: the MOU for the 2019-funded CCR nodes expired, and those nodes were returned to the general pool in July 2023.


How to get Access to the CCR nodes

If you do not already have one, get an HPC account. Fill out the account request form at
https://hpc.nih.gov/nih/accounts/account_request.php.
This form is only accessible from the NIH network or VPN, and requires logging in with your NIH username and password.

Once the PI has approved the request and the account is set up, all users under an NCI-CCR PI automatically get priority access to the CCR buy-in nodes.

Hardware
The hardware characteristics of the compute nodes are as follows:
  • 2 x Intel Xeon Gold 6240, 18 cores each @ 2.60 GHz, hyperthreading enabled
  • 384 GB memory
  • 3.2 TB SSD (solid-state) disk
  • 100 Gb/s Infiniband
  • Hyperthreading

Hyperthreading is a hardware feature of the Xeon processors that allows each physical core to run two simultaneous threads of execution, thereby appearing to double the number of real cores. Thus the 36-core CCR nodes appear to have 72 CPUs. In many cases this will increase the performance of applications that can multi-thread or otherwise take advantage of multiple cores. However, before running 72 threads of execution on a single node, the Biowulf staff recommends that you benchmark your application to determine whether it can take advantage of hyperthreading, or even whether it scales well to 36 cores.
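
As a rough sketch of such a benchmark, the same work can be submitted at several CPU counts and the runtimes compared afterwards (for example with jobhist or sacct). The application name, input file, and thread counts below are hypothetical placeholders:

    # Hypothetical scaling test: submit identical work at increasing CPU counts,
    # then compare runtimes once the jobs have finished.
    for n in 9 18 36 72; do
        sbatch --partition=ccr --cpus-per-task=$n \
               --wrap="my_threaded_app --threads=$n input.dat"
    done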

Submitting jobs to the batch system

Jobs are submitted to the CCR nodes by specifying the "ccr" partition. In the simplest case,

    sbatch --partition=ccr your_batch_script
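
Here your_batch_script is an ordinary shell script containing the commands to run. A minimal sketch, in which the module, directory, and file names are placeholders:

    #!/bin/bash
    # Minimal example batch script -- module, directory, and file names are placeholders.
    set -e
    module load samtools                         # hypothetical software module
    cd /data/$USER/myproject                     # hypothetical working directory
    samtools sort -@ ${SLURM_CPUS_PER_TASK:-2} -o sorted.bam input.bam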
    

To submit a job requiring 128 GB of local scratch space,

    sbatch --partition=ccr --gres=lscratch:128 your_batch_script
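
Within the job, the allocated scratch space is available under /lscratch/$SLURM_JOB_ID. A minimal sketch of a batch script that stages data there (the application and /data paths are placeholders):

    #!/bin/bash
    # Sketch: stage input to node-local scratch, compute there, copy results back.
    # "my_app" and the /data paths are placeholders.
    set -e
    cp /data/$USER/input.dat /lscratch/$SLURM_JOB_ID/
    cd /lscratch/$SLURM_JOB_ID
    my_app input.dat > results.out
    cp results.out /data/$USER/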
    

To submit a swarm of jobs,

    swarm -f command_file --partition ccr
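
Here command_file is a plain-text file with one independent command per line, each of which is run as a separate subjob of the swarm. A hypothetical example (the tool and file names are placeholders):

    # command_file -- one independent command per line
    my_tool --in sample01.fq --out sample01.out
    my_tool --in sample02.fq --out sample02.out
    my_tool --in sample03.fq --out sample03.out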
    

Note that jobs submitted to the ccr partition will not run on non-CCR nodes. If no CCR nodes are available, the job will remain queued until CCR nodes become free. (Note: you may also specify "--partition=ccr,norm" to allow the job to run on whichever partition has free resources first.)
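
For example, to allow the same batch job to run on either CCR or general-purpose norm nodes:

    sbatch --partition=ccr,norm your_batch_script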

Core Limits

The current per-user CPU limit on the ccr partition can be seen with the 'batchlim' command.

    biowulf% batchlim
    Partition        MaxCPUsPerUser     DefWalltime     MaxWalltime
    ---------------------------------------------------------------
    ccr                      3072         04:00:00     10-00:00:00 
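
To see how many CPUs your own running jobs currently occupy relative to this limit, standard Slurm tools can be used; one possible sketch (the exact output formatting may differ):

    # Sum the CPUs allocated to your currently running jobs
    squeue -u $USER -t RUNNING -h -O NumCPUs | awk '{s+=$1} END {print s, "CPUs allocated"}'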
    
    

Node Availability

While approved CCR users have priority access to the CCR nodes, other Biowulf users can also run on them via the "short" queue. Nodes not in use by CCR users may be allocated to short-queue jobs for up to 4 hours; that is, no CCR job will wait more than 4 hours for nodes allocated to short-queue jobs.

To see how many nodes of each type are available, use the freen command; a separate section of its output reports the number of available CCR nodes:

    $ freen
                                             ........Per-Node Resources........
    Partition  FreeNds    FreeCPUs       Cores  CPUs  Mem   Disk   Features
    ------------------------------------------------------------------------------------------
    ccr        0 / 48     1086 / 3456    36     72    373g  3200g  cpu72,core36,g384,ssd3200,x6240,ibhdr100,ccr


[CCR partition usage plots: Last 24 hrs, Last month, Last year]

Please send questions and comments to staff@hpc.nih.gov