NIH HPC News & Announcements
NIH Biowulf Phase 3 nodes going offline for emergency remediation
Date: 22 September 2017 12:09:41
From: Steven Fellini
Biowulf/HPC Users,
A potential hardware failure has been identified in the Phase 3 Biowulf nodes. To prevent sudden node failures in the future, a hardware component on the Phase 3 nodes will be replaced over the next few days. During this process, racks of nodes will be taken offline, repaired, and returned to the cluster. Fewer resources will be available during this period, so jobs may remain in the Pending state longer than usual. Additionally, some running jobs may need to be requeued so that their nodes can be remediated; in those cases we will notify the job owners.
Thanks for your patience during this remediation process.
########################################################################
Please contact staff@hpc.nih.gov with any questions about the NIH HPC Systems.