High-Performance Computing at the NIH
Upcoming Classes

Classes taught by the HPC staff are free, but registration is required for some classes. Priority is given to Helix/Biowulf users.
Even if there are no seats available in a class, you can still register and will be put on the waitlist. If someone else drops out, you will be automatically registered for the class and will receive an email.

The NIH HPC object store
Instructor(s): Tim Miller (NIH HPC Staff)
Location: Bldg 12, Rm B51, Date/Time: Tue Jan 09, 2018, 9 am - 1 pm

Object storage is a relatively new technology used to provide large and reliable storage systems. Most prominently used by companies such as Microsoft, Amazon, and Google as the back-end to their Web applications, object storage is also becoming more popular for scientific computing - especially in the life sciences. Researchers in genomics, electron microscopy, and other biological disciplines can make effective use of this type of storage. This course will introduce users to the concepts behind object storage and will provide practical examples of how to use the NIH HPC object storage system from the Biowulf cluster. We will focus on specific examples and hands-on work with the object store.
Register
(NIH login required)
Relion tips and tricks, and Parallel jobs and benchmarking
Instructor(s): David Hoover/Jerez Te (NIH HPC Staff)
Location: Bldg 50, Rm 1227, Date/Time: Tue Jan 16, 2018, 1 pm - 3 pm

Mechanics and best practices for submitting RELION jobs to the batch system from both the command line and the RELION GUI, as well as methods for monitoring and evaluating the results. Also covers scaling of parallel jobs and how to benchmark them to make effective use of your allocated resources.
Seminar
No registration required
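
As an illustration of the benchmarking idea in the description above, here is a minimal sketch (not taken from the class materials) that turns wall-clock times measured at several CPU counts into speedup and parallel efficiency. The timings, the function names, and the 0.7 efficiency cutoff are all hypothetical choices made for this example.

```python
def parallel_efficiency(timings):
    """Compute speedup and efficiency from {cpus: wall_seconds}.

    Values are relative to the run with the fewest CPUs.
    Returns a list of (cpus, speedup, efficiency) sorted by cpus.
    """
    base_cpus = min(timings)
    base_time = timings[base_cpus]
    rows = []
    for cpus in sorted(timings):
        speedup = base_time / timings[cpus]
        # Ideal speedup equals the factor by which the CPU count grew.
        efficiency = speedup / (cpus / base_cpus)
        rows.append((cpus, speedup, efficiency))
    return rows


def largest_efficient(rows, threshold=0.7):
    """Return the largest CPU count whose efficiency meets the threshold.

    The default threshold is an assumption for this sketch, not a site policy.
    """
    ok = [cpus for cpus, _, eff in rows if eff >= threshold]
    return max(ok) if ok else None


if __name__ == "__main__":
    # Made-up wall times (seconds) for the same job at different CPU counts.
    timings = {1: 1000.0, 2: 520.0, 4: 290.0, 8: 200.0}
    rows = parallel_efficiency(timings)
    for cpus, speedup, eff in rows:
        print(f"{cpus:>3} CPUs  speedup {speedup:5.2f}  efficiency {eff:4.2f}")
    print("Largest CPU count above threshold:", largest_efficient(rows))
```

Running the same job at a few CPU counts and feeding the measured times into a table like this makes it easier to see where adding CPUs stops paying off.
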
Introduction to Linux
Instructor(s): Ainsley Gibson (NIH HPC Staff)
Location: Bldg 12, Rm B51, Date/Time: Tue Jan 16, 2018 - Wed Jan 17, 2018, 9:00 AM - 1:00 PM

This course is intended for researchers who are new to Linux/UNIX. The course will cover Linux operating system concepts, a little history, basic file system navigation, text editing, bash (shell) syntax, file transfer, and a number of useful commands and utilities.
Register
(NIH login required)

Other Training at NIH

The Technology Training Program at CIT offers courses relating to computing, networks, and information systems.

The NIH training program offers classes in writing, speaking, grant writing and more.

The FAES Graduate School runs short and long courses.

Slides and Handouts from Previous HPC Classes

Apart from the handouts listed below, the NIH HPC staff creates and maintains Training Videos to help users get the most out of our resources.

The NIH Biowulf Cluster: Scientific Supercomputing (PDF)
This two-part class is an introduction to the Biowulf Linux cluster for users who have NIH Biowulf accounts or Helix users planning to get one. Topics covered: cluster concepts, accounts, connection, storage, batch system, how to set up and submit a simple batch job, partitions, interactive jobs, swarm jobs, available scientific applications, job monitoring, resource allocation, licensed software.
Steven Fellini and Susan Chacko, 11 Dec 2017

Python in HPC (Slides and GitHub repo)
Overview of Python tools used in high performance computing, and how to improve the performance of your Python code.
Wolfgang Resch, 30 November 2017
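
A small sketch of the kind of measurement that usually precedes any performance work in Python. It is not from the class slides or repository. It uses the standard-library timeit module to compare two ways of counting matches, and the function names and data sizes are hypothetical.

```python
import timeit


def count_hits_list(queries, known_ids):
    """Count queries present in known_ids, scanning the list for each lookup."""
    return sum(1 for q in queries if q in known_ids)


def count_hits_set(queries, known_ids):
    """Same count, but lookups go through a set built once up front."""
    known = set(known_ids)
    return sum(1 for q in queries if q in known)


if __name__ == "__main__":
    known_ids = list(range(2000))
    queries = list(range(0, 4000, 2))
    # Check that both versions agree before comparing their speed.
    assert count_hits_list(queries, known_ids) == count_hits_set(queries, known_ids)
    for fn in (count_hits_list, count_hits_set):
        seconds = timeit.timeit(lambda: fn(queries, known_ids), number=5)
        print(f"{fn.__name__}: {seconds:.4f} s for 5 runs")
```

Timings vary by machine, so no expected numbers are given here. The point is to confirm that two versions agree, then measure them rather than guess.
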

Creating and running software containers with Singularity (tutorial)
This was a three-hour hands-on workshop on how to use Singularity to create and run containers. Students learned how to install Singularity on a Linux system, create containers with their choice of Linux distribution and software, and use Singularity to run containerized apps.
Afif Elghraoui, 06 November 2017

Effective Use of the Biowulf Batch System and Storage Systems
(Batch System (PDF) and Storage System (PDF))
Find out how to make the best use of the cluster, get your jobs to start sooner, use available resources to speed up the I/O of your jobs, and more!
Steve Fellini, Mark Patkus, Tim Miller, 30 Oct 2017

Bash Shell Scripting (PDF) (PPT)
The default shell on many Linux systems is bash. Bash shell scripting provides a method for automating common tasks on Linux systems (such as Helix and Biowulf), including transferring and parsing files, creating sbatch and swarm scripts, pipelining tasks, and monitoring jobs. A step-by-step data-driven lesson is available. There is a summary of Linux commands (PDF) available as well.
David Hoover, 19-20 October 2017

Introduction to Linux (PDF)
Course intended for researchers who are new to Linux/Unix. Covers Unix/Linux operating system concepts, a little history, basic file system navigation, text editing, bash (shell) syntax, file transfer, and a number of useful commands and utilities.
Ainsley Gibson, CSRA/HPC Staff, 19-20 Sep 2017

Singularity Demo (Slides | Demo Readme)
One-hour introduction and demo of Singularity containers for the NIMH Reproducible Neuroscience Workshop.

Building a reproducible workflow with Snakemake and Singularity (course materials)
Hands-on introduction to Snakemake. Also included a demonstration of how to integrate Singularity containers with Snakemake to create a reproducible workflow.
Wolfgang Resch, May 1, 2017

NIH HPC Object Storage System Overview (PDF) (PPT)
This course introduces users to the concept of object storage - a new technology being used by many large Internet companies that is becoming increasingly popular for scientific use because of its capability to store large-scale, unstructured data. The class describes the NIH HPC object storage system in detail and includes a practical example of its use in a real scientific workflow.
Tim Miller, 28 Feb 2017

Using the HPC Systems Storage Effectively (PDF)
A course that describes the different storage systems available to NIH HPC users along with policies and best practices. Also explains how to avoid storage bottlenecks when running jobs.
Tim Miller, 16 Feb 2017

Parallel MATLAB jobs on Biowulf (PDF) (PPT) (Videos)
Developing MATLAB code for parallel computing, using the MATLAB compiler to deploy license-free code, automating swarm file generation, spawning and monitoring swarms interactively from within the MATLAB environment, and ordering jobs with dependencies to develop an analysis pipeline.
Dave Godlove, 17 Feb 2016

Swarm on the Biowulf Cluster (PDF) (PPT)
Swarm is a script designed to simplify submitting a group of commands to the Biowulf cluster. With the shift from PBS to Slurm, the functionalities and idiosyncrasies of swarm have changed.
David Hoover, 22 Sep 2015
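
To make the "group of commands" idea concrete, here is a minimal sketch that generates a command file with one independent shell command per line, which is the layout this example assumes for a swarm file. It is not from the class handout. The helper names, the input file names, and the gzip command are hypothetical.

```python
import shlex


def build_swarm_lines(input_files, command_template):
    """Return one shell command per input file.

    command_template must contain a {path} placeholder; each path is
    shell-quoted so names containing spaces stay a single argument.
    """
    return [command_template.format(path=shlex.quote(p)) for p in input_files]


def write_swarm_file(lines, swarm_path):
    """Write the commands to swarm_path, one per line."""
    with open(swarm_path, "w") as fh:
        fh.write("\n".join(lines) + "\n")


if __name__ == "__main__":
    # Hypothetical input files for illustration only.
    files = ["sample1.fastq", "sample 2.fastq"]
    for line in build_swarm_lines(files, "gzip -9 {path}"):
        print(line)
```

Calling write_swarm_file(lines, "compress.swarm") would save the commands to disk. The resulting file could then be handed to swarm; consult the current swarm documentation for the exact submission options.
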

Imputing Big Data from GWAS (PDF) (PPT)
A discussion of imputation and large-scale meta-analyses of GWAS data on the Biowulf cluster, featuring hands-on examples and experienced instruction.
Michael Nalls (NIA), 10 Sep 2014, 1 Oct 2014

Rosetta Workshop
Tutorials and presentations from the Rosetta Design Group, hosted by Helix Systems. (NIH only)
Xavier Ambroggio and Monica Berrondo, 19-21 May 2009

Gene Synthesis using DNAWorks (PPT)
David Hoover, 15 Nov 2006

Linux Tutorials

Helix and Biowulf users will make most effective use of the systems if they are familiar with GNU/Linux.

Below are links to some tutorials which cover the basics of GNU/Linux commands.

Introduction to Linux at the TACC, Texas.
Introduction to Linux guide located on the Linux Documentation Project's website.
A Basic Unix/Linux Tutorial, at Oxford University, UK.
Unix Tutorial for Beginners. Eight simple tutorials which cover the basics of Unix, at U. Surrey, UK.
Command-line crash course
The Linux Command Line, by William Shotts
Learn the Command Line, a web-based GUI tutorial by Codecademy