Next Walk-In Consults
The Walk-In Consults are virtualized for the present. All problems and concerns are welcome, from scripting problems to node allocation, to strategies for a particular project, to anything that is affecting your use of the HPC systems. The Zoom details are emailed to all Biowulf users the week of the consult.
Wed Sep 25, 2024, 1 - 3 pm Zoom-In
Wed Oct 16, 2024, 1 - 3 pm Zoom-In
Wed Nov 13, 2024, 1 - 3 pm Zoom-In
If you can't make the Walk-In/Zoom-In, no worries! Send us an email (staff@hpc.nih.gov) requesting a consult. We will schedule a 1-on-1 phone call or webcast session with one of the staff.
No registration is required for these classes.
Bash Shell Scripting
A series of short videos (5-15 minutes) that explore various aspects of Bash from shell setup and command line usage to advanced topics in scripting. The videos can be watched in any order although we provide them in somewhat logical order. In development ... new videos coming each week! Click here to get to the class.
Introduction to Biowulf
Instructors: NIH HPC Staff
Video tutorials and hands-on exercises to learn about all aspects of the NIH Biowulf cluster including submitting and monitoring jobs.
New Biowulf users are encouraged to work through the entire class, and experienced Biowulf users can view specific videos to brush up on a particular section. Click here to get to the class.
A practical introduction to GATK 4 on Biowulf
A tutorial with a case study and benchmark testing on each step that will help Biowulf users run GATK 4 efficiently. The tutorial is based on germline variant discovery with WGS data from a trio. You will learn NGS data preprocessing and how to optimize your scripts.
Qi Yu and Wolfgang Resch, May 27 2021
Click here to get to the GATK 4 tutorial.
4 Feb 2020: [Videocast]
The Kinetics of Gene Transcription
Carson Chow (NIDDK)
Apart from the handouts listed below, the NIH HPC staff creates and maintains Training Videos to help users get the most out of our resources.
Deep Learning by Example on Biowulf. A course comprising a series of hands-on biological examples implemented in Keras, one example per class, intended for NIH researchers using Biowulf. Gennady Denisov, 2019-2024
NIH Globus Workshop. Sponsored by NEI, Aug 22-23. [Schedule] [Day 1 recording] [Day 2 recording]
R on Biowulf
(Slides)
(Video)
These case studies will teach users how to use R on the Biowulf cluster. We will focus on 1) migrating R packages from your laptop to the HPC cluster; 2) managing your own R packages vs using system packages; 3) speeding up R scripts with parallel computing. There will be hands-on activities/troubleshooting at the end of the tutorial. Class will be webcast.
Qi Yu and Wolfgang Resch, Nov 2022
Matlab on the Biowulf Cluster
(Slides)
(Video)
An introduction to Matlab on the Biowulf cluster. This course covers: (1) a brief review of the Biowulf cluster; (2) running Matlab interactively; (3) running Matlab scripts as batch jobs using sbatch and swarm; and (4) limits, pitfalls and caveats. This class requires basic knowledge of Matlab and the Linux command line.
Antonio Ulloa, 11 July 2022
Julia for Scientific Computing
(Slides)
(Video)
This is an introductory course on scientific computing in Julia. The class covers: a brief history of Julia, trends in recent Julia usage, a comparison of Julia with other scientific computing languages, an overview of the Julia scientific stack, pros and cons of using Julia, Julia's IDEs, Julia package installation, and interactive Julia script execution on Biowulf.
Antonio Ulloa, 7 April 2021
Overview of Revision Control with Git
(Slides)
Presentation to the Laboratory of Epidemiology & Population Science
Afif Elghraoui, 19 March 2021
Biowulf Overview and Best Practices
(Slides)
Presentation to the NIH Data Science Team
Nitish Narula and Wolfgang Resch, 12 March 2021
Biowulf Overview and Best Practices
(Slides)
(Video)
Presentation to the NHLBI Data Science and Bioinformatics group
Wolfgang Resch, 8 March 2021
Making Effective Use of Storage (PDF)
How to make effective use of the storage systems provided for HPC and avoid common pitfalls that
can slow down your jobs and cause system problems.
Tim Miller, 21 May 2020
Introduction to Linux Containers with Singularity
(Tutorial GitHub repo)
(WebEx recording (Day 1))
(WebEx recording (Day 2))
This class was taught in two 3-hour sessions. Students were provided with access
to disposable virtual machines (Ubuntu 18.04 on GCP). Students learned what a
Linux container is, how it is similar to and differs from a virtual machine, and
a bit about leading container platforms and their intended usage. This was
followed by hands-on instruction on how to install Singularity and use it to run
existing containers from Docker Hub and the Sylabs Container Library. Next,
students learned how to build containers from scratch and publish them so that
others can use them. We finished the class with some advanced examples showing
how to tightly integrate container environments with the environment on the host
system and how to "fake" the installation of an entire application on the host
system through Singularity.
David Godlove, 10-11 March 2020
Python in HPC (PDF)
Profiling, optimizing, and executing Python in high-performance computing environments.
Wolfgang Resch, 20 February 2020
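As a rough illustration of the profiling step this class covers, the sketch below times a deliberately slow function with the standard-library cProfile and pstats modules. It is not taken from the class materials; the function names slow_sum_of_squares and profile_call are hypothetical and chosen only for this example.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately naive loop so the profiler has something to measure."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()

    # Collect the report in memory, sorted by cumulative time, top 5 entries.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    value, report = profile_call(slow_sum_of_squares, 200_000)
    print(value)
    print(report)
```

The report lists call counts and cumulative time per function, which is usually the first place to look before optimizing or requesting more resources for a job.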
Bash Shell Scripting
(PDF) (PPT) (Webex MP4)
The default shell on many Linux systems is bash. Bash shell scripting provides a method for automating common tasks on Linux systems (such as Helix and Biowulf), including transferring and parsing files, creating sbatch and swarm scripts, pipelining tasks, and monitoring jobs. A step-by-step data-driven lesson is available. There is a summary of Linux commands (PDF) and the GNU Bash manual (PDF) available as well.
David Hoover, 17-18 December 2019
Managing Personal Software Installations (PDF)
Scientific software can be challenging to get working. This hands-on workshop will cover private software installations,
dealing with various package managers and build systems, and organizing software in your personal space on Biowulf.
Afif Elghraoui, 06 August 2019
Scientific Python for Matlab users
(PDF)
(Video)
An introduction to scientific Python for those with some experience in Matlab. This course contains a comparison of Python and Matlab as well as an overview of Python's scientific stack (numpy, scipy, matplotlib). Also covered are a brief description of IDEs for Python and how to make use of Jupyter notebooks on the Biowulf cluster.
Antonio Ulloa, 9 July 2019
Data Management Best Practices for Groups
(PDF)
(PPT)
An overview of file storage, access permissions, file transfer, and sociological behaviors to keep your collaborative group functioning.
David Hoover, 24 April 2019
NIH HPC Object Storage System Overview
(PDF)
This course introduces users to the concept of object storage - a new technology being used by many large Internet companies that is becoming increasingly popular for scientific use because of its capability to store large-scale, unstructured data. The class describes the NIH HPC object storage system in detail and includes a practical example of its use in a real scientific workflow.
Tim Miller, 9 Oct 2018
Creating and running software containers with Singularity
(slides and tutorial)
This was a three-hour hands-on workshop on how to use Singularity to create and run containers. Students learned how to install Singularity on a Linux system, create containers with their choice of Linux distribution and software, and use Singularity to run containerized apps.
Afif Elghraoui, 26 July 2018
Building a reproducible workflow with Snakemake and Singularity
(Slides and GitHub repo)
Students attending this class will learn how to build a workflow with Snakemake
and how to make it more reproducible with Singularity containers. This class
will make use of the Biowulf cluster and requires knowledge of the Linux
command line as well as Python.
Wolfgang Resch, 21 February 2018
Singularity Demo (Slides | Demo Readme)
A one-hour introduction and demo of Singularity containers for the NIMH Reproducible Neuroscience Workshop
Using the HPC Systems Storage Effectively
(PDF)
A course that describes the different storage systems available to NIH HPC users along with policies and best practices. It also explains how to avoid storage bottlenecks when running jobs.
Tim Miller, 16 Feb 2017
Parallel MATLAB jobs on Biowulf
(PDF)
(PPT)
(Videos)
Developing MATLAB code for parallel computing, using the MATLAB compiler to deploy license-free code, automating swarm file generation, spawning and monitoring swarms interactively from within the MATLAB environment, and ordering jobs with dependencies to develop an analysis pipeline.
Dave Godlove, 17 Feb 2016
Swarm on the Biowulf Cluster
(PDF)
(PPT)
Swarm is a script designed to simplify submitting a group of commands to the Biowulf cluster. With the shift from PBS to Slurm, the functionalities and idiosyncrasies of swarm have changed.
David Hoover, 22 Sep 2015
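To give a sense of what "a group of commands" looks like in practice, here is a minimal sketch that builds the text of a command file with one independent command per line, which is the layout swarm works from. The input file names, the gzip command, and the function name swarm_file_text are assumptions made for illustration; consult the swarm documentation for the actual submission options.

```python
def swarm_file_text(inputs, command_template):
    """Return command-file text: one independent shell command per line.

    command_template is a str.format template containing an {input} field.
    """
    lines = [command_template.format(input=name) for name in inputs]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical input files and command; substitute your own.
    samples = ["sample1.fastq", "sample2.fastq", "sample3.fastq"]
    print(swarm_file_text(samples, "gzip -k {input}"), end="")
```

Saving the printed text to a file gives something that could then be handed to swarm. Generating the file from a list keeps hundreds of near-identical commands consistent and easy to regenerate.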
Gene Synthesis using DNAWorks
(PPT)
David Hoover, 15 Nov 2006
Helix and Biowulf users will make most effective use of the systems if they are familiar with GNU/Linux.
Below are links to some tutorials which cover the basics of GNU/Linux commands.
Introduction to Linux at the TACC, Texas.
A Basic Unix/Linux Tutorial, at Oxford University, UK.
Command-line crash course
The Linux Command Line, by William Shotts
Learn the Command Line, a web-based GUI tutorial by Codecademy
Ryan's Linux Tutorial