Biowulf High Performance Computing at the NIH
Application updates in the last 3 months
To see all versions available for any application, use module avail application_name
All centrally-installed applications are listed on the Applications page
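As a minimal sketch of the module avail usage mentioned above, the snippet below lists the installed versions of one application and then loads a specific one. The application name and version are taken from the list on this page purely for illustration. The name/version form passed to module load is the usual Environment Modules and Lmod convention and is assumed here, not confirmed by this page.

```shell
# Illustrative values only: taken from the update list on this page.
app="trimgalore"
version="0.6.7"

# The module command exists only where Environment Modules or Lmod is set up.
if type module >/dev/null 2>&1; then
    module avail "$app"           # list every installed version of the application
    module load "$app/$version"   # load one version (name/version form is an assumption)
    module list                   # show which modules are currently loaded
else
    echo "module command not found; run this on a cluster node"
fi
```

Changing app and version is enough to query any other application in the list.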
Updated Application
17 Sep 2021 trimgalore updated to version 0.6.7
Consistent quality and adapter trimming for RRBS or standard FastQ files.
17 Sep 2021 spades updated to version 3.15.3
SPAdes – St. Petersburg genome assembler – is intended for both standard isolates and single-cell MDA bacteria assemblies.
16 Sep 2021 Solar updated to version 8.5.1
SOLAR-Eclipse is an extensive, flexible software package for genetic variance components analysis, including linkage analysis, quantitative genetic analysis, SNP association analysis (QTN and QTLD), and covariate screening.
16 Sep 2021 FSL updated to version 6.0.5
FSL is a comprehensive library of image analysis and statistical tools for FMRI, MRI and DTI brain imaging data.
15 Sep 2021 qsiprep updated to version 0.14.2
qsiprep configures pipelines for processing diffusion-weighted MRI (dMRI) data.
15 Sep 2021 bedops updated to version 2.4.40
BEDOPS is a suite of tools to address common questions raised in genomic studies, mostly with regard to overlap and proximity relationships between data sets. BEDOPS aims to be scalable and flexible, facilitating the efficient and accurate analysis and management of large-scale genomic data.
15 Sep 2021 csvkit updated to version 1.0.6
csvkit is a suite of command-line tools for converting to and working with CSV, the king of tabular file formats.
14 Sep 2021 subread updated to version 2.0.3
High-performance read alignment, quantification and mutation discovery
14 Sep 2021 Schrodinger updated to version 2021.3
A limited number of Schrödinger applications are available on the Biowulf cluster through the Molecular Modeling Interest Group. Most are available through the Maestro GUI.
14 Sep 2021 bpipe updated to version 0.9.11
Bpipe provides a platform for running big bioinformatics jobs
13 Sep 2021 shapeit updated to version 4.2.2
SHAPEIT is a fast and accurate haplotype inference software
13 Sep 2021 RELION updated to version 3.1.3
RELION (for REgularised LIkelihood OptimisatioN) is a stand-alone computer program for Maximum A Posteriori refinement of (multiple) 3D reconstructions or 2D class averages in cryo-electron microscopy.
13 Sep 2021 pbipa updated to version 1.3.1
Improved Phased Assembler (IPA) is the official PacBio software for HiFi genome assembly. IPA was designed to utilize the accuracy of PacBio HiFi reads to produce high-quality phased genome assemblies.
13 Sep 2021 SimNIBS updated to version 3.2.4
SimNIBS is a free software package for the Simulation of Non-invasive Brain Stimulation. It allows for realistic calculations of the electric field induced by transcranial magnetic stimulation (TMS) and transcranial direct current stimulation (tDCS).
9 Sep 2021 3DChromatin_ReplicateQC updated to version 51a7afb5
Measures the quality and reproducibility of 3D genome data.
9 Sep 2021 Bartender updated to version 1.1
Bartender is an accurate clustering algorithm to detect barcodes and their abundances from raw next-generation sequencing data. In contrast with existing methods that cluster based on sequence similarity alone, Bartender uses a modified two-sample proportion test that also considers cluster size. This modification results in higher accuracy and lower rates of under- and over-clustering artifacts.
8 Sep 2021 fmriprep updated to version 21.0.0rc0
A Robust Preprocessing Pipeline for fMRI Data
1 Sep 2021 bwa-mem2 updated to version 2.2.1
The next version of the bwa-mem algorithm in bwa.
31 Aug 2021 drep updated to version 3.2.2
dRep is a Python program for rapidly comparing large numbers of genomes. dRep can also "de-replicate" a genome set by identifying groups of highly similar genomes and choosing the best representative genome for each genome set.
30 Aug 2021 GIGI updated to version 1.05
GIGI (Genotype Imputation Given Inheritance) implements an approach that enables computationally efficient imputation in large pedigrees. It samples inheritance vectors (IVs) from a Markov Chain Monte Carlo sampler by conditioning on genotypes from a sparse set of framework markers. Missing genotypes are probabilistically inferred from these IVs along with observed dense genotypes that are available on a subset of subjects.
26 Aug 2021 GEMMA updated to version 0.98.5
GEMMA is the software implementing the Genome-wide Efficient Mixed Model Association algorithm for a standard linear mixed model and some of its close relatives for genome-wide association studies (GWAS).
25 Aug 2021 ncbi-toolkit updated to version 25.0.0
The NCBI C++ Toolkit is a set of executables and libraries for a multitude of sequence analysis functions.
24 Aug 2021 genometools updated to version 1.6.2
collection of bioinformatic tools
20 Aug 2021 vartrix updated to version 1.1.22
VarTrix is a software tool for extracting single cell variant information from 10x Genomics single cell data.
19 Aug 2021 CHARMM updated to version c46b1
CHARMM is a general and flexible software application for modeling the structure and behavior of molecular systems.
19 Aug 2021 git updated to version 2.33.0
Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
17 Aug 2021 campari updated to version 4.0
Campari is a molecular simulations application that supports Monte Carlo and molecular dynamics simulations of biomolecules. It has robust support for a number of different methods, including in silico docking.
17 Aug 2021 ARDISS updated to version 0.1.3
ARDISS is a method to impute missing summary statistics in mixed-ethnicity cohorts through Gaussian Process Regression and automatic relevance determination. ARDISS is trained on an external reference panel and does not require information about allele frequencies of genotypes from the original study.
17 Aug 2021 sratoolkit updated to version 2.11.1
The NCBI SRA Toolkit enables reading ("dumping") of sequencing files from the SRA database and writing ("loading") files into the .sra format.
17 Aug 2021 ncbi-vdb updated to version 2.11.1
The SRA Toolkit and SDK from NCBI is a collection of tools and libraries for using data in the INSDC Sequence Read Archives.
17 Aug 2021 ncbi-ngs updated to version 2.11.1
NCBI's NGS is a new, domain-specific API for accessing reads, alignments and pileups produced from Next Generation Sequencing
17 Aug 2021 nibabies updated to version 0.1.2
Preprocessing pipeline for neonate and infant MRI.
17 Aug 2021 bamtools updated to version 2.5.2
BamTools provides a fast, flexible C++ API & toolkit for reading, writing, and manipulating BAM files.
17 Aug 2021 picard updated to version 2.25.7
Picard comprises Java-based command-line utilities that manipulate SAM files, and a Java API (SAM-JDK) for creating new programs that read and write SAM files. Both SAM text format and SAM binary (BAM) format are supported.
17 Aug 2021 stringtie updated to version 2.1.7b
StringTie is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts. It is primarily a genome-guided transcriptome assembler, although it can borrow algorithmic techniques from de novo genome assembly to help with transcript assembly.
12 Aug 2021 rgt updated to version 0.13.2
Regulatory Genomics Toolbox: Python library and set of tools for the integrative analysis of high throughput regulatory genomics data. http://www.regulatory-genomics.org
10 Aug 2021 circaidme updated to version 0.1.0
circaidme is a tool designed to analyze data generated with CircAID-p-seq for Oxford Nanopore Technologies. It trims the known adapter sequences used by the CircAID-p-seq kit from nanopore reads.
10 Aug 2021 minimap2 updated to version 2.22
Minimap2 is a fast sequence mapping and alignment program that can find overlaps between long noisy reads, or map long reads or their assemblies to a reference genome optionally with detailed alignment (i.e. CIGAR).
9 Aug 2021 MySQL updated to version 8.0.26
MySQL is an open-source relational database management system.
5 Aug 2021 Genome Browser updated to version 418
The Genome Browser Mirror Fragments is a mirror of the UCSC Genome Browser. The URL is https://hpcnihapps.cit.nih.gov/genome. Users can also directly access the MySQL databases, the supporting files, and a large number of associated executables.
5 Aug 2021 PartekFlow updated to version 10.0.21.0801
Web interface designed specifically for the analysis needs of next generation sequencing applications including RNA, small RNA, and DNA sequencing.
4 Aug 2021 distiller-nf updated to version 0.3.3
A modular Hi-C mapping pipeline for reproducible data analysis. It has also been used for Micro-C analysis.
4 Aug 2021 CADD updated to version 1.6.post1
CADD (Combined Annotation Dependent Depletion) is a tool for scoring the deleteriousness of single nucleotide variants as well as insertion/deletion variants in the human genome. Currently, it supports the builds GRCh37/hg19 and GRCh38/hg38.
3 Aug 2021 Dali updated to version 5.1
The three-dimensional co-ordinates of each protein are used to calculate residue - residue distance matrices.
2 Aug 2021 sambamba updated to version 0.8.1
Sambamba is a high-performance, robust, and fast tool (and library), written in the D programming language, for working with SAM and BAM files. Its current parallelised functionality is an important subset of samtools functionality, including view, index, sort, markdup, and depth.
2 Aug 2021 GATK updated to version 4.2.1.0
GATK, from the Broad Institute, is first a structured software library that makes it easy to write efficient analysis tools for next-generation sequencing data, and second a suite of tools for working with human medical resequencing projects such as 1000 Genomes and The Cancer Genome Atlas. These tools include depth of coverage analyzers, a quality score recalibrator, a SNP/indel caller and a local realigner.
30 Jul 2021 azcopy updated to version 10.11.0
a command-line utility to copy blobs or files to or from Azure storage
29 Jul 2021 EM-GAN updated to version 20210719
EM-GAN is a tool for post-processing Electron Microscopy maps. It uses Generative Adversarial Networks to improve the resolution of the maps.
29 Jul 2021 mixer updated to version 1.3
MiXeR is a Causal Mixture Model for GWAS summary statistics. Version 1.3 contains a Python port of MiXeR that wraps the C/C++ core. The data preprocessing code sumstats.py is also included.
26 Jul 2021 fragpipe updated to version 16.0
FragPipe is a Java Graphical User Interface (GUI) for a suite of computational tools enabling comprehensive analysis of mass spectrometry-based proteomics data. It is powered by MSFragger.
26 Jul 2021 DeepMM updated to version 20210722
DeepMM implements a fully automated de novo structure modeling method, MAINMAST, which builds three-dimensional models of a protein from a near-atomic resolution EM map. The method directly traces the protein’s main-chain and identifies Cα positions as tree-graph structures in the EM map.
23 Jul 2021 CutRunTools2 updated to version 2.0.0
CutRunTools2 is a major update of CutRunTools, including a set of new features specially designed for CUT&RUN and CUT&Tag experiments. Both bulk and single-cell data can be processed, analyzed and interpreted.
23 Jul 2021 parallel updated to version 20210722
GNU parallel is a shell tool for executing jobs in parallel using one or more computers.
23 Jul 2021 RoseTTAFold updated to version 1.0.0
Accurate prediction of protein structures and interactions using a 3-track network, in which information at the 1D sequence level, the 2D distance map level, and the 3D coordinate level is successively transformed and integrated.
21 Jul 2021 aslprep updated to version 0.2.8-beta
Aslprep is an application for preprocessing of ASL (arterial spin labeling) data and computation of CBF (cerebral blood flow). Aslprep is a pipeline that uses AFNI, FSL, ANTs, and freesurfer.
21 Jul 2021 xcpengine updated to version 1.2.4
xcpEngine performs denoising and estimation of functional connectivity on fMRI datasets.
20 Jul 2021 gtex_rnaseq updated to version V8
This module makes available the tools used in the GTEx RNA-Seq pipeline.
20 Jul 2021 PyRosetta updated to version 289.py3.7
PyRosetta is an interactive Python-based interface to the powerful Rosetta molecular modeling suite. It enables users to design their own custom molecular modeling algorithms using Rosetta sampling methods and energy functions.
20 Jul 2021 cnvkit updated to version 0.9.9
Copy number variant detection from targeted DNA sequencing
19 Jul 2021 alphafold2 updated to version 2.0.0-1-gd26287e
This package provides an implementation of the protein structure inference pipeline of AlphaFold v2.0.
19 Jul 2021 novocraft updated to version 4.03.03
The package includes an aligner for single-ended and paired-end reads from the Illumina Genome Analyser. Novoalign finds global optimum alignments using the full Needleman-Wunsch algorithm with affine gap penalties.
19 Jul 2021 dyno updated to version 20210709
dyno is a meta package that installs several other packages from the dynverse (https://github.com/dynverse). It comprises a set of R packages to construct and interpret single-cell trajectories.
15 Jul 2021 dcm2niix updated to version 1.0.20210317
DICOM to NIfTI converter
15 Jul 2021 pigz updated to version 2.6
pigz (parallel implementation of gzip) is a fully functional replacement for gzip that exploits multiple processors and multiple cores to the hilt when compressing data.
14 Jul 2021 snakemake updated to version 6.5.3
Snakemake aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern domain specific specification language (DSL) in python style. It is well suited for bioinformatic workflows.
14 Jul 2021 samtools updated to version 1.13
The samtools package now provides samtools, bcftools, tabix, and the underlying htslib library.
14 Jul 2021 uropa updated to version 4.0.2
UROPA is a command line based tool for genomic region annotation
13 Jul 2021 whatshap updated to version 1.1
WhatsHap is software for phasing genomic variants using DNA sequencing reads, also called read-based phasing or haplotype assembly. It is especially suitable for long reads, but also works well with short reads.
8 Jul 2021 Rstudio updated to version 1.4.1717
RStudio is a set of integrated tools designed to help you be more productive with R. It includes a console, syntax-highlighting editor that supports direct code execution, as well as tools for plotting, history, debugging and workspace management.
8 Jul 2021 IGVTools updated to version 2.10.0
IGVTools provides utilities for working with ASCII file formats used by the Integrative Genomics Viewer. The files can be sorted, tiled, indexed, and counted.
8 Jul 2021 IGV updated to version 2.10.0
The Integrative Genomics Viewer is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets.
8 Jul 2021 fitlins updated to version 0.9.1
Fitlins fits linear models to BIDS neuroimaging datasets.
8 Jul 2021 hap.py updated to version 0.3.14
A set of programs based on htslib to benchmark variant calls against gold standard truth datasets.
7 Jul 2021 spaceranger updated to version 1.3.0
10x pipeline for processing Visium spatial RNA-seq data
7 Jul 2021 Mathematica updated to version 12.3.1
Mathematica is an interactive system for doing mathematical computation. It performs numerical, symbolic and graphical computations, and incorporates a high-level programming language.
7 Jul 2021 PyPy updated to version 3.7-7.3.5
PyPy is a fast, compliant alternative implementation of the Python language.
7 Jul 2021 multiqc updated to version 1.11
Aggregates results from various frequently used bioinformatics tools across multiple samples into a single visual report.
6 Jul 2021 connectome-workbench updated to version 1.5.0
Tools to browse, download, explore, and analyze data from the Human Connectome Project (HCP). Allows users to compare their own data to that of the HCP.
30 Jun 2021 R updated to version 4.1.0
R (the R Project) is a language and environment for statistical computing and graphics. R is similar to S, and provides a wide variety of statistical and graphical techniques (linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering, ...).
30 Jun 2021 nodejs updated to version 14.17.1
Node.js is a JavaScript runtime built on Chrome's V8 JavaScript engine. module name: nodejs
29 Jun 2021 ABC updated to version 0.2.2
The Activity-by-Contact (ABC) model predicts which enhancers regulate which genes on a cell type specific basis.
29 Jun 2021 cluster3 updated to version 1.59
cluster3 is a multipurpose open-source library of C routines, callable from other C and C++ programs. It implements k-means clustering, hierarchical clustering and self-organizing maps and provides several unique analytical approaches. In addition, it includes a Python and a Perl interface to the C Clustering Library, thereby combining the flexibility of a scripting language with the speed of C.
28 Jun 2021 MUMmer updated to version 4.0.0rc1
MUMmer is a system for aligning entire genomes extremely rapidly.
28 Jun 2021 QIIME updated to version 2-2021.4
QIIME is an open source software package for comparison and analysis of microbial communities, primarily based on high-throughput amplicon sequencing data (such as SSU rRNA) generated on a variety of platforms, but also supporting analysis of other types of data (such as shotgun metagenomic data).
28 Jun 2021 shellcheck updated to version 0.7.2
A shell script static analysis tool
24 Jun 2021 MD-TASK updated to version 1.0.0
MD-TASK is a software suite for molecular dynamics (MD). It employs graph theory techniques, perturbation response scanning, and dynamic cross-correlation to provide unique ways for analyzing MD trajectories.
23 Jun 2021 AdmixTools updated to version 7.0.2
ADMIXTOOLS is a software package that supports formal tests of whether admixture occurred, and makes it possible to infer admixture proportions and dates.
22 Jun 2021 Matlab updated to version 2021a
MATLAB is an interactive software package for scientific and engineering numeric computation. MATLAB integrates numerical analysis, matrix computation, signal processing, and graphics in an environment where problems and solutions are expressed just as they are written mathematically.
21 Jun 2021 pandoc updated to version 2.14.0.2
Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library.
Scientific Databases updated in the last 3 months
For a full list of scientific databases available on the NIH HPC systems, see this page

Updated      Database                            Format    Location
14 Sep 2021  NCBI nt                             Fasta     /fdb/fastadb/nt.fas
14 Sep 2021  NCBI nr                             Fasta     /fdb/fastadb/nr.aa.fas
14 Sep 2021  SwissProt                           Fasta     /fdb/fastadb/swissprot.aa.fas
14 Sep 2021  Protein Data Bank                   Fasta     /fdb/fastadb/pdb.aa.fas
14 Sep 2021  NCBI Taxonomy                       taxonomy  /fdb/taxonomy
11 Sep 2021  NCBI nt                             Blast     /fdb/blastdb/nt
11 Sep 2021  NCBI nr                             Blast     /fdb/blastdb/nr
11 Sep 2021  SwissProt                           Blast     /fdb/blastdb/swissprot
11 Sep 2021  Protein Data Bank                   Blast     /fdb/blastdb/pdbaa
10 Sep 2021  1000 Genomes                        VCF       /fdb/1000genomes/
05 Sep 2021  Protein Data Bank                   Blast     /fdb/blastdb/pdbnt
26 Aug 2021  dbNSFP - Human Genome hg19          dbNSFP    /fdb/dbNSFP/
24 Aug 2021  ZINC20                              Mol2      /fdb/zinc20
22 Aug 2021  Mouse Genome (Mus musculus) mm8     MySQL     NIH mirror of UCSC Genome Browser
08 Aug 2021  Rat Genome (Rattus norvegicus) rn4  MySQL     NIH mirror of UCSC Genome Browser
08 Aug 2021  Human Genome hg18                   MySQL     NIH mirror of UCSC Genome Browser
08 Aug 2021  Dog Genome (Canis familiaris)       MySQL     NIH mirror of UCSC Genome Browser
28 Jul 2021  ANNOVAR                             ANNOVAR   /fdb/annovar/current