Biowulf High Performance Computing at the NIH
Application updates in the last 3 months
To see all versions available for any application, use module avail application_name
All centrally-installed applications are listed on the Applications page
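As a minimal sketch of the `module avail` tip above, the snippet below lists the installed versions of one application and then loads a specific version. It assumes the standard `module avail`, `module load name/version`, and `module list` subcommands of Environment Modules or Lmod. The `module_spec` helper is a hypothetical convenience function, not part of the Biowulf tooling, and the `nodejs` name and version are taken from the first entry in the list below.

```shell
#!/usr/bin/env bash
# Sketch only: module_spec is a hypothetical helper, not a Biowulf command.

app="nodejs"        # module name from the update list on this page
version="16.13.2"   # version announced on 16 Jan 2022

# Build the "name/version" string that `module load` accepts.
module_spec() {
    printf '%s/%s' "$1" "$2"
}

if command -v module >/dev/null 2>&1; then
    module avail "$app"                              # list all installed versions
    module load "$(module_spec "$app" "$version")"   # load one specific version
    module list                                      # show what is currently loaded
else
    # Outside a cluster login session the module command is usually absent.
    echo "module command not found; run this on a system with Environment Modules" >&2
fi
```

Loading an explicit `name/version` rather than the bare name keeps a batch script reproducible when the default version changes after an update.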
Updated Application
16 Jan 2022 nodejs updated to version 16.13.2
Node.js is a JavaScript runtime built on Chrome's V8 JavaScript engine. module name: nodejs
15 Jan 2022 minimap2 updated to version 2.24
Minimap2 is a fast sequence mapping and alignment program that can find overlaps between long noisy reads, or map long reads or their assemblies to a reference genome optionally with detailed alignment (i.e. CIGAR).
15 Jan 2022 primer3 updated to version 2.6.0
Primer3 is a program for designing PCR primers, hybridization probes, and sequencing primers.
14 Jan 2022 diamond updated to version 2.0.14
DIAMOND is a new high-throughput program for aligning DNA reads or protein sequences against a protein reference database such as NR, at up to 20,000 times the speed of BLAST, with high sensitivity.
14 Jan 2022 wget updated to version 1.21
GNU Wget is a free software package for retrieving files using HTTP, HTTPS, FTP and FTPS.
14 Jan 2022 MUSCLE updated to version 5.0.148
Fast Multiple Sequence Alignment program.
13 Jan 2022 mdtraj updated to version 1.9.7
MDTraj is a python library that allows users to manipulate molecular dynamics (MD) trajectories and perform a variety of analyses, including fast RMSD, solvent accessible surface area, hydrogen bonding, etc.
13 Jan 2022 picrust updated to version 2.4.2
PICRUSt is a bioinformatics software package designed to predict metagenome functional content from marker gene (e.g., 16S rRNA) surveys and full genomes.
11 Jan 2022 Huygens updated to version 21.10.0-p1
Huygens is a software package for image restoration, deconvolution, resolution improvement, and noise reduction. It can process images from all current optical microscopes, including wide-field, confocal, Nipkow (scanning disk confocal), multiple-photon, and 4Pi microscopes.
11 Jan 2022 Comsol updated to version 6.0.0.318
The COMSOL Multiphysics engineering simulation software environment facilitates all steps in the modeling process − defining your geometry, meshing, specifying your physics, solving, and then visualizing your results.
11 Jan 2022 PartekFlow updated to version 10.0.22.0102
Web interface designed specifically for the analysis needs of next generation sequencing applications including RNA, small RNA, and DNA sequencing.
7 Jan 2022 guppy updated to version 6.0.1
Local accelerated basecalling for Nanopore data
7 Jan 2022 MIPAV updated to version 11.0.1
The MIPAV (Medical Image Processing, Analysis, and Visualization) application enables quantitative analysis and visualization of medical images of numerous modalities such as PET, MRI, CT, or microscopy.
6 Jan 2022 MEGA updated to version 11.0.10
MEGA, Molecular Evolutionary Genetics Analysis, is a software suite for analyzing DNA and protein sequence data from species and populations
5 Jan 2022 GATK updated to version 4.2.4.1
GATK, from the Broad Institute, is first a structured software library that makes it easy to write efficient analysis tools for next-generation sequencing data, and second a suite of tools for working with human medical resequencing projects such as 1000 Genomes and The Cancer Genome Atlas. These tools include depth of coverage analyzers, a quality score recalibrator, a SNP/indel caller, and a local realigner.
5 Jan 2022 sb_cli updated to version 0.18.4
Use the Seven Bridges Command Line Interface (SB CLI) to programmatically access and automate your interaction with the Platform via the API. The CLI is called by a simple command: sb.
4 Jan 2022 vim updated to version 8.2
Vim is a text editor that is upwards compatible with Vi. It can be used to edit all kinds of plain text. It is especially useful for editing programs with syntactical coloring.
4 Jan 2022 combp updated to version 0.50.6
A library to combine, analyze, group and correct p-values in BED files. Unique tools involve correction for spatial autocorrelation. This is useful for ChIP-Seq probes and Tiling arrays, or any data with spatial correlation.
4 Jan 2022 SimNIBS updated to version 3.2.5
SimNIBS is a free software package for the Simulation of Non-invasive Brain Stimulation. It allows for realistic calculations of the electric field induced by transcranial magnetic stimulation (TMS) and transcranial direct current stimulation (tDCS).
23 Dec 2021 fmriprep updated to version 21.0.0
A Robust Preprocessing Pipeline for fMRI Data
21 Dec 2021 Hail updated to version 0.2.81
Hail is an open-source, scalable framework for exploring and analyzing genomic data.
20 Dec 2021 picard updated to version 2.26.9
Picard comprises Java-based command-line utilities that manipulate SAM files, and a Java API (SAM-JDK) for creating new programs that read and write SAM files. Both SAM text format and SAM binary (BAM) format are supported.
20 Dec 2021 CCP4 updated to version 7.1.018
CCP4 is a suite of programs for protein crystallography and structural biology.
20 Dec 2021 Chimera updated to version 1.16.0
Chimera is a highly extensible program for interactive visualization and analysis of molecular structures and related data, including density maps, supramolecular assemblies, sequence alignments, docking results, trajectories, and conformational ensembles.
20 Dec 2021 snpEff updated to version 5.0e
snpEff is a variant annotation and effect prediction tool. It annotates and predicts the effects of variants on genes (such as amino acid changes).
20 Dec 2021 Madeline2 updated to version 2.0
The Madeline 2.0 Pedigree Drawing Engine (PDE) is a pedigree drawing program for use in linkage and family-based association studies. The program is designed to handle large and complex pedigrees with an emphasis on readability and aesthetics.
17 Dec 2021 fanc updated to version 0.9.21
FAN-C is a toolkit for the analysis and visualization of Hi-C data. Beyond objects generated within FAN-C, the toolkit is largely compatible with Hi-C files from Cooler and Juicer.
17 Dec 2021 Beagle updated to version 5.2_28Jun21
Beagle is a package for imputing genotypes, inferring haplotype phase, and performing genetic association analysis. BEAGLE is designed to analyze large-scale data sets with hundreds of thousands of markers genotyped on thousands of samples.
16 Dec 2021 Perl updated to version 5.34.0
Perl is a highly capable, feature-rich programming language with over 23 years of development.
16 Dec 2021 Genome Browser updated to version 424
The Genome Browser Mirror Fragments is a mirror of the UCSC Genome Browser. The URL is https://hpcnihapps.cit.nih.gov/genome. Users can also directly access the MySQL databases, the supporting files, and a large number of associated executables.
15 Dec 2021 singularity updated to version 3.8.5-1
Singularity is a container platform focused on supporting "Mobility of Compute". It allows users to emulate and share custom Linux environments, allowing for the creation of self-contained development stacks.
15 Dec 2021 nextflow updated to version 21.10.5
Data-driven computational pipelines
15 Dec 2021 deepconsensus updated to version 0.1.0
DeepConsensus uses gap-aware sequence transformers to correct errors in Pacific Biosciences (PacBio) Circular Consensus Sequencing (CCS) data
15 Dec 2021 golang updated to version 1.17.5
The Go programming language
15 Dec 2021 VEP updated to version 105.0
VEP (Variant Effect Predictor) determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.
14 Dec 2021 pandoc updated to version 2.16.2
Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library.
14 Dec 2021 git updated to version 2.34.1
Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
13 Dec 2021 scvelo updated to version 0.2.3
scVelo is a method to describe the rate of gene expression change for an individual gene at a given time point based on the ratio of its spliced and unspliced messenger RNA (mRNA). It avoids errors in the velocity estimates by solving the full transcriptional dynamics of splicing kinetics using a likelihood-based dynamical model. This generalizes RNA velocity to systems with transient cell states, which are common in development and in response to perturbations.
8 Dec 2021 Freesurfer updated to version 7.2.0
Freesurfer is a set of automated tools for reconstruction of the brain's cortical surface from structural MRI data, and overlay of functional MRI data onto the reconstructed surface.
7 Dec 2021 nibabies updated to version 21.0.2
Preprocessing pipeline for neonate and infant MRI.
2 Dec 2021 posefilter updated to version 619eca51
PoseFilter is a PyMOL plugin and assists in the analysis of docked ligands through identification of unique oligomeric poses by utilizing RMSD and interaction fingerprint analysis methods.
2 Dec 2021 pydockrmsd updated to version 0.0.30
DockRMSD is capable of deterministically identifying the minimum symmetry-corrected RMSD and is able to do so without significant loss of computational efficiency compared to other methods.
2 Dec 2021 novactf updated to version 1.0
NovaCTF is a freeware for 3D-CTF correction for electron microscopy.
2 Dec 2021 viennarna updated to version 2.5.0
RNA Secondary Structure Prediction and Comparison
1 Dec 2021 cellranger updated to version 6.1.2
Cell Ranger is a set of analysis pipelines that processes Chromium single cell 3’ RNA-seq output to align reads, generate gene-cell matrices and perform clustering and gene expression analysis.
1 Dec 2021 ANNOVAR updated to version 2020-06-08
ANNOVAR is an efficient software tool that uses up-to-date information to functionally annotate genetic variants detected from diverse genomes.
30 Nov 2021 salmon updated to version 1.5.2
Salmon is a tool for quantifying the expression of transcripts using RNA-seq data.
30 Nov 2021 exomiser updated to version 13.0.1
The Exomiser is a Java program that functionally annotates variants from whole-exome sequencing data starting from a VCF file.
29 Nov 2021 gsea updated to version 4.1.0
Gene Set Enrichment Analysis (GSEA) is a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states (e.g. phenotypes).
29 Nov 2021 boost updated to version 1.77
Boost provides free peer-reviewed portable C++ source libraries. Boost libraries are intended to be widely useful, and usable across a broad spectrum of applications.
29 Nov 2021 ctk updated to version 1.1.4
The CLIP Tool Kit (CTK) is a software package that provides a set of tools for analysis of CLIP data starting from the raw reads generated by the sequencer.
29 Nov 2021 glew updated to version 2.2.0
The OpenGL Extension Wrangler Library (GLEW) is a cross-platform open-source C/C++ extension loading library. GLEW provides efficient run-time mechanisms for determining which OpenGL extensions are supported on the target platform.
24 Nov 2021 SynthDNM updated to version 0.1.3
SynthDNM is a random-forest based classifier that can be readily adapted to new sequencing or variant-calling pipelines by applying a flexible approach to constructing simulated training examples from real data. The optimized SynthDNM classifiers predict de novo SNPs and indels with robust accuracy across multiple methods of variant calling.
23 Nov 2021 subtom updated to version 1.1.6-32f731b
Subtom is a pipeline for subvolume alignment and averaging of electron cryo-tomography data.
23 Nov 2021 parallel updated to version 20211122
GNU parallel is a shell tool for executing jobs in parallel using one or more computers.
22 Nov 2021 htseq updated to version 1.99.2
HTSeq is a Python package that provides infrastructure to process data from high-throughput sequencing assays.
22 Nov 2021 patchelf updated to version 0.13
patchelf is a small utility to modify the dynamic linker and RPATH of ELF executables.
22 Nov 2021 EPACTS updated to version 3.4.2
EPACTS (Efficient and Parallelizable Association Container Toolbox) is a versatile software pipeline to perform various statistical tests for identifying genome-wide association from sequence data through a user-friendly interface, both to scientific analysts and to method developers.
19 Nov 2021 stripenn updated to version 1.1.50
Stripenn is a command-line Python package developed for detection of architectural stripes from chromatin conformation capture (3C) data. It implements an algorithm rooted in computer vision for demarcation and quantification of the architectural stripes. Stripenn was demonstrated to outperform existing methods, to be applicable to the analysis of B and T lymphocytes, and to allow examination of the role of sequence variation on the architectural stripes by studying the conservation of these features in inbred strains of mice.
19 Nov 2021 DeepCAD updated to version 20210826
DeepCAD is a self-supervised deep-learning method for spatiotemporal enhancement of calcium imaging data that does not require any high signal-to-noise ratio (SNR) observations. DeepCAD suppresses detection noise and improves the SNR more than tenfold, which reinforces the accuracy of neuron extraction and spike inference and facilitates the functional analysis of neural circuits.
18 Nov 2021 gridss updated to version 2.12.2
GRIDSS is a modular software suite containing tools useful for the detection of genomic rearrangements. GRIDSS includes a genome-wide break-end assembler, as well as a structural variation caller for Illumina sequencing data. GRIDSS calls variants based on alignment-guided positional de Bruijn graph genome-wide break-end assembly, split read, and read pair evidence.
17 Nov 2021 vcfanno updated to version 0.3.3
annotate a VCF with other VCFs/BEDs/tabixed files
16 Nov 2021 seqkit updated to version 2.1.0
A cross-platform toolkit for FASTA/Q file manipulation
16 Nov 2021 king updated to version 2.2.7
KING is a toolset to explore genotype data from a genome-wide association study (GWAS) or a sequencing project. KING can be used to check family relationship and flag pedigree errors by estimating kinship coefficients and inferring IBD segments for all pairwise relationships.
15 Nov 2021 alphafold2 updated to version 2.1.1
This package provides an implementation of the protein structure inference pipeline of AlphaFold v2.0.
15 Nov 2021 philosopher updated to version 4.1.0
Philosopher is fast, easy-to-use, scalable, and versatile data analysis software for mass spectrometry-based proteomics. Philosopher is dependency-free and can analyze both traditional database searches and open searches for post-translational modification (PTM) discovery.
15 Nov 2021 msfragger updated to version 3.4
An ultrafast database search tool for peptide identification in mass spectrometry-based proteomics.
15 Nov 2021 fragpipe updated to version 17.0
FragPipe is a Java Graphical User Interface (GUI) for a suite of computational tools enabling comprehensive analysis of mass spectrometry-based proteomics data. It is powered by MSFragger.
15 Nov 2021 abyss updated to version 2.3.2
ABySS stands for Assembly By Short Sequences, a de novo, parallel, paired-end sequence assembler. The parallel version is implemented using MPI and is capable of assembling larger genomes.
15 Nov 2021 plink updated to version 2.3-alpha
PLINK is a whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.
10 Nov 2021 bcl-convert updated to version 3.9.3
The Illumina BCL Convert is a standalone local software app that converts the Binary Base Call (BCL) files produced by Illumina sequencing systems to FASTQ files. BCL Convert also provides adapter handling (through masking and trimming) and UMI trimming and produces metric outputs.
10 Nov 2021 metaphlan updated to version 3.0.13
MetaPhlAn is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea, Eukaryotes and Viruses) from metagenomic shotgun sequencing data (i.e. not 16S) with species-level resolution. With the newly added StrainPhlAn module, it is now possible to perform accurate strain-level microbial profiling.
9 Nov 2021 fastq_screen updated to version 0.15.0
FastQ Screen allows you to screen a library of sequences in FastQ format against a set of sequence databases so you can see if the composition of the library matches with what you expect.
9 Nov 2021 globus-cli updated to version 3.1.3
Globus command line interface
9 Nov 2021 shellcheck updated to version 0.8.0
A shell script static analysis tool
8 Nov 2021 humann updated to version 3.0.0
HUMAnN is a pipeline for efficiently and accurately profiling the presence/absence and abundance of microbial pathways in a community from metagenomic or metatranscriptomic sequencing data (typically millions of short DNA/RNA reads).
8 Nov 2021 spaceranger updated to version 1.3.1
10x pipeline for processing Visium spatial RNA-seq data
4 Nov 2021 RoseTTAFold updated to version 1.1.0
Accurate prediction of protein structures and interactions using a 3-track network, in which information at the 1D sequence level, the 2D distance map level, and the 3D coordinate level is successively transformed and integrated.
2 Nov 2021 MoChA updated to version 1.10.2; 1.11
MoChA is a bcftools extension to call mosaic chromosomal alterations starting from phased VCF files with either B Allele Frequency (BAF) and Log R Ratio (LRR) or allelic depth (AD)
2 Nov 2021 swig updated to version 4.0.2
SWIG is a software development tool that connects programs written in C and C++ with a variety of high-level programming languages.
1 Nov 2021 mosdepth updated to version 0.3.2
Fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing.
1 Nov 2021 telescope updated to version 1.0.3.1
Single locus resolution of Transposable ELEment expression. Telescope estimates transposable element expression (retrotranscriptome) resolved to specific genomic locations. It directly addresses uncertainty in fragment assignment by reassigning ambiguously mapped fragments to the most probable source transcript as determined within a Bayesian statistical model.
26 Oct 2021 hicpro updated to version 3.1.0
HiC-Pro: An optimized and flexible pipeline for Hi-C data processing
25 Oct 2021 ChimeraX updated to version 1.2.5
UCSF ChimeraX (or simply ChimeraX) is the next-generation molecular visualization program from the Resource for Biocomputing, Visualization, and Informatics (RBVI), following UCSF Chimera.
25 Oct 2021 trust4 updated to version 1.0.5.1
Tcr Receptor Utilities for Solid Tissue (TRUST) is a computational tool to analyze TCR and BCR sequences using unselected RNA sequencing data, profiled from solid tissues, including tumors. TRUST4 performs de novo assembly on V, J, C genes including the hypervariable complementarity-determining region 3 (CDR3) and reports consensus of BCR/TCR sequences. TRUST4 then realigns the contigs to IMGT reference gene sequences to report the corresponding information. TRUST4 supports both single-end and paired-end sequencing data with any read length.
22 Oct 2021 DNAnexus updated to version 0.314.0
DNAnexus is a cloud-based commercial solution for next-generation sequence analysis and visualization. It has a command-line interface (CLI) which can be used to log in to the DNAnexus platform, upload and navigate data, and launch analyses.
20 Oct 2021 HPhi updated to version 3.5.0
HPhi is a numerical solver package for a wide range of quantum lattice models including Hubbard-type itinerant electron Hamiltonians, quantum spin models, and Kondo-type Hamiltonians for itinerant electrons coupled with quantum spins. The Lanczos algorithm for finding ground states and a newly developed Lanczos-based algorithm for finite-temperature properties of these models are implemented for parallel computing.
20 Oct 2021 fastp updated to version 0.23.1
A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance.
Scientific Databases updated in last 3 months
For a full list of scientific databases available on the NIH HPC systems, see this page

Updated      Database                            Format    Location
17 Jan 2022  NCBI Taxonomy                       taxonomy  /fdb/taxonomy
13 Jan 2022  NCBI nr                             Blast     /fdb/blastdb/nr
11 Jan 2022  NCBI nt                             Fasta     /fdb/fastadb/nt.fas
11 Jan 2022  NCBI nr                             Fasta     /fdb/fastadb/nr.fas
11 Jan 2022  SwissProt                           Fasta     /fdb/fastadb/swissprot.aa.fas
11 Jan 2022  Protein Data Bank                   Fasta     /fdb/fastadb/pdb.aa.fas
08 Jan 2022  Protein Data Bank                   Blast     /fdb/blastdb/pdbnt
07 Jan 2022  NCBI nt                             Blast     /fdb/blastdb/nt
05 Jan 2022  SwissProt                           Blast     /fdb/blastdb/swissprot
05 Jan 2022  Protein Data Bank                   Blast     /fdb/blastdb/pdbaa
02 Jan 2022  Rat Genome (Rattus norvegicus) rn4  MySQL     NIH mirror of UCSC Genome Browser
02 Jan 2022  Dog Genome (Canis familiaris)       MySQL     NIH mirror of UCSC genome browser
15 Dec 2021  Cat genome (Felis Catus) 9.0        Fasta     /fdb/ensembl/pub/release-96/fasta/felis_catus
15 Dec 2021  COSMIC                              VCF       /fdb/COSMIC
01 Dec 2021  ANNOVAR                             ANNOVAR   /fdb/annovar/current
04 Nov 2021  1000 Genomes                        minimac   /fdb/minimac
02 Nov 2021  PFAM                                PFAM      /fdb/fastadb/pfam
02 Nov 2021  RepBase                             RepBase   /fdb/repbase