Biowulf High Performance Computing at the NIH
Application updates in the last 3 months
To see all versions available for any application, use 'module avail application_name'.
All centrally-installed applications are listed on the Applications page
Updated Application
19 Apr 2021 shapeit updated to version 4.2.1
SHAPEIT is fast and accurate haplotype inference software.
19 Apr 2021 mothur updated to version 1.45.2
mothur is a tool for analyzing 16S rRNA gene sequences generated on multiple platforms as part of microbial ecology projects.
19 Apr 2021 PyRosetta updated to version 279.py3.7
PyRosetta is an interactive Python-based interface to the powerful Rosetta molecular modeling suite. It enables users to design their own custom molecular modeling algorithms using Rosetta sampling methods and energy functions.
19 Apr 2021 xpdf updated to version 4.03
Xpdf is a free PDF viewer and toolkit, including a text extractor, image converter, HTML converter, and more. Most of the tools are available as open source.
18 Apr 2021 PartekFlow updated to version 10.0.21.0411
Web interface designed specifically for the analysis needs of next generation sequencing applications including RNA, small RNA, and DNA sequencing.
16 Apr 2021 SimNIBS updated to version 3.2.2
SimNIBS is a free software package for the Simulation of Non-invasive Brain Stimulation. It allows for realistic calculations of the electric field induced by transcranial magnetic stimulation (TMS) and transcranial direct current stimulation (tDCS).
15 Apr 2021 R updated to version 4.0.5
R (the R Project) is a language and environment for statistical computing and graphics. R is similar to S, and provides a wide variety of statistical and graphical techniques (linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering, ...).
15 Apr 2021 minimap2 updated to version 2.18
Minimap2 is a fast sequence mapping and alignment program that can find overlaps between long noisy reads, or map long reads or their assemblies to a reference genome optionally with detailed alignment (i.e. CIGAR).
13 Apr 2021 tmux updated to version 3.2
tmux is a terminal multiplexer.
Type 'module load tmux' to load the module, then 'tmux --help'
12 Apr 2021 vsearch updated to version 2.16.0
VSEARCH supports de novo and reference based chimera detection, clustering, full-length and prefix dereplication, rereplication, reverse complementation, masking, all-vs-all pairwise global alignment, exact and global alignment searching, shuffling, subsampling and sorting. It also supports FASTQ file analysis, filtering, conversion and merging of paired-end reads.
9 Apr 2021 subread updated to version 2.0.2
High-performance read alignment, quantification and mutation discovery
9 Apr 2021 sicer updated to version 2-1.0.3
A clustering approach for identification of enriched domains from histone modification ChIP-Seq data
8 Apr 2021 symmetry updated to version 2.1.0
This project collects tools to detect, analyze, and visualize protein symmetry.
8 Apr 2021 bonito updated to version 0.3.6
A PyTorch Basecaller for Oxford Nanopore Reads
6 Apr 2021 IMOD updated to version 4.11.5
IMOD is a set of image processing, modeling and display programs used for tomographic reconstruction and for 3D reconstruction of EM serial sections and optical sections.
6 Apr 2021 PEET updated to version 1.15.0
PEET (Particle Estimation for Electron Tomography) is an open-source package for aligning and averaging particles in 3-D subvolumes extracted from tomograms. It seeks the optimal alignment of each particle against a reference volume through several iterations. If PEET and IMOD are both installed, most PEET operations are available from the eTomo graphical user interface in IMOD.
5 Apr 2021 STAR updated to version 2.7.8a
Spliced Transcripts Alignment to a Reference
5 Apr 2021 STAR-Fusion updated to version 1.10.0
Transcript fusion detection
2 Apr 2021 pangolin updated to version 2.3.6
Phylogenetic Assignment of Named Global Outbreak LINeages. PANGOLIN is a system for identifying phylogenetic COVID lineages that contribute most to active spread.
2 Apr 2021 lumpy updated to version 0.3.1
A probabilistic framework for structural variant discovery.
2 Apr 2021 cryolo updated to version 1.7.6.3
Automated particle picker for cryo-EM
2 Apr 2021 winnowmap updated to version 2.0
winnowmap is used for mapping ONT and PacBio reads to repetitive reference sequences.
31 Mar 2021 git updated to version 2.31.1
Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
30 Mar 2021 fanc updated to version 0.9.17
FAN-C is a toolkit for the analysis and visualization of Hi-C data. Beyond objects generated within FAN-C, the toolkit is largely compatible with Hi-C files from Cooler and Juicer.
30 Mar 2021 fade updated to version 0.2.2
Fragmentase Artifact Detection and Elimination
30 Mar 2021 stringtie updated to version 2.1.5
StringTie is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts. It is primarily a genome-guided transcriptome assembler, although it can borrow algorithmic techniques from de novo genome assembly to help with transcript assembly.
30 Mar 2021 vcf2maf updated to version 1.6.20
A smarter, more reproducible, and more configurable tool for converting a VCF to a MAF.
29 Mar 2021 spades updated to version 3.15.2
SPAdes – St. Petersburg genome assembler – is intended for both standard isolates and single-cell MDA bacteria assemblies.
29 Mar 2021 rnaseqc updated to version 2.4.2
RNA-SeQC is a Java program which computes a series of quality control metrics for RNA-seq data.
29 Mar 2021 bwa-mem2 updated to version 2.2.1
The next version of the bwa-mem algorithm in bwa.
29 Mar 2021 VADR updated to version 1.1.3
VADR stands for Viral Annotation DefineR. It is a suite of tools for classifying and analyzing sequences homologous to a set of reference models of viral genomes or gene families. It has been mainly tested for analysis of Norovirus, Dengue, and SARS-CoV-2 virus sequences in preparation for submission to the GenBank database.
26 Mar 2021 iVar updated to version 1.3.1
iVar is a computational package that contains functions broadly useful for viral amplicon-based sequencing. Additional tools for metagenomic sequencing are actively being incorporated into iVar. While each of these functions can be accomplished using existing tools, iVar contains an intersection of functionality from multiple tools that are required to call iSNVs and consensus sequences from viral sequencing data across multiple replicates.
25 Mar 2021 augustus updated to version 3.4.0
AUGUSTUS is a program that predicts genes in eukaryotic genomic sequences.
25 Mar 2021 metabat updated to version 2.15
MetaBAT: A robust statistical framework for reconstructing genomes from metagenomic data
25 Mar 2021 abyss updated to version 2.3.0
ABySS stands for Assembly By Short Sequences. It is a de novo, parallel, paired-end sequence assembler. The parallel version is implemented using MPI and is capable of assembling larger genomes.
24 Mar 2021 magetbrain updated to version 1.0
Given a set of labelled MR images (atlases) and unlabelled images (subjects), MAGeT produces a segmentation for each subject using a multi-atlas voting procedure based on a template library made up of images from the subject set.
24 Mar 2021 pandoc updated to version 2.13
Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library.
24 Mar 2021 kronatools updated to version 2.8
Krona allows hierarchical data to be explored with zooming, multi-layered pie charts. Krona charts can be created using an Excel template or KronaTools, which includes support for several bioinformatics tools and raw data formats. The interactive charts are self-contained and can be viewed with any modern web browser.
24 Mar 2021 ExpansionHunter updated to version 4.0.2
Expansion Hunter: a tool for estimating repeat sizes. There are a number of regions in the human genome consisting of repetitions of a short unit sequence (commonly a trimer). Such repeat regions can expand to a size much larger than the read length and thereby cause disease. Expansion Hunter aims to estimate the sizes of such repeats by performing a targeted search through a BAM/CRAM file for reads that span, flank, or are fully contained in each repeat.
23 Mar 2021 gdc-client updated to version 1.6.0
The GDC Data Transfer Tool provides an optimized method of transferring data to and from the GDC, and enables resumption of interrupted transfers.
23 Mar 2021 parallel updated to version 20210322
GNU parallel is a shell tool for executing jobs in parallel using one or more computers.
22 Mar 2021 bpipe updated to version 0.9.10
Bpipe provides a platform for running big bioinformatics jobs
22 Mar 2021 gmap-gsnap updated to version 2021-03-08
Genomic mapping and alignment programs
21 Mar 2021 RepeatMasker updated to version 4.1.2
RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns). On average, almost 50% of a human genomic DNA sequence currently will be masked by the program.
21 Mar 2021 rmblast updated to version 2.11.0
RMBlast is a RepeatMasker-compatible version of the standard NCBI blastn program. RMBlast supports RepeatMasker searches by adding a few necessary features to the stock NCBI blastn program.
19 Mar 2021 minc-toolkit updated to version 1.9.18
This metaproject bundles multiple MINC-based packages that historically have been developed somewhat independently
19 Mar 2021 svtools updated to version 0.5.1
Comprehensive utilities to explore structural variations in genomes
19 Mar 2021 vireosnp updated to version 0.5.1
Demultiplexing pooled scRNA-seq data without genotype reference
18 Mar 2021 svtyper updated to version 0.7.1
Svtyper is a Bayesian genotyper for structural variants.
18 Mar 2021 mriqc updated to version 0.16.1
MRIQC is an MRI quality control tool
18 Mar 2021 PyMOL updated to version 2.4.0
A comprehensive molecular visualization product for rendering and animating 3D molecular structures.
18 Mar 2021 ncbi-vdb updated to version 2.11.0
The SRA Toolkit and SDK from NCBI is a collection of tools and libraries for using data in the INSDC Sequence Read Archives.
18 Mar 2021 ncbi-ngs updated to version 2.11.0
NCBI's NGS is a new, domain-specific API for accessing reads, alignments and pileups produced from Next Generation Sequencing
18 Mar 2021 hisat updated to version 2.2.2.1-ngs2.11.0
HISAT is a fast and sensitive spliced alignment program which uses Hierarchical Indexing for Spliced Alignment of Transcripts.
18 Mar 2021 sratoolkit updated to version 2.11.0
The NCBI SRA Toolkit enables reading ("dumping") of sequencing files from the SRA database and writing ("loading") files into the .sra format.
18 Mar 2021 Coot updated to version 0.9.5
Coot is for macromolecular model building, model completion and validation, particularly suitable for protein modelling using X-ray data.
17 Mar 2021 vg updated to version 1.31.0
Tools for working with genome variation graphs
17 Mar 2021 trinity updated to version 2.12.0
Trinity, developed at the Broad Institute and the Hebrew University of Jerusalem, represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data.
16 Mar 2021 tetoolkit updated to version 2.2.1
A package for including transposable elements in differential enrichment analysis of sequencing datasets.
16 Mar 2021 snakemake updated to version 6.0.5
Snakemake aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern domain-specific language (DSL) in Python style. It is well suited for bioinformatic workflows.
15 Mar 2021 OpenBabel updated to version 3.1.1
Open Babel is a chemical toolbox designed to speak the many languages of chemical data.
15 Mar 2021 EMAN2 updated to version 2.91
EMAN2 is a broadly based greyscale scientific image processing suite with a primary focus on processing data from transmission electron microscopes.
11 Mar 2021 SAIGE updated to version 0.44.1
R package for large-scale genetic association studies.
11 Mar 2021 multiqc updated to version 1.10
Aggregates results from various frequently used bioinformatics tools across multiple samples into a single visual report
10 Mar 2021 RevBayes updated to version 1.1.1
Bayesian phylogenetic inference using probabilistic graphical models and an interpreted language
9 Mar 2021 civet updated to version 2.1.1
civet is a brain-imaging pipeline for analysis of large MR data sets. civet extracts and analyses cortical surfaces from MR images, and provides many other volumetric and corticometric functions.
9 Mar 2021 ITK-SNAP updated to version 3.8.0
ITK-SNAP is a tool for segmentation of 3D biomedical images. It requires a graphical connection to run on the cluster.
9 Mar 2021 busco updated to version 5.0.0
BUSCO completeness assessments employ sets of Benchmarking Universal Single-Copy Orthologs from OrthoDB (www.orthodb.org) to provide quantitative measures of the completeness of genome assemblies, annotated gene sets, and transcriptomes in terms of expected gene content.
8 Mar 2021 protobuf updated to version 3.15.5
Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data. Think XML, but smaller, faster, and simpler.
8 Mar 2021 RELION updated to version 3.1.2
RELION (for REgularised LIkelihood OptimisatioN) is a stand-alone computer program for Maximum A Posteriori refinement of (multiple) 3D reconstructions or 2D class averages in cryo-electron microscopy.
5 Mar 2021 pvactools updated to version 2.0.1
pVACtools is a cancer immunotherapy suite consisting of pVACseq, pVACfuse, pVACvector
5 Mar 2021 IGVTools updated to version 2.9.2
IGVTools provides utilities for working with ASCII file formats used by the Integrative Genomics Viewer. The files can be sorted, tiled, indexed, and counted.
5 Mar 2021 IGV updated to version 2.9.2
The Integrative Genomics Viewer is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets.
5 Mar 2021 dynamo updated to version 1.15.14
Dynamo is a software environment for subtomogram averaging of cryo-EM data.
4 Mar 2021 pomoxis updated to version 0.3.4
Pomoxis comprises a set of basic bioinformatic tools tailored to nanopore sequencing. Notably tools are included for generating and analysing draft assemblies. Many of these tools are used by the research data analysis group at Oxford Nanopore Technologies.
4 Mar 2021 biobambam2 updated to version 2.0.179-release-20201228191456
Tools for early stage alignment file processing.
4 Mar 2021 breseq updated to version 0.35.5
breseq is a computational pipeline for finding mutations relative to a reference sequence in short-read DNA re-sequencing data. It is intended for haploid microbial genomes (<20 Mb).
4 Mar 2021 mash updated to version 2.3
mash is a command line tool and library that provides fast genome and metagenome distance estimation using MinHash. Only the command line tool is installed.
4 Mar 2021 cpdf updated to version 2.3.2
Coherent PDF tools
4 Mar 2021 bcbio-nextgen updated to version 1.2.7
Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
3 Mar 2021 GAMESS updated to version 30Sep20-R2
GAMESS is a general ab initio quantum chemistry package.
3 Mar 2021 dcm2niix updated to version 1.0.20201102
DICOM to NIfTI converter
2 Mar 2021 hyphy updated to version 2.5.29
HyPhy (Hypothesis Testing using Phylogenies) is an open-source software package for the analysis of genetic sequences (in particular the inference of natural selection) using techniques in phylogenetics, molecular evolution, and machine learning.
2 Mar 2021 Scipion updated to version 3.0
Scipion is an image processing framework to obtain 3D models of macromolecular complexes using Electron Microscopy (3DEM). It integrates several software packages and presents a unified interface for both biologists and developers. Scipion allows users to execute workflows combining different software tools, while taking care of formats and conversions. Additionally, all steps are tracked and can be reproduced later on.
1 Mar 2021 boost updated to version 1.75
Boost provides free peer-reviewed portable C++ source libraries. Boost libraries are intended to be widely useful, and usable across a broad spectrum of applications.
25 Feb 2021 freeglut updated to version 3.2.1
FreeGLUT is a free-software/open-source alternative to the OpenGL Utility Toolkit (GLUT) library.
24 Feb 2021 GATK updated to version 4.2.0.0
GATK, from the Broad Institute, is first a structured software library that makes it easy to write efficient analysis tools for next-generation sequencing data, and second a suite of tools for working with human medical resequencing projects such as 1000 Genomes and The Cancer Genome Atlas. These tools include depth of coverage analyzers, a quality score recalibrator, a SNP/indel caller and a local realigner.
24 Feb 2021 cryoDRGN updated to version 0.3.1
CryoDRGN is an algorithm that leverages the representation power of deep neural networks to directly reconstruct continuous distributions of 3D density maps and map per-particle heterogeneity of single-particle cryo-EM datasets. It contains interactive tools to visualize a dataset’s distribution of per-particle variability, generate density maps for exploratory analysis, extract particle subsets for use with other tools and generate trajectories to visualize molecular motions.
23 Feb 2021 glpk updated to version 5.0
The GLPK (GNU Linear Programming Kit) package is intended for solving large-scale linear programming (LP), mixed integer programming (MIP), and other related problems. It is a set of routines written in ANSI C and organized in the form of a callable library.
23 Feb 2021 nodejs updated to version 14.16.0
Node.js is a JavaScript runtime built on Chrome's V8 JavaScript engine. Module name: nodejs
23 Feb 2021 kraken updated to version 2.1.1
Kraken is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies
23 Feb 2021 VEP updated to version 103.1
VEP (Variant Effect Predictor) determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.
23 Feb 2021 cromwell updated to version 57
A Workflow Management System geared towards scientific workflows.
23 Feb 2021 fmriprep updated to version 20.2.1
A Robust Preprocessing Pipeline for fMRI Data
23 Feb 2021 libtiff updated to version 4.2.0
This software provides support for the Tag Image File Format (TIFF), a widely used format for storing image data.
23 Feb 2021 Genome Browser updated to version 410
The Genome Browser Mirror Fragments is a mirror of the UCSC Genome Browser. The URL is https://hpcnihapps.cit.nih.gov/genome. Users can also directly access the MySQL databases and supporting files, as well as a large number of associated executables.
23 Feb 2021 lammps updated to version 29Oct20
LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. It runs on a variety of different computer systems, including single processor systems, distributed-memory machines with MPI, and GPU and Xeon Phi systems. LAMMPS is open source software, released under the GNU General Public License.
23 Feb 2021 libpng updated to version 1.6.37
libpng is the official PNG reference library. It supports almost all PNG features, is extensible, and has been extensively tested for over 20 years.
22 Feb 2021 Rstudio updated to version 1.4.1103
RStudio is a set of integrated tools designed to help you be more productive with R. It includes a console, syntax-highlighting editor that supports direct code execution, as well as tools for plotting, history, debugging and workspace management.
22 Feb 2021 novocraft updated to version 4.03.02
The package includes an aligner for single-ended and paired-end reads from the Illumina Genome Analyser. Novoalign finds global optimum alignments using the full Needleman-Wunsch algorithm with affine gap penalties.
22 Feb 2021 spaceranger updated to version 1.2.2
10x pipeline for processing Visium spatial RNA-seq data
18 Feb 2021 Schrodinger updated to version 2020.4
A limited number of Schrödinger applications are available on the Biowulf cluster through the Molecular Modeling Interest Group. Most are available through the Maestro GUI.
18 Feb 2021 MQLS updated to version 1.5
MQLS ("More Powerful" or "Modified" Quasi-likelihood Score Test) is a program for case-control association testing of a binary trait in samples that contain related individuals.
18 Feb 2021 preseq updated to version 3.1.1
predicting library complexity and genome coverage in high-throughput sequencing
17 Feb 2021 QIIME updated to version 2-2020.11
QIIME is an open source software package for comparison and analysis of microbial communities, primarily based on high-throughput amplicon sequencing data (such as SSU rRNA) generated on a variety of platforms, but also supporting analysis of other types of data (such as shotgun metagenomic data).
16 Feb 2021 Phenix updated to version 1.19.1-4122
PHENIX is a software suite for the automated determination of macromolecular structures using X-ray crystallography and other methods.
12 Feb 2021 freebayes updated to version 1.3.5
Bayesian haplotype-based polymorphism discovery and genotyping
11 Feb 2021 Meep updated to version 1.17.1
Meep (or MEEP) is a free finite-difference time-domain (FDTD) simulation software package developed at MIT to model electromagnetic systems, along with the MPB eigenmode package.
11 Feb 2021 picard updated to version 2.25.0
Picard comprises Java-based command-line utilities that manipulate SAM files, and a Java API (SAM-JDK) for creating new programs that read and write SAM files. Both SAM text format and SAM binary (BAM) format are supported.
11 Feb 2021 VIRTUS updated to version 1.2.1
Bioinformatics pipeline for viral transcriptome detection.
10 Feb 2021 m-tools updated to version 20210208
A selection of software developed at the Australian Centre for Ecogenomics to aid in the analysis of metagenomic datasets: unitem, refinem, checkm, graftm, groopm, bamm, finishm, singlem, orfm, and coverm
9 Feb 2021 infernal updated to version 1.1.4
Package for searching DNA sequence databases for RNA structure and sequence similarities
8 Feb 2021 globus-cli updated to version 2.2.0
Globus command line interface
8 Feb 2021 cicero updated to version 1.4.0
an assembly-based algorithm to detect diverse classes of driver gene fusions from RNA-seq.
4 Feb 2021 pf_refinement updated to version 20Nov19
Protofilament Refinement is a software package to further refine microtubule structures by aligning individual protofilaments rather than full microtubule segments.
2 Feb 2021 AMBER updated to version 20
AMBER (Assisted Model Building with Energy Refinement) is a package of molecular simulation programs.
2 Feb 2021 cellsnp-lite updated to version 1.2.0
Efficient genotyping of bi-allelic SNPs in single cells
2 Feb 2021 visidata updated to version 2.2
VisiData is an interactive multitool for tabular data
1 Feb 2021 cellsnp updated to version 0.3.2
Pileup biallelic SNPs from single-cell and bulk RNA-seq data
28 Jan 2021 Matlab updated to version 2020b
MATLAB is an interactive software package for scientific and engineering numeric computation. MATLAB integrates numerical analysis, matrix computation, signal processing, and graphics in an environment where problems and solutions are expressed just as they are written mathematically.
27 Jan 2021 deepsignal updated to version 0.1.8
A deep-learning method for detecting DNA methylation state from Oxford Nanopore sequencing reads.
27 Jan 2021 EDirect updated to version 14.5
Entrez Direct (EDirect) is an advanced method for accessing the NCBI's set of interconnected databases (publication, sequence, structure, gene, variation, expression, etc.) from a UNIX terminal window.
26 Jan 2021 IDL/ENVI updated to version 8.8/5.6
IDL and ENVI are a complete computing environment for the interactive analysis and visualization of data. IDL integrates an array-oriented language with mathematical analysis and graphical display techniques. ENVI is designed for extracting information from geospatial and medical imagery.
Scientific Databases updated in the last 3 months
For a full list of scientific databases available on the NIH HPC systems, see this page

Updated      Database           Format    Location
20 Apr 2021  NCBI Taxonomy      taxonomy  /fdb/taxonomy
19 Apr 2021  Betacoronavirus    Blast     /fdb/blastdb/Betacoronavirus
14 Apr 2021  NCBI nr            Blast     /fdb/blastdb/nr
14 Apr 2021  Protein Data Bank  Blast     /fdb/blastdb/pdbaa
14 Apr 2021  SwissProt          Blast     /fdb/blastdb/swissprot
13 Apr 2021  Protein Data Bank  Blast     /fdb/blastdb/pdbnt
13 Apr 2021  NCBI nt            Fasta     /fdb/fastadb/nt.fas
13 Apr 2021  NCBI nr            Fasta     /fdb/fastadb/nr.aa.fas
13 Apr 2021  SwissProt          Fasta     /fdb/fastadb/swissprot.aa.fas
13 Apr 2021  Protein Data Bank  Fasta     /fdb/fastadb/pdb.aa.fas
12 Apr 2021  NCBI nt            Blast     /fdb/blastdb/nt
07 Feb 2021  Human Genome hg18  MySQL     NIH mirror of UCSC Genome Browser