Biowulf High Performance Computing at the NIH
Application updates in the last 3 months
To see all versions available for any application, use 'module avail application_name'
All centrally-installed applications are listed on the Applications page
Updated Application
16 Nov 2018 Huygens updated to version 18.10.0-p1
Huygens is an image restoration, deconvolution, resolution and noise reduction package. It can process images from all current optical microscopes, including wide-field, confocal, Nipkow (scanning disk confocal), multiple-photon, and 4Pi microscopes.
16 Nov 2018 STAR-Fusion updated to version 1.5.0
Transcript fusion detection
16 Nov 2018 mirge updated to version 2.0.5
A microRNA sequencing analysis tool.
16 Nov 2018 ctk updated to version 1.1.2
The CLIP Tool Kit (CTK) is a software package that provides a set of tools for analysis of CLIP data starting from the raw reads generated by the sequencer.
16 Nov 2018 STAR updated to version 2.6.1c
Spliced Transcripts Alignment to a Reference
15 Nov 2018 megahit updated to version 1.1.4
MEGAHIT is a single-node assembler for large and complex metagenomics NGS reads, such as soil. It makes use of a succinct de Bruijn graph (SdBG) to achieve low-memory assembly. MEGAHIT can optionally utilize a CUDA-enabled GPU to accelerate its SdBG construction.
15 Nov 2018 magicblast updated to version 1.4.0
Magic-BLAST is a tool for mapping large next-generation RNA or DNA sequencing runs against a whole genome or transcriptome. Each alignment optimizes a composite score that simultaneously takes into account the two reads of a pair and, in the case of RNA-seq, locates the candidate introns and adds up the scores of all exons. This is very different from other versions of BLAST, where each exon is scored as a separate hit and read-pairing is ignored.
13 Nov 2018 sve updated to version 0.1.0
SVE is a Python script-based execution engine for Structural Variation (SV) detection. It can be used with any level of data input (raw FASTQs, aligned BAMs, or variant call format files (VCFs)) and generates a unified VCF as its output.
13 Nov 2018 smoove updated to version 0.2.1
smoove simplifies and speeds calling and genotyping SVs for short reads. It also improves specificity by removing many spurious alignment signals that are indicative of low-level noise and often contribute to spurious calls.
13 Nov 2018 fmriprep updated to version 1.2.2
A Robust Preprocessing Pipeline for fMRI Data
9 Nov 2018 cellranger-atac updated to version 1.0.0
Cell Ranger ATAC is a set of analysis pipelines that process Chromium Single Cell ATAC data.
8 Nov 2018 mega2 updated to version 5.0.0
Mega2 is a data-handling program for facilitating genetic linkage and association analyses.
7 Nov 2018 EMIM updated to version 3.22
Estimation of Maternal, Imprinting and interaction effects using Multinomial modeling
6 Nov 2018 3DSlicer updated to version 4.10.0
A software platform for the analysis (including registration and interactive segmentation) and visualization (including volume rendering) of medical images and for research in image guided therapy.
6 Nov 2018 QoRTs updated to version 1.3.6
The QoRTs software package is a fast, efficient, and portable multifunction toolkit designed to assist in the analysis, quality control, and data management of RNA-Seq datasets.
5 Nov 2018 slamdunk updated to version 0.3.3
SLAMseq is a novel sequencing protocol that directly uncovers 4-thiouridine incorporation events in RNA by high-throughput sequencing. SlamDunk is a novel, fully automated software tool for automated, robust, scalable and reproducible SLAMseq data analysis.
5 Nov 2018 vdjtools updated to version 1.1.10
A comprehensive analysis framework for T-cell and B-cell repertoire sequencing data.
5 Nov 2018 singularity updated to version 3.0.1
Singularity is a container platform focused on supporting "Mobility of Compute". It allows users to emulate and share custom Linux environments, allowing for the creation of self-contained development stacks.
4 Nov 2018 metaphlan updated to version 2.7.8
MetaPhlAn is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea, Eukaryotes and Viruses) from metagenomic shotgun sequencing data (i.e. not 16S) with species-level resolution. With the newly added StrainPhlAn module, it is now possible to perform accurate strain-level microbial profiling.
2 Nov 2018 TORTOISE updated to version 3.1.3
(Tolerably Obsessive Registration and Tensor Optimization Indolent Software Ensemble) The TORTOISE software package is for processing diffusion MRI data.
2 Nov 2018 GATK updated to version
GATK, from the Broad Institute, is two things: a structured software library that makes it easy to write efficient analysis tools for next-generation sequencing data, and a suite of tools for working with human medical resequencing projects such as 1000 Genomes and The Cancer Genome Atlas. These tools include a depth of coverage analyzer, a quality score recalibrator, a SNP/indel caller and a local realigner.
1 Nov 2018 PEPATAC updated to version 0.8.3
PEPATAC is a robust ATAC-seq pipeline built on a loosely coupled modular framework. It may be easily applied to ATAC-seq projects of any size, from one-off experiments to large-scale sequencing projects. It is optimized on unique features of ATAC-seq data to be fast and accurate and provides several unique analytical approaches.
1 Nov 2018 deepvariant updated to version 0.7.0
DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
31 Oct 2018 tensorrt updated to version 18.09
NVIDIA TensorRT™ is a platform for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high-throughput for deep learning inference applications. TensorRT-based applications perform up to 40x faster than CPU-only platforms during inference. With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and finally deploy to hyperscale data centers, embedded, or automotive product platforms.
31 Oct 2018 ANTs updated to version 2.3.1
Advanced Normalization Tools (ANTs) extracts information from complex datasets that include imaging. Paired with ANTsR (answer), ANTs is useful for managing, interpreting and visualizing multidimensional data.
30 Oct 2018 sequenza-utils updated to version 2.2.0
Sequenza-utils is the supporting Python library for the sequenza R package. Sequenza is a project to estimate purity/ploidy and copy number alterations from tumor sequencing experiments. Sequenza-utils provides command-line programs to transform common NGS file types, such as BAM, pileup and VCF, into input files for the R package.
30 Oct 2018 cgpBattenberg updated to version 3.3.0
Detect subclonality and copy number in matched NGS data
30 Oct 2018 jellyfish updated to version 2.2.10
Jellyfish is a tool for fast, memory-efficient counting of k-mers in DNA.
30 Oct 2018 sv2 updated to version
Support Vector Structural Variation Genotyper
30 Oct 2018 prest updated to version 3.02
PREST is a program that detects pedigree errors by use of genome-screen data.
29 Oct 2018 Tybalt updated to version 0.1.3
Tybalt implements a Variational AutoEncoder (VAE), a deep neural network approach capable of generating meaningful latent spaces for image and text data. Tybalt has been trained on The Cancer Genome Atlas (TCGA) pan-cancer RNA-seq data and used to identify specific patterns in the VAE encoded features.
29 Oct 2018 Rosetta updated to version 2018.42
The Rosetta++ software suite can perform de novo protein structure predictions, identify low free energy sequences for target protein backbones, predict the structure of a protein-protein complex from the individual structures of the monomer components, incorporate NMR data into the basic Rosetta protocol to accelerate the process of NMR structure prediction, and more...
29 Oct 2018 bgionline updated to version 0.1
BGI Online is a cloud platform for bioinformatic analysis. The BGI online tools can be used for downloading data and more. Use 'module load bgionline' to access the tools.
26 Oct 2018 EDirect updated to version 10.0
Entrez Direct (EDirect) is an advanced method for accessing the NCBI's set of interconnected databases (publication, sequence, structure, gene, variation, expression, etc.) from a UNIX terminal window.
24 Oct 2018 BEAST updated to version 1.10.2,2.5.1
BEAST (Bayesian Evolutionary Analysis Sampling Trees) is a cross-platform program for Bayesian MCMC analysis of molecular sequences.
24 Oct 2018 bismark updated to version 0.20.0
Bismark is a program to map bisulfite treated sequencing reads to a genome of interest and perform methylation calls in a single step. The output can be easily imported into a genome viewer, such as SeqMonk, and enables a researcher to analyse the methylation levels of their samples straight away.
24 Oct 2018 crystfel updated to version 0.7.0
CrystFEL is a suite of programs for processing diffraction data acquired serially in a snapshot manner, such as when using the technique of Serial Femtosecond Crystallography (SFX) with a free-electron laser source.
23 Oct 2018 ldsc updated to version 1.0.0-101-g89c13a7
ldsc is a command line tool for estimating heritability and genetic correlation from GWAS summary statistics. ldsc also computes LD Scores.
23 Oct 2018 Matlab updated to version 2018b
MATLAB is a high-performance interactive software package for scientific and engineering numeric computation. MATLAB integrates numerical analysis, matrix computation, signal processing, and graphics in an environment where problems and solutions are expressed just as they are written mathematically.
23 Oct 2018 ORP (Oyster River Protocol) updated to version 2.0.0
Oyster River Protocol (ORP) implements a standardized and benchmarked set of bioinformatic processes, resulting in a transcriptome assembly with enhanced qualities over other standard assembly methods. Specifically, ORP-produced assemblies have higher Detonate and TransRate scores and mapping rates, largely because it leverages a multiassembler and kmer assembly process, thereby bypassing the shortcomings of any one approach.
22 Oct 2018 mrtrix updated to version 3.0_RC3
MRtrix provides a large suite of tools for image processing, analysis and visualisation, with a focus on the analysis of white matter using diffusion-weighted MRI.
22 Oct 2018 Gromacs updated to version 2018.3+plumed2.5b
Gromacs is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins and lipids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.
22 Oct 2018 Mathematica updated to version 11.3
Mathematica is an interactive system for doing mathematical computation. It performs numerical, symbolic and graphical computations, and incorporates a high-level programming language.
19 Oct 2018 xpdf updated to version 4.00
Xpdf is a free PDF viewer and toolkit, including a text extractor, image converter, HTML converter, and more. Most of the tools are available as open source.
19 Oct 2018 dcm2niix updated to version 1.0.20180622
DICOM to NIfTI converter
18 Oct 2018 smrtanalysis updated to version
SMRT® Analysis is a bioinformatics software suite available for analysis of DNA sequencing data from Pacific Biosciences’ SMRT technology. Users can choose from a variety of analysis protocols that utilize PacBio® and third-party tools. Analysis protocols include de novo genome assembly, cDNA mapping, DNA base-modification detection, and long-amplicon analysis to determine phased consensus sequences.
18 Oct 2018 supernova updated to version 2.1.1
Supernova generates highly-contiguous, phased, whole-genome de novo assemblies from a Chromium-prepared library.
18 Oct 2018 salmon updated to version 0.11.3
A tool for quantifying the expression of transcripts using RNA-seq data.
18 Oct 2018 multiqc updated to version 1.6
Aggregates results for various frequently used bioinformatics tools across multiple samples into a nice visual report.
18 Oct 2018 flye updated to version 2.3.6
Fast and accurate de novo assembler for single molecule sequencing reads
18 Oct 2018 tvb updated to version 1.5.4
The Virtual Brain (TVB) scientific library has the purpose of offering modern tools to the Neurosciences community, for computing, simulating and analyzing functional and structural data of human brains
17 Oct 2018 Connectome Workbench updated to version 1.3.2
Tools to browse, download, explore, and analyze data from the Human Connectome Project (HCP). Allows users to compare their own data to that of the HCP.
17 Oct 2018 google-cloud-sdk updated to version 221.0.0
Google Cloud SDK is a set of tools that you can use to manage resources and applications hosted on Google Cloud Platform. These include the gcloud, gsutil, and bq command line tools. See docs at
Type 'module load google-cloud-sdk' to use on Biowulf.
17 Oct 2018 stow updated to version 2.2.2
GNU Stow is a symlink farm manager which takes distinct packages of software and/or data located in separate directories on the filesystem, and makes them appear to be installed in the same place.
17 Oct 2018 shmlast updated to version 1.2.1
shmlast is a reimplementation of the Conditional Reciprocal Best Hits algorithm for finding potential orthologs between a transcriptome and a species-specific protein database. It uses the LAST aligner and the pydata stack to achieve much better performance while staying in the Python ecosystem.
15 Oct 2018 DETONATE updated to version 1.11
DETONATE is a tool for evaluation of de novo transcriptome assemblies from RNA-Seq data. It consists of two component packages, RSEM-EVAL and REF-EVAL. RSEM-EVAL is a reference-free evaluation method based on a novel probabilistic model that depends only on an assembly and the RNA-Seq reads used for its construction. REF-EVAL is a toolkit of reference-based measures.
15 Oct 2018 neusomatic updated to version 0.1.1
NeuSomatic is based on deep convolutional neural networks for accurate somatic mutation detection. With properly trained models, it can robustly perform across sequencing platforms, strategies, and conditions. NeuSomatic summarizes and augments sequence alignments in a novel way and incorporates multi-dimensional features to capture variant signals effectively. It is not only universal but also accurate as a somatic mutation detection method.
15 Oct 2018 Rcorrector updated to version
Rcorrector implements a k-mer based method to correct random sequencing errors in Illumina RNA-seq reads. Rcorrector uses a De Bruijn graph to compactly represent all trusted k-mers in the input reads. Unlike WGS read correctors, which use a global threshold to determine trusted k-mers, Rcorrector computes a local threshold at every position in a read.
12 Oct 2018 vartrix updated to version 1.1.0
VarTrix is a software tool for extracting single cell variant information from 10x Genomics single cell data.
11 Oct 2018 minimap2 updated to version 2.13
Minimap2 is a fast sequence mapping and alignment program that can find overlaps between long noisy reads, or map long reads or their assemblies to a reference genome optionally with detailed alignment (i.e. CIGAR).
10 Oct 2018 golang updated to version 1.11.1
The Go programming language
10 Oct 2018 Comsol updated to version 54
The COMSOL Multiphysics engineering simulation software environment facilitates all steps in the modeling process: defining your geometry, meshing, specifying your physics, solving, and then visualizing your results.
9 Oct 2018 scanpy updated to version 1.3.2
Scanpy is a scalable toolkit for analyzing single-cell gene expression data. It includes preprocessing, visualization, clustering, pseudotime and trajectory inference and differential expression testing. The Python-based implementation efficiently deals with datasets of more than one million cells.
9 Oct 2018 pysurfer updated to version 0.9.0
PySurfer is a Python library for visualizing brain surfaces produced by neuroimaging datasets.
9 Oct 2018 IGVTools updated to version 2.4.14
IGVTools provides utilities for working with ASCII file formats used by the Integrative Genomics Viewer. The files can be sorted, tiled, indexed, and counted.
9 Oct 2018 IGV updated to version 2.4.14
The Integrative Genomics Viewer is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets.
3 Oct 2018 AMBER updated to version 18
AMBER (Assisted Model Building with Energy Refinement) is a package of molecular simulation programs.
3 Oct 2018 VEP updated to version 94
VEP (Variant Effect Predictor) determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.
2 Oct 2018 ncbi-toolkit updated to version 21.0.0
The NCBI C++ Toolkit is a set of executables and libraries for a multitude of sequence analysis functions.
1 Oct 2018 manorm updated to version 1.1.4
MAnorm is for quantitative comparison of ChIP-Seq data sets describing transcription factor binding sites and epigenetic modifications. The quantitative binding differences inferred by MAnorm showed strong correlation with both the changes in expression of target genes and the binding of cell type-specific regulators.
1 Oct 2018 busco updated to version 3.0.2
BUSCO completeness assessments employ sets of Benchmarking Universal Single-Copy Orthologs from OrthoDB to provide quantitative measures of the completeness of genome assemblies, annotated gene sets, and transcriptomes in terms of expected gene content.
28 Sep 2018 transrate updated to version 1.0.3
Transrate is software for de-novo transcriptome assembly quality analysis.
26 Sep 2018 virtualgl updated to version 2.6
VirtualGL is an open source toolkit that gives any Unix or Linux remote display software the ability to run OpenGL applications with full 3D hardware acceleration.
26 Sep 2018 viper updated to version 0+20180706.git5915f6b
VIPER combines the use of several dozen RNA-seq tools, suites, and packages to create a complete pipeline that takes RNA-seq analysis from raw sequencing data all the way through alignment, quality control, unsupervised analyses, differential expression, and downstream pathway analysis
26 Sep 2018 DeconSeq updated to version 0.4.3
The DeconSeq tool can be used to automatically detect and efficiently remove sequence contamination from genomic and metagenomic datasets. It is easily configurable and provides a user-friendly interface.
26 Sep 2018 leafcutter updated to version 0.2.7
Leafcutter quantifies RNA splicing variation using short-read RNA-seq data. The core idea is to leverage spliced reads (reads that span an intron) to quantify (differential) intron usage across samples.
25 Sep 2018 PEER updated to version 1.3
PEER stands for probabilistic estimation of expression residuals. It is a collection of Bayesian approaches to infer hidden determinants and their effects from gene expression profiles using factor analysis methods.
25 Sep 2018 cutadapt updated to version 1.18
cutadapt removes adapter sequences from DNA high-throughput sequencing data. This is usually necessary when the read length of the machine is longer than the molecule that is sequenced, such as in microRNA data.
24 Sep 2018 albacore updated to version 2.3.3
ONT basecaller
24 Sep 2018 drompa updated to version 3.5.0
Peak-calling, Visualization, Normalization and QC for ChIP-seq analysis
24 Sep 2018 flashpca updated to version 2.0
FlashPCA performs fast principal component analysis (PCA) of single nucleotide polymorphism (SNP) data, similar to smartpca from EIGENSOFT and shellfish. FlashPCA is based on the library.
24 Sep 2018 rmats updated to version 4.0.2
MATS is a computational tool to detect differential alternative splicing events from RNA-Seq data.
20 Sep 2018 PRSice updated to version 2.1.3.beta
PRSice is a Polygenic Risk Score software for calculating, applying, evaluating and plotting the results of polygenic risk scores (PRS) analyses.
19 Sep 2018 encode-atac-seq-pipeline updated to version 1.0
This pipeline is designed for automated end-to-end quality control and processing of ATAC-seq or DNase-seq data.
18 Sep 2018 cromwell updated to version 34
A Workflow Management System geared towards scientific workflows.
18 Sep 2018 Eagle updated to version 2.4
Eagle performs reference-based haplotype phasing. It attains high accuracy across a broad range of cohort sizes by efficiently leveraging information from large external reference panels (such as the Haplotype Reference Consortium; HRC) using a new data structure based on the positional Burrows-Wheeler transform.
18 Sep 2018 mriqc updated to version 0.14.2
MRIQC is an MRI quality control tool
17 Sep 2018 Julia updated to version 1.0.0
A high-level, dynamic language for technical computing
13 Sep 2018 fastqc updated to version 0.11.6
FastQC provides quality control functions for next-generation sequencing data.
13 Sep 2018 AnnotSV updated to version 1.1.1
AnnotSV is a program designed for annotating Structural Variations (SV). This tool compiles functionally, regulatory and clinically relevant information and aims at providing annotations useful to i) interpret SV potential pathogenicity and ii) filter out SV potential false positives.
12 Sep 2018 nodejs updated to version 8.12.0
Node.js is a JavaScript runtime built on Chrome's V8 JavaScript engine. module name: nodejs
7 Sep 2018 Canvas updated to version 1.38
Canvas is a tool for calling copy number variants (CNVs) from human DNA sequencing data.
7 Sep 2018 rockhopper updated to version 2.0.3
Rockhopper is a comprehensive and user-friendly system for computational analysis of bacterial RNA-seq data. As input, Rockhopper takes RNA sequencing reads output by high-throughput sequencing technology (FASTQ, QSEQ, FASTA, SAM, or BAM files)
7 Sep 2018 HLA-PRG-LA updated to version 0.85.45c4fea
Stands for HLA PRG, linear approximation. The basic idea is to seed graph alignments with linear alignments to the sequences that the graph consists of.
7 Sep 2018 RapMap updated to version 0.5.0
RapMap is a tool for rapid, sensitive and accurate read mapping via quasi-mapping. It is capable of mapping sequencing reads to a target transcriptome substantially faster than existing alignment tools.
5 Sep 2018 eager updated to version 1.92
EAGER: efficient ancient genome reconstruction
5 Sep 2018 h5utils updated to version 1.13.1
h5utils is a set of utilities for visualization and conversion of scientific data in the free, portable HDF5 format. Type 'module load h5utils' to access the executables (e.g. h5topng)
5 Sep 2018 dotnet-sdk updated to version 2.1.301
Microsoft .NET SDK and runtime
29 Aug 2018 U-Net updated to version 20180704
U-Net is an image segmentation tool. It relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization.
29 Aug 2018 DeepLab updated to version 20180816
DeepLab is a semantic image segmentation tool. It makes use of Deep Convolutional Networks, Dilated (a.k.a. Atrous) Convolution, and Fully Connected Conditional Random Fields.
28 Aug 2018 DanQ updated to version 20180828
DanQ is a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences
24 Aug 2018 DNAnexus updated to version 0.260.0
DNAnexus is a cloud-based commercial solution for next-generation sequence analysis and visualization. It has a command-line interface (CLI) which can be used to log in to the DNAnexus platform, upload and navigate data, and launch analyses.
23 Aug 2018 cmake updated to version 3.12.1
CMake is a family of tools designed to build, test and package software.
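Each entry in the list above starts with a line of the form "date name updated to version X". As a minimal sketch (not part of the Biowulf tooling), the Python below shows one way such a line could be parsed, for example to filter the list by date or application. The names Update and parse_update are hypothetical, and the sketch assumes English month abbreviations and the default C/English locale.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional


class Update(NamedTuple):
    """One parsed update line (hypothetical helper type)."""
    day: date
    name: str
    version: Optional[str]  # None when the page lists no version


# Matches e.g. "16 Nov 2018 STAR updated to version 2.6.1c".
_ENTRY = re.compile(
    r"^(\d{1,2} [A-Z][a-z]{2} \d{4}) (.+?) updated to version\s*(.*)$"
)


def parse_update(line: str) -> Optional[Update]:
    """Return an Update for a header line, or None for description lines."""
    match = _ENTRY.match(line.strip())
    if match is None:
        return None
    when, name, version = match.groups()
    # "%b" relies on English month abbreviations (assumed locale).
    day = datetime.strptime(when, "%d %b %Y").date()
    return Update(day, name, version.strip() or None)
```

Description lines do not match the pattern and yield None, and an entry whose version is missing on the page is returned with version set to None rather than a guessed value.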
Scientific Databases updated in last 3 months
For a full list of scientific databases available on the NIH HPC systems, see this page

Updated      Database                                         Format    Location
19 Nov 2018  NCBI Taxonomy                                    taxonomy  /fdb/taxonomy
18 Nov 2018  16S Microbial                                    Blast     /fdb/blastdb/16SMicrobial
17 Nov 2018  Protein Data Bank                                PDB       /pdb/pdb
16 Nov 2018  Rat Genome (Rattus norvegicus) rn4               MySQL     NIH mirror of UCSC Genome Browser
14 Nov 2018  Protein Data Bank                                Blast     /fdb/blastdb/pdbaa
14 Nov 2018  SwissProt                                        Blast     /fdb/blastdb/swissprot
13 Nov 2018  Protein Data Bank                                Blast     /fdb/blastdb/pdbnt
13 Nov 2018  EST - human                                      Fasta     /fdb/fastadb/est_human.fas
13 Nov 2018  Protein Data Bank                                Fasta     /fdb/fastadb/pdb.nt.fas
13 Nov 2018  NCBI nt                                          Fasta     /fdb/fastadb/nt.fas
13 Nov 2018  Mito                                             Fasta     /fdb/fastadb/mito.nt.fas
13 Nov 2018  SwissProt                                        Fasta     /fdb/fastadb/swissprot.aa.fas
13 Nov 2018  Protein Data Bank                                Fasta     /fdb/fastadb/pdb.aa.fas
13 Nov 2018  Mito                                             Fasta     /fdb/fastadb/mito.aa.fas
13 Nov 2018  NCBI nr                                          Fasta     /fdb/fastadb/nr.aa.fas
12 Nov 2018  NCBI nt                                          Blast     /fdb/blastdb/nt
12 Nov 2018  Viral                                            Blast     /fdb/blastdb/viral
08 Nov 2018  NCBI nr                                          Blast     /fdb/blastdb/nr
05 Nov 2018  Human Genome hg19                                Fasta     /fdb/genome/human-feb2009/
02 Nov 2018  EST - others                                     Blast     /fdb/blastdb/est_others
30 Oct 2018  EST - human                                      Blast     /fdb/blastdb/est_human
22 Oct 2018  ANNOVAR                                          ANNOVAR   /fdb/annovar/current
16 Oct 2018  Mito                                             Blast     /fdb/blastdb/mito.aa
12 Oct 2018  Mouse Genome (Mus musculus) mm8                  MySQL     NIH mirror of UCSC Genome Browser
12 Oct 2018  Drosophila genome (Drosophila melanogaster) fb5  MySQL     NIH mirror of UCSC Genome Browser
02 Oct 2018  HTGs                                             Blast     /fdb/blastdb/htgs
07 Sep 2018  Rhesus genome rheMac2                            MySQL     NIH mirror of UCSC Genome Browser
31 Aug 2018  Dog Genome (Canis familiaris)                    MySQL     NIH mirror of UCSC Genome Browser