Biowulf High Performance Computing at the NIH
Application updates in the last 3 months
To see all versions available for any application, use module avail application_name
All centrally-installed applications are listed on the Applications page
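As a minimal sketch of checking versions from a script, the Python snippet below builds the module avail command for a given application and can optionally run it through a login shell. The helper names module_avail_command and list_versions are hypothetical and are not part of any Biowulf tooling. The snippet assumes that bash is present and that module is defined in login shells, which may not hold on every system. Many module implementations print their listing to standard error, so both output streams are returned.

```python
import shlex
import subprocess


def module_avail_command(application_name: str) -> str:
    """Build the shell command that lists installed versions of an application."""
    # shlex.quote guards against spaces or shell metacharacters in the name.
    return f"module avail {shlex.quote(application_name)}"


def list_versions(application_name: str) -> str:
    """Run `module avail` in a login shell and return its combined output.

    Assumption: `module` is a shell function set up by the login profile.
    """
    result = subprocess.run(
        ["bash", "-lc", module_avail_command(application_name)],
        capture_output=True,
        text=True,
        check=False,  # a missing `module` command should not raise here
    )
    return result.stdout + result.stderr


if __name__ == "__main__":
    # "snakemake" is only an illustrative application name.
    print(module_avail_command("snakemake"))
```

On a system with environment modules, list_versions("snakemake") would return whatever module avail snakemake prints; elsewhere it simply returns the shell's error message.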
Updated Application
17 Jan 2019 FuSeq updated to version 1.1.0
FuSeq is software for discovering fusion genes from paired-end RNA sequencing data. It implements a fast and accurate method that uses quasi-mapping to map the reads quickly, extracts initial candidates from split reads and fusion equivalence classes of mapped reads, and finally applies multiple filters and statistical tests to obtain the final candidates.
16 Jan 2019 spark updated to version 2.4.0
Apache Spark is a fast and general engine for large-scale data processing. It is commonly used as an in-memory alternative to Hadoop MapReduce.
15 Jan 2019 xwas updated to version 3.0
XWAS (chromosome X-Wide Analysis toolSet) is a software suite for the analysis of the X chromosome in association analyses and similar studies.
14 Jan 2019 PartekFlow updated to version
Web interface designed specifically for the analysis needs of next generation sequencing applications including RNA, small RNA, and DNA sequencing.
14 Jan 2019 RefinedIBD updated to version 12Jul18
Refined IBD is a software package that detects identity-by-descent segments in phased genotype data. It achieves both computational efficiency and highly accurate IBD segment reporting by searching for IBD in two steps. The first step (identification) uses the GERMLINE algorithm to find shared haplotypes exceeding a length threshold. The second step (refinement) evaluates candidate segments with a probabilistic approach to assess the evidence for IBD.
10 Jan 2019 SimNIBS updated to version 2.1.2
SimNIBS is a free software package for the Simulation of Non-invasive Brain Stimulation. It allows for realistic calculations of the electric field induced by transcranial magnetic stimulation (TMS) and transcranial direct current stimulation (tDCS).
10 Jan 2019 snakemake updated to version 5.4.0
Snakemake aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern domain-specific language (DSL) in Python style. It is well suited for bioinformatic workflows.
10 Jan 2019 dcm2niix updated to version 1.0.20181125
DICOM to NIfTI converter
10 Jan 2019 MashMap updated to version 2.0
MashMap is an approximate algorithm for computing local alignment boundaries between long DNA sequences. Given a minimum alignment length and an identity threshold, it computes the desired alignment boundaries and identity estimates using kmer-based statistics, and maintains sufficient probabilistic guarantees on the output sensitivity.
10 Jan 2019 vartrix updated to version 1.1.1
VarTrix is a software tool for extracting single cell variant information from 10x Genomics single cell data.
9 Jan 2019 nvchecker updated to version 1.3
nvchecker (short for new version checker) is for checking if a new version of some software has been released.
7 Jan 2019 SPHIRE updated to version 1.1
SPHIRE (SPARX for High-Resolution Electron Microscopy) is an open-source, user-friendly software suite for the semi-automated processing of single particle electron cryo-microscopy (cryo-EM) data. It allows fast and reproducible structure determination from cryo-EM images.
7 Jan 2019 singularity updated to version 3.0.2
Singularity is a container platform focused on supporting "Mobility of Compute". It allows users to emulate and share custom Linux environments, allowing for the creation of self-contained development stacks.
4 Jan 2019 iqtree updated to version 1.6.9
4 Jan 2019 Qt updated to version 5.12.0
Qt is a cross-platform application framework that is used for developing application software that can be run on various software and hardware platforms with little or no change in the underlying codebase, while still being a native application with native capabilities and speed.
3 Jan 2019 ChromHMM updated to version 1.18
ChromHMM is software for learning and characterizing chromatin states.
3 Jan 2019 pyDNase updated to version 0.3.0
pyDNase is a suite of tools for analysing DNase-seq data. It comes with several analysis scripts covering common use cases of DNase-seq analysis, and also an implementation of the Wellington, Wellington 1D, and Wellington-bootstrap footprinting algorithms.
3 Jan 2019 mdtraj updated to version 1.9.2
MDTraj is a python library that allows users to manipulate molecular dynamics (MD) trajectories and perform a variety of analyses, including fast RMSD, solvent accessible surface area, hydrogen bonding, etc.
3 Jan 2019 parallel updated to version 20181222
GNU parallel is a shell tool for executing jobs in parallel using one or more computers.
3 Jan 2019 gdc-client updated to version 1.4.0
The GDC Data Transfer Tool provides an optimized method of transferring data to and from the GDC, and enables resumption of interrupted transfers.
28 Dec 2018 lbzip2 updated to version 2.5
lbzip2 is a multi-threaded compression utility with support for bzip2 compressed file format.
28 Dec 2018 cellranger updated to version 3.0.1
Cell Ranger is a set of analysis pipelines that processes Chromium single cell 3’ RNA-seq output to align reads, generate gene-cell matrices and perform clustering and gene expression analysis.
26 Dec 2018 macs updated to version 2.1.2
Model-based Analysis of ChIP-Seq (MACS) on short-read sequencers such as the Genome Analyzer (Illumina / Solexa). MACS empirically models the length of the sequenced ChIP fragments, which tends to be shorter than sonication or library construction size estimates, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome sequence, allowing for more sensitive and robust prediction.
26 Dec 2018 DeepCyTOF updated to version 20170315
DeepCyTOF is a standardization approach for cell gating, based on deep learning techniques applied to mass cytometry, an emerging technology for high-dimensional multiparameter single cell analysis that overcomes many limitations of fluorescence-based flow cytometry. DeepCyTOF is based on domain adaptation principles and is a generalization of previous work that allows us to calibrate between a target distribution and a source distribution in an unsupervised manner.
23 Dec 2018 globus_sdk updated to version 1.7
Pythonic interface to Globus REST APIs, including the Transfer API and the Globus Auth API
21 Dec 2018 caveman updated to version 1.13.9
SNV expectation maximisation based mutation calling algorithm aimed at detecting somatic mutations in paired (tumour/normal) cancer samples
21 Dec 2018 hicpro updated to version 2.11.1
HiC-Pro: An optimized and flexible pipeline for Hi-C data processing
20 Dec 2018 PRSice updated to version 2.1.4
PRSice is a Polygenic Risk Score software for calculating, applying, evaluating and plotting the results of polygenic risk scores (PRS) analyses.
20 Dec 2018 GATK updated to version
GATK, from the Broad Institute, is two things: first, a structured software library that makes it easy to write efficient analysis tools for next-generation sequencing data; second, a suite of tools for working with human medical resequencing projects such as 1000 Genomes and The Cancer Genome Atlas. These tools include depth-of-coverage analyzers, a quality score recalibrator, a SNP/indel caller and a local realigner.
19 Dec 2018 neuroconstruct updated to version 1.7.2
Biophysical Neural Network Modeling Software.
19 Dec 2018 Scipion updated to version 1.2.1
Scipion is an image processing framework to obtain 3D models of macromolecular complexes using Electron Microscopy (3DEM). It integrates several software packages and presents a unified interface for both biologists and developers. Scipion allows users to execute workflows combining different software tools, while taking care of formats and conversions. Additionally, all steps are tracked and can be reproduced later on.
19 Dec 2018 bowtie2 updated to version
A version of bowtie that's particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes
17 Dec 2018 IGVTools updated to version 2.4.16
IGVTools provides utilities for working with ASCII file formats used by the Integrative Genomics Viewer. The files can be sorted, tiled, indexed, and counted.
17 Dec 2018 IGV updated to version 2.4.16
The Integrative Genomics Viewer is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets.
17 Dec 2018 BEAST updated to version 1.10.4,2.5.1
BEAST (Bayesian Evolutionary Analysis Sampling Trees) is a cross-platform program for Bayesian MCMC analysis of molecular sequences.
13 Dec 2018 delly updated to version 0.7.9
DELLY is an integrated structural variant prediction method that can detect deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read massively parallel sequencing data. It uses paired-ends and split-reads to sensitively and accurately delineate genomic rearrangements throughout the genome.
12 Dec 2018 AncestryMap updated to version 6210
AncestryMap is a software package that allows finding skews in ancestry that are potentially associated with disease genes in recently mixed populations.
12 Dec 2018 bamUtil updated to version 1.0.14
bamUtil is a repository that contains several programs that perform operations on SAM/BAM files. All of these programs are built into a single executable, bam.
11 Dec 2018 viper updated to version 0-20181127-b74e0bc-p1
VIPER combines the use of several dozen RNA-seq tools, suites, and packages to create a complete pipeline that takes RNA-seq analysis from raw sequencing data all the way through alignment, quality control, unsupervised analyses, differential expression, and downstream pathway analysis
11 Dec 2018 mafft updated to version 7.407
Multiple alignment program for amino acid or nucleotide sequences
11 Dec 2018 novocraft updated to version 3.09.01
Package includes aligner for single-ended and paired-end reads from the Illumina Genome Analyser. Novoalign finds global optimum alignments using full Needleman-Wunsch algorithm with affine gap penalties.
10 Dec 2018 sniffles updated to version 1.0.10
Sniffles is a structural variation caller using third generation sequencing (PacBio or Oxford Nanopore). It detects all types of SVs (10bp+) using evidence from split-read alignments, high-mismatch regions, and coverage analysis.
7 Dec 2018 bbcp updated to version
Secure and fast copy utility
6 Dec 2018 MAGeCK updated to version 0.5.7
MAGeCK (Model-based Analysis of Genome-wide CRISPR/Cas9 Knockout) is a method for prioritizing single-guide RNAs, genes and pathways in genome-scale CRISPR/Cas9 knockout screens. It demonstrates better performance compared with other methods, identifies both positively and negatively selected genes simultaneously, and reports robust results across different experimental conditions.
6 Dec 2018 pvactools updated to version 1.1.4
pVACtools is a cancer immunotherapy suite consisting of pVACseq, pVACfuse, and pVACvector.
5 Dec 2018 fmriprep updated to version 1.2.5
A Robust Preprocessing Pipeline for fMRI Data
5 Dec 2018 SURVIVOR updated to version 1.0.5
SURVIVOR is a tool set for simulating/evaluating SVs, merging and comparing SVs within and among samples, and includes various methods to reformat or summarize SVs.
4 Dec 2018 GAMESS updated to version 30Sep18-R3-sockets
GAMESS is a general ab initio quantum chemistry package.
3 Dec 2018 CCP4 updated to version 7.0.066
CCP4 is a suite of programs for protein crystallography and structural biology.
3 Dec 2018 hgvs updated to version 1.2.4
The hgvs package provides a Python library to facilitate the use of genome, transcript, and protein variants that are represented using the Human Genome Variation Society (varnomen) recommendations. To use, type module load hgvs prior to calling python.
30 Nov 2018 flappie updated to version 1.0.0
Basecall Fast5 reads using flip-flop basecalling.
30 Nov 2018 EMAN2 updated to version 2.22
EMAN2 is a broadly based greyscale scientific image processing suite with a primary focus on processing data from transmission electron microscopes.
29 Nov 2018 vcf2db updated to version 2018.10.26
vcf2db creates a gemini-compatible database from a VCF.
27 Nov 2018 GNU Scientific Library (GSL) updated to version 2.5
The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting. There are over 1000 functions in total with an extensive test suite.
27 Nov 2018 trinity updated to version 2.8.4
Trinity, developed at the Broad Institute and the Hebrew University of Jerusalem, represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data.
26 Nov 2018 Beagle updated to version 5.0_28Sep18
Beagle is a package for imputing genotypes, inferring haplotype phase, and performing genetic association analysis. BEAGLE is designed to analyze large-scale data sets with hundreds of thousands of markers genotyped on thousands of samples.
22 Nov 2018 PEPATAC updated to version 0.8.3
PEPATAC is a robust pipeline for Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) built on a loosely coupled modular framework. It may be easily applied to ATAC-seq projects of any size, from one-off experiments to large-scale sequencing projects. It is optimized on unique features of ATAC-seq data to be fast and accurate and provides several unique analytical approaches.
19 Nov 2018 FSL updated to version 6.0.0
FSL is a comprehensive library of image analysis and statistical tools for FMRI, MRI and DTI brain imaging data.
19 Nov 2018 Clinker updated to version 1.32
Clinker is a bioinformatics pipeline that generates a superTranscriptome from popular fusion finder outputs (JAFFA, tophatFusion, SOAP, deFUSE, Pizzly, etc), which can then be viewed either in genome viewers such as IGV or through the included plotting feature developed with GViz.
16 Nov 2018 Huygens updated to version 18.10.0-p1
Huygens is a software package for image restoration, deconvolution, resolution and noise reduction. It can process images from all current optical microscopes, including wide-field, confocal, Nipkow (scanning disk confocal), multiple-photon, and 4Pi microscopes.
16 Nov 2018 STAR-Fusion updated to version 1.5.0
Transcript fusion detection
16 Nov 2018 mirge updated to version 2.0.5
A microRNA sequencing analysis tool.
16 Nov 2018 ctk updated to version 1.1.2
The CLIP Tool Kit (CTK) is a software package that provides a set of tools for analysis of CLIP data starting from the raw reads generated by the sequencer.
16 Nov 2018 STAR updated to version 2.6.1c
Spliced Transcripts Alignment to a Reference
15 Nov 2018 megahit updated to version 1.1.4
MEGAHIT is a single node assembler for large and complex metagenomics NGS reads, such as soil. It makes use of a succinct de Bruijn graph (SdBG) to achieve low memory assembly. MEGAHIT can optionally utilize a CUDA-enabled GPU to accelerate its SdBG construction.
15 Nov 2018 magicblast updated to version 1.4.0
Magic-BLAST is a tool for mapping large next-generation RNA or DNA sequencing runs against a whole genome or transcriptome. Each alignment optimizes a composite score, taking into account simultaneously the two reads of a pair, and in case of RNA-seq, locating the candidate introns and adding up the score of all exons. This is very different from other versions of BLAST, where each exon is scored as a separate hit and read-pairing is ignored.
13 Nov 2018 sve updated to version 0.1.0
SVE is a Python script-based execution engine for Structural Variation (SV) detection. It can be used with any level of input data (raw FASTQs, aligned BAMs, or variant call format (VCF) files) and generates a unified VCF as its output.
13 Nov 2018 smoove updated to version 0.2.1
smoove simplifies and speeds calling and genotyping SVs for short reads. It also improves specificity by removing many spurious alignment signals that are indicative of low-level noise and often contribute to spurious calls.
9 Nov 2018 cellranger-atac updated to version 1.0.0
Cell Ranger ATAC is a set of analysis pipelines that process Chromium Single Cell ATAC data.
8 Nov 2018 mega2 updated to version 5.0.0
Mega2 is a data-handling program for facilitating genetic linkage and association analyses.
7 Nov 2018 EMIM updated to version 3.22
Estimation of Maternal, Imprinting and interaction effects using Multinomial modeling
6 Nov 2018 3DSlicer updated to version 4.10.0
A software platform for the analysis (including registration and interactive segmentation) and visualization (including volume rendering) of medical images and for research in image guided therapy.
6 Nov 2018 QoRTs updated to version 1.3.6
The QoRTs software package is a fast, efficient, and portable multifunction toolkit designed to assist in the analysis, quality control, and data management of RNA-Seq datasets.
5 Nov 2018 slamdunk updated to version 0.3.3
SLAMseq is a novel sequencing protocol that directly uncovers 4-thiouridine incorporation events in RNA by high-throughput sequencing. SlamDunk is a novel, fully automated software tool for robust, scalable and reproducible SLAMseq data analysis.
5 Nov 2018 vdjtools updated to version 1.1.10
A comprehensive analysis framework for T-cell and B-cell repertoire sequencing data.
4 Nov 2018 metaphlan updated to version 2.7.8
MetaPhlAn is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea, Eukaryotes and Viruses) from metagenomic shotgun sequencing data (i.e. not 16S) with species-level resolution. With the newly added StrainPhlAn module, it is now possible to perform accurate strain-level microbial profiling.
2 Nov 2018 TORTOISE updated to version 3.1.3
(Tolerably Obsessive Registration and Tensor Optimization Indolent Software Ensemble) The TORTOISE software package is for processing diffusion MRI data.
1 Nov 2018 deepvariant updated to version 0.7.0
DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
31 Oct 2018 tensorrt updated to version 18.09
NVIDIA TensorRT™ is a platform for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. TensorRT-based applications perform up to 40x faster than CPU-only platforms during inference. With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and finally deploy to hyperscale data centers, embedded, or automotive product platforms.
31 Oct 2018 ANTs updated to version 2.3.1
Advanced Normalization Tools (ANTs) extracts information from complex datasets that include imaging. Paired with ANTsR (answer), ANTs is useful for managing, interpreting and visualizing multidimensional data.
30 Oct 2018 sequenza-utils updated to version 2.2.0
Sequenza-utils is the supporting Python library for the sequenza R package. Sequenza is a project to estimate purity/ploidy and copy number alteration from tumor sequencing experiments. Sequenza-utils provides command-line programs to transform common NGS file types, such as BAM, pileup and VCF, into input files for the R package.
30 Oct 2018 cgpBattenberg updated to version 3.3.0
Detect subclonality and copy number in matched NGS data
30 Oct 2018 jellyfish updated to version 2.2.10
Jellyfish is a tool for fast, memory-efficient counting of k-mers in DNA.
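To illustrate what counting k-mers means, here is a minimal Python sketch that slides a window of width k over a DNA string and tallies each substring. It is not how Jellyfish itself is implemented and makes no attempt at its speed or memory efficiency; the function name count_kmers and the example sequence are hypothetical.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer in a DNA sequence (case-insensitive)."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    seq = sequence.upper()
    # Slide a window of width k across the sequence, one base at a time.
    # If the sequence is shorter than k, the range is empty and so is the result.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


if __name__ == "__main__":
    counts = count_kmers("ACGTACGT", 3)
    for kmer, n in sorted(counts.items()):
        print(kmer, n)
```

For real datasets a dedicated tool such as Jellyfish is the appropriate choice; a plain dictionary of substrings like this one does not scale to whole genomes.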
30 Oct 2018 sv2 updated to version
Support Vector Structural Variation Genotyper
30 Oct 2018 prest updated to version 3.02
PREST is a program that detects pedigree errors by use of genome-screen data.
29 Oct 2018 Tybalt updated to version 0.1.3
Tybalt implements a Variational AutoEncoder (VAE), a deep neural network approach capable of generating meaningful latent spaces for image and text data. Tybalt has been trained on The Cancer Genome Atlas (TCGA) pan-cancer RNA-seq data and used to identify specific patterns in the VAE encoded features.
29 Oct 2018 Rosetta updated to version 2018.42
The Rosetta++ software suite can perform de novo protein structure predictions, identify low free energy sequences for target protein backbones, predict the structure of a protein-protein complex from the individual structures of the monomer components, incorporate NMR data into the basic Rosetta protocol to accelerate the process of NMR structure prediction, and more...
29 Oct 2018 bgionline updated to version 0.1
BGI Online is a cloud platform for bioinformatic analysis. The BGI online tools can be used for downloading data and more. Use 'module load bgionline' to access the tools.
26 Oct 2018 EDirect updated to version 10.0
Entrez Direct (EDirect) is an advanced method for accessing the NCBI's set of interconnected databases (publication, sequence, structure, gene, variation, expression, etc.) from a UNIX terminal window.
24 Oct 2018 bismark updated to version 0.20.0
Bismark is a program to map bisulfite treated sequencing reads to a genome of interest and perform methylation calls in a single step. The output can be easily imported into a genome viewer, such as SeqMonk, and enables a researcher to analyse the methylation levels of their samples straight away.
24 Oct 2018 crystfel updated to version 0.8.0
CrystFEL is a suite of programs for processing diffraction data acquired serially in a snapshot manner, such as when using the technique of Serial Femtosecond Crystallography (SFX) with a free-electron laser source.
23 Oct 2018 ldsc updated to version 1.0.0-101-g89c13a7
ldsc is a command line tool for estimating heritability and genetic correlation from GWAS summary statistics. ldsc also computes LD Scores.
23 Oct 2018 Matlab updated to version 2018b
MATLAB is a high-performance interactive software package for scientific and engineering numeric computation. MATLAB integrates numerical analysis, matrix computation, signal processing, and graphics in an environment where problems and solutions are expressed just as they are written mathematically.
23 Oct 2018 ORP (Oyster River Protocol) updated to version 2.0.0
Oyster River Protocol (ORP) implements a standardized and benchmarked set of bioinformatic processes, resulting in a transcriptome assembly with enhanced qualities over other standard assembly methods. Specifically, ORP-produced assemblies have higher Detonate and TransRate scores and mapping rates, which is largely a product of the fact that it leverages a multiassembler and kmer assembly process, thereby bypassing the shortcomings of any one approach.
22 Oct 2018 mrtrix updated to version 3.0_RC3
MRtrix provides a large suite of tools for image processing, analysis and visualisation, with a focus on the analysis of white matter using diffusion-weighted MRI.
22 Oct 2018 Gromacs updated to version 2018.3+plumed2.5b
Gromacs is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins and lipids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.
22 Oct 2018 Mathematica updated to version 11.3
Mathematica is an interactive system for doing mathematical computation. It performs numerical, symbolic and graphical computations, and incorporates a high-level programming language.
Scientific Databases updated in last 3 months
For a full list of scientific databases available on the NIH HPC systems, see this page

Updated | Database | Format | Location
15 Jan 2019 | Protein Data Bank | Fasta | /fdb/fastadb/pdb.nt.fas
15 Jan 2019 | NCBI nt | Fasta | /fdb/fastadb/nt.fas
15 Jan 2019 | Mito | Fasta | /fdb/fastadb/mito.nt.fas
15 Jan 2019 | SwissProt | Fasta | /fdb/fastadb/swissprot.aa.fas
15 Jan 2019 | Protein Data Bank | Fasta | /fdb/fastadb/pdb.aa.fas
15 Jan 2019 | Mito | Fasta | /fdb/fastadb/mito.aa.fas
15 Jan 2019 | NCBI nr | Fasta | /fdb/fastadb/nr.aa.fas
15 Jan 2019 | NCBI Taxonomy | taxonomy | /fdb/taxonomy
14 Jan 2019 | Viral | Blast | /fdb/blastdb/viral
14 Jan 2019 | Protein Data Bank | Blast | /fdb/blastdb/pdbaa
14 Jan 2019 | SwissProt | Blast | /fdb/blastdb/swissprot
13 Jan 2019 | 16S Microbial | Blast | /fdb/blastdb/16SMicrobial
12 Jan 2019 | Protein Data Bank | Blast | /fdb/blastdb/pdbnt
12 Jan 2019 | Protein Data Bank | PDB | /pdb/pdb
11 Jan 2019 | Rat Genome (Rattus norvegicus) rn4 | MySQL | NIH mirror of UCSC Genome Browser
10 Jan 2019 | NCBI nr | Blast | /fdb/blastdb/nr
05 Jan 2019 | NCBI nt | Blast | /fdb/blastdb/nt
28 Dec 2018 | Mouse Genome (Mus musculus) mm8 | MySQL | NIH mirror of UCSC Genome Browser
26 Dec 2018 | Viral | Blast | /fdb/blastdb/viral
25 Dec 2018 | EST - mouse | Fasta | /fdb/fastadb/est_mouse.fas
25 Dec 2018 | EST - human | Fasta | /fdb/fastadb/est_human.fas
19 Dec 2018 | ANNOVAR | ANNOVAR | /fdb/annovar/current
12 Dec 2018 | EST - mouse | Blast | /fdb/blastdb/est_mouse
06 Dec 2018 | TCGA DREAM SMC synthetic data | BAM | /fdb/DREAM/SMC
04 Dec 2018 | EST - others | Blast | /fdb/blastdb/est_others
30 Nov 2018 | Drosophila genome (Drosophila melanogaster) fb5 | MySQL | NIH mirror of UCSC Genome Browser
20 Nov 2018 | EST - human | Blast | /fdb/blastdb/est_human
05 Nov 2018 | Human Genome hg19 | Fasta | /fdb/genome/human-feb2009/