High-Performance Computing at the NIH
Application updates in the last 3 months
To see all versions available for any application, use module avail application_name
All centrally-installed applications are listed on the Applications page
Updated Application
15 Dec 2017 seqtk updated to version 1.2-r94
seqtk is a toolkit for processing sequences in FASTA/Q formats
15 Dec 2017 lofreq updated to version
LoFreq is a fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data.
14 Dec 2017 Canvas updated to version 1.31
Canvas is a tool for calling copy number variants (CNVs) from human DNA sequencing data.
13 Dec 2017 osprey updated to version 2.2~beta
OSPREY is a suite of programs for computational structure-based protein design. OSPREY is specifically designed to identify protein mutants that possess desired target properties (e.g., improved stability, switch of substrate specificity, etc.). OSPREY can also be used for predicting small-molecule drug inhibitors and for designing protein-protein and protein-peptide interactions.
13 Dec 2017 Xplor-NIH updated to version 2.46
Xplor-NIH is a structure determination program which builds on the X-PLOR v3.851 program, including additional tools developed at the NIH.
11 Dec 2017 CLONET updated to version 20171016
CLONET is a collection of R scripts that allows: computing global DNA admixture (1-purity) and ploidy of tumor DNA samples (each with a matched normal sample) from sequencing data (WGS, WES, targeted); computing clonality of each somatic aberration, including somatic copy number aberrations, point mutations, and structural rearrangements; and nominating the temporal relation among somatic aberrations and building evolution maps.
11 Dec 2017 vcf2maf updated to version 1.6.14
A smarter, more reproducible, and more configurable tool for converting a VCF to a MAF.
7 Dec 2017 naccess updated to version 2.1.1
The naccess program calculates the atomic accessible surface defined by rolling a probe of given size around a van der Waals surface.
5 Dec 2017 samtools updated to version 1.6
The samtools package now provides samtools, bcftools, tabix, and the underlying htslib library.
4 Dec 2017 bcbio-nextgen updated to version 1.0.6
Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis
27 Nov 2017 schism updated to version 1.1.3
Subclonal Hierarchy Inference from Somatic Mutations
27 Nov 2017 Chimera updated to version 1.12.0
Chimera is a highly extensible program for interactive visualization and analysis of molecular structures and related data, including density maps, supramolecular assemblies, sequence alignments, docking results, trajectories, and conformational ensembles.
21 Nov 2017 TORTOISE updated to version 3.1.0
(Tolerably Obsessive Registration and Tensor Optimization Indolent Software Ensemble) The TORTOISE software package is for processing diffusion MRI data.
20 Nov 2017 NWChem updated to version 6.6
NWChem is an open source computational chemistry package that includes scalable tools for both classical and ab initio molecular simulations.
20 Nov 2017 exceRpt updated to version 4.4.0
The extra-cellular RNA processing toolkit (exceRpt) was designed to handle the variable contamination and often poor quality data obtained from low input smallRNA-seq samples such as those obtained from extra-cellular preparations. However, the tool is perfectly capable of processing data from more standard cellular preparations and, with minor modifications to the command-line call, is also capable of processing WGS/exome and long RNA-seq data.
20 Nov 2017 golang updated to version 1.9.2
The Go programming language
17 Nov 2017 pandoc updated to version 2.0.2
Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library.
17 Nov 2017 bcl2fastq updated to version 2.20.0
A tool to handle BCL conversion and demultiplexing
17 Nov 2017 kraken updated to version 1.0
Kraken is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies
14 Nov 2017 cnvkit updated to version 0.9.1
Copy number variant detection from targeted DNA sequencing
14 Nov 2017 kaiju updated to version 1.5.0
Kaiju is a program for the taxonomic classification of high-throughput sequencing reads, e.g., Illumina or Roche/454, from whole-genome sequencing of metagenomic DNA.
9 Nov 2017 minimap2 updated to version 2.4
Minimap2 is a fast sequence mapping and alignment program that can find overlaps between long noisy reads, or map long reads or their assemblies to a reference genome optionally with detailed alignment (i.e. CIGAR).
9 Nov 2017 Edena updated to version V3.131028
Edena v3: de novo short reads assembler.
7 Nov 2017 seqmix updated to version 0.1
SEQMIX is a program that takes advantage of off-targeted sequence reads from exome/targeted sequencing experiments for accurate local ancestry inference.
7 Nov 2017 chipseq_pipeline updated to version 0.3.3
AQUAS Transcription Factor and Histone ChIP-Seq processing pipeline. The AQUAS pipeline is based off the ENCODE (phase-3) transcription factor and histone ChIP-seq pipeline specifications (by Anshul Kundaje)
7 Nov 2017 Basset updated to version 0.1.0
Deep convolutional neural networks for DNA sequence analysis.
7 Nov 2017 Hail updated to version 0.1
Hail is an open-source, scalable framework for exploring and analyzing genomic data.
7 Nov 2017 ngsplot updated to version 2.63
ngsplot is an easy-to-use global visualization tool for next-generation sequencing data.
7 Nov 2017 QoRTs updated to version 1.3.0
The QoRTs software package is a fast, efficient, and portable multifunction toolkit designed to assist in the analysis, quality control, and data management of RNA-Seq datasets.
6 Nov 2017 QIIME updated to version 2.2017.10
QIIME is an open source software package for comparison and analysis of microbial communities, primarily based on high-throughput amplicon sequencing data (such as SSU rRNA) generated on a variety of platforms, but also supporting analysis of other types of data (such as shotgun metagenomic data).
2 Nov 2017 PartekFlow updated to version
Web interface designed specifically for the analysis needs of next generation sequencing applications including RNA, small RNA, and DNA sequencing.
1 Nov 2017 AutodockVina updated to version 1_1_2
AutoDock Vina is a program for drug discovery, molecular docking and virtual screening, offering multi-core capability, high performance and enhanced accuracy and ease of use. It is closely tied to Autodock.
1 Nov 2017 Autodock updated to version 4.2.6
Autodock is a suite of automated docking tools. It is designed to predict how small molecules, such as substrates or drug candidates, bind to a receptor of known 3D structure.
1 Nov 2017 ANTs updated to version 1Nov2017
Advanced Normalization Tools (ANTs) extracts information from complex datasets that include imaging. Paired with ANTsR (answer), ANTs is useful for managing, interpreting and visualizing multidimensional data.
1 Nov 2017 AFNI updated to version current
AFNI (Analysis of Functional NeuroImages) is a set of C programs for processing, analyzing, and displaying functional MRI (FMRI) data - a technique for mapping human brain activity.
31 Oct 2017 MotionCor2 updated to version 1.0.1
MotionCor2 is a multi-GPU accelerated program that provides iterative, patch-based motion detection combining spatial and temporal constraints and dose weighting for both single particle and tomographic cryo-electron microscopy images.
31 Oct 2017 EMIM updated to version 3.22
Estimation of Maternal, Imprinting and interaction effects using Multinomial modeling
25 Oct 2017 Genome Browser updated to version 356
The Genome Browser Mirror Fragments at Helix Systems is a mirror of the UCSC Genome Browser. The URL is https://hpcnihapps.cit.nih.gov/genome. Users can also directly access the MySQL databases, the supporting files, and a large number of associated executables.
24 Oct 2017 ghmm updated to version 2341
The General Hidden Markov Model library (GHMM) is a freely available C library implementing efficient data structures and algorithms for basic and extended HMMs with discrete and continuous emissions. It comes with Python wrappers which provide a much nicer interface and added functionality.
24 Oct 2017 RNAshapes updated to version 2.1.6
RNAshape abstraction maps structures to a tree-like domain of shapes, retaining adjacency and nesting of structural features, but disregarding helix lengths.
24 Oct 2017 mash updated to version 2.0
mash is a command line tool and library to provide fast genome and metagenome distance estimation using MinHash. Only the command line tool is installed.
24 Oct 2017 ssHMM updated to version 1.0.2
ssHMM is an RNA motif finder. It recovers sequence-structure motifs from RNA-binding protein data, such as CLIP-Seq data.
24 Oct 2017 Juicer updated to version 1.5.6
A One-Click System for Analyzing Loop-Resolution Hi-C Experiments
23 Oct 2017 atac_dnase_pipelines updated to version 0.3.4-19-gcbd2a00
This pipeline is designed for automated end-to-end quality control and processing of ATAC-seq or DNase-seq data
23 Oct 2017 mafft updated to version 7.312
Multiple alignment program for amino acid or nucleotide sequences
21 Oct 2017 dcm2niix updated to version 1.0.20171017
DICOM to NIfTI converter
20 Oct 2017 synapseclient updated to version 1.7.2
The synapseclient package provides an interface to Synapse, a collaborative workspace for reproducible, data intensive research projects
20 Oct 2017 umitools updated to version 0.5.1
tools for dealing with Unique Molecular Identifiers (UMIs)/Random Molecular Tags (RMTs) and single cell RNA-Seq cell barcodes
19 Oct 2017 RNAstructure updated to version 6.0
RNAstructure is a complete package for RNA and DNA secondary structure prediction and analysis. It includes algorithms for secondary structure prediction, including facility to predict base pairing probabilities. It also can be used to predict bimolecular structures and can predict the equilibrium binding affinity of an oligonucleotide to a structured RNA target.
19 Oct 2017 trust updated to version 2.4.1
trust analyzes TCR sequences using unselected RNA sequencing data, profiled from solid tissues, including tumors
19 Oct 2017 rtg updated to version 3.8.4
variant detection for singletons, families, large pedigrees and populations, cancer, structural variant and CNV analysis, and microbial and metagenomic analysis
19 Oct 2017 hichipper updated to version 0.7.0
hichipper is a preprocessing and QC pipeline for HiChIP data. This package takes output from a HiC-Pro run and a sample manifest file (.yaml) that coordinates optional high-quality peaks (identified through ChIP-Seq) and restriction fragment locations as input and produces output that can be used to 1) determine library quality, 2) identify and characterize DNA loops and 3) interactively visualize loops.
18 Oct 2017 Matlab updated to version
MATLAB is a high-performance interactive software package for scientific and engineering numeric computation. MATLAB integrates numerical analysis, matrix computation, signal processing, and graphics in an environment where problems and solutions are expressed just as they are written mathematically.
17 Oct 2017 primer3 updated to version 2.3.7
Primer3 is a program for designing PCR primers, hybridization probes, and sequencing primers.
17 Oct 2017 crystfel updated to version 0.6.3
CrystFEL is a suite of programs for processing diffraction data acquired serially in a snapshot manner, such as when using the technique of Serial Femtosecond Crystallography (SFX) with a free-electron laser source.
16 Oct 2017 novocraft updated to version 3.08.02
The package includes an aligner for single-ended and paired-end reads from the Illumina Genome Analyser. Novoalign finds global optimum alignments using the full Needleman-Wunsch algorithm with affine gap penalties.
13 Oct 2017 singularity updated to version 2.4
Singularity is a container platform focused on supporting "Mobility of Compute". It allows users to emulate and share custom Linux environments, allowing for the creation of self-contained development stacks.
10 Oct 2017 hgvs updated to version 1.1.0
The hgvs package provides a Python library to facilitate the use of genome, transcript, and protein variants that are represented using the Human Genome Variation Society (varnomen) recommendations. To use, type module load hgvs prior to calling python.
7 Oct 2017 pyDNase updated to version 0.2.5
pyDNase is a suite of tools for analysing DNase-seq data. pyDNase comes with several analysis scripts covering common use cases of DNase-seq analysis, and also an implementation of the Wellington, Wellington 1D, and Wellington-bootstrap footprinting algorithms.
7 Oct 2017 manta updated to version 1.2.0
Structural variant and indel caller for mapped sequencing data
7 Oct 2017 mapDamage updated to version 2.0.8
mapDamage profiles DNA damage patterns in next-generation sequencing analyses of ancient DNA samples.
7 Oct 2017 bali-phy updated to version 3.0-beta3
BAli-Phy is MCMC software developed by Ben Redelings with Marc Suchard for simultaneous Bayesian estimation of alignment and phylogeny (and other parameters). It handles generic Bayesian modeling via probabilistic programming.
7 Oct 2017 clark updated to version
A method based on a supervised sequence classification using discriminative k-mers
6 Oct 2017 exomiser updated to version 8.0.1
The Exomiser is a Java program that functionally annotates variants from whole-exome sequencing data starting from a VCF file.
6 Oct 2017 MAJIQ updated to version 1.0.5
Modeling Alternative Junction Inclusion Quantification. MAJIQ and Voila are two software packages that together define, quantify, and visualize local splicing variations (LSV) from RNA-Seq data.
6 Oct 2017 hotnet2 updated to version 1.0.1-125-g29fe555
HotNet2 is an algorithm for finding significantly altered subnetworks in a large gene interaction network.
6 Oct 2017 annogesic updated to version 0.6.25
ANNOgesic is a transcriptome annotation pipeline for RNA-seq.
5 Oct 2017 Rosetta updated to version 2017.36
The Rosetta++ software suite can perform de novo protein structure predictions, identify low free energy sequences for target protein backbones, predict the structure of a protein-protein complex from the individual structures of the monomer components, incorporate NMR data into the basic Rosetta protocol to accelerate the process of NMR structure prediction, and more...
5 Oct 2017 hicpro updated to version 2.9.0
HiC-Pro: An optimized and flexible pipeline for Hi-C data processing
5 Oct 2017 ncbi-toolkit updated to version 18.0.0
The NCBI C++ Toolkit is a set of executables and libraries for a multitude of sequence analysis functions.
29 Sep 2017 snakemake updated to version 4.1.0
Snakemake aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern domain specific specification language (DSL) in python style. It is well suited for bioinformatic workflows.
26 Sep 2017 maker updated to version 2.31.9
MAKER is an easy-to-configure, portable genome annotation pipeline.
26 Sep 2017 boost updated to version 1.65
Boost provides free peer-reviewed portable C++ source libraries. Boost libraries are intended to be widely useful, and usable across a broad spectrum of applications.
26 Sep 2017 miarma updated to version 1.7.1
miARma-Seq, which stands for miRNA-Seq And RNA-Seq Multiprocess Analysis, is a suite designed to study mRNAs, miRNAs and circRNAs.
25 Sep 2017 CAVIAR updated to version a97e614
CAVIAR (CAusal Variants Identification in Associated Regions) is a statistical framework that quantifies the probability of each variant being causal while allowing for an arbitrary number of causal variants.
25 Sep 2017 PAINTOR updated to version 3.0-2c614ef
PAINTOR (Probabilistic Annotation INtegraTOR) is a probabilistic framework that integrates association strength with genomic functional annotation data to improve accuracy in selecting plausible causal variants for functional validation.
22 Sep 2017 agfusion updated to version 0.149
Annotate Gene Fusion (AGFusion) is a package for annotating gene fusions from the human or mouse genomes.
21 Sep 2017 paraview updated to version 5.4.1
ParaView is an open-source, multi-platform data analysis and visualization application.
21 Sep 2017 cellranger updated to version 2.1.0
Cell Ranger is a set of analysis pipelines that processes Chromium single cell 3’ RNA-seq output to align reads, generate gene-cell matrices and perform clustering and gene expression analysis.
Scientific Databases updated in the last 3 months
For a full list of scientific databases available on the NIH HPC systems, see this page

Updated      Database                                         Format    Location
12 Dec 2017  Protein Data Bank                                Fasta     /fdb/fastadb/pdb.nt.fas
12 Dec 2017  NCBI nt                                          Fasta     /fdb/fastadb/nt.fas
12 Dec 2017  Mito                                             Fasta     /fdb/fastadb/mito.nt.fas
12 Dec 2017  Mito                                             Fasta     /fdb/fastadb/mito.aa.fas
12 Dec 2017  NCBI nr                                          Fasta     /fdb/fastadb/nr.aa.fas
12 Dec 2017  NCBI Taxonomy                                    taxonomy  /fdb/taxonomy
12 Dec 2017  Mito                                             Blast     /fdb/blastdb/mito.aa
10 Dec 2017  16S Microbial                                    Blast     /fdb/blastdb/16SMicrobial
08 Dec 2017  Mouse Genome (Mus musculus) mm8                  MySQL     NIH mirror of UCSC Genome Browser
05 Dec 2017  ANNOVAR                                          ANNOVAR   /fdb/annovar/current
05 Dec 2017  1000 Genomes                                     BAM       /fdb/1000genomes/ftp/data/
05 Dec 2017  SwissProt                                        Fasta     /fdb/fastadb/swissprot.aa.fas
05 Dec 2017  Protein Data Bank                                Fasta     /fdb/fastadb/pdb.aa.fas
04 Dec 2017  Protein Data Bank                                Blast     /fdb/blastdb/pdbaa
04 Dec 2017  SwissProt                                        Blast     /fdb/blastdb/swissprot
04 Dec 2017  Protein Data Bank                                Blast     /fdb/blastdb/pdbnt
03 Dec 2017  NCBI nt                                          Blast     /fdb/blastdb/nt
02 Dec 2017  EST - others                                     Blast     /fdb/blastdb/est_others
01 Dec 2017  Rat Genome (Rattus norvegicus) rn4               MySQL     NIH mirror of UCSC Genome Browser
01 Dec 2017  Dog Genome (Canis familiaris)                    MySQL     NIH mirror of UCSC Genome Browser
01 Dec 2017  HTGs                                             Blast     /fdb/blastdb/htgs
28 Nov 2017  Refseq Other Genomic                             Fasta     /fdb/fastadb/ref.other.genomic.fas
24 Nov 2017  NCBI nr                                          Blast     /fdb/blastdb/nr
16 Nov 2017  1000 Genomes                                     VCF       /fdb/1000genomes/
15 Nov 2017  Protein Data Bank                                PDB       /pdb/pdb
20 Oct 2017  Drosophila genome (Drosophila melanogaster) fb5  MySQL     NIH mirror of UCSC Genome Browser