High-Performance Computing at the NIH
Application updates in the last 3 months
To see all versions available for any application, use module avail application_name
All centrally-installed applications are listed on the Applications page
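As a rough illustration of the `module avail` lookup described above, the sketch below builds the command and runs it from Python. The helper names `module_command` and `run_module` are hypothetical, and the sketch assumes that `module` is defined as a shell function in a bash login shell, which is common for Environment Modules and Lmod but may differ on a given system.

```python
import shlex
import subprocess


def module_command(subcommand, application):
    """Build a `module` command line, quoting the application name."""
    return f"module {subcommand} {shlex.quote(application)}"


def run_module(subcommand, application):
    """Run the command in a bash login shell and return its combined output.

    Assumption: `module` is only available inside a login shell. Many
    module systems write the `avail` listing to stderr, so both streams
    are returned together.
    """
    result = subprocess.run(
        ["bash", "-lc", module_command(subcommand, application)],
        capture_output=True,
        text=True,
    )
    return result.stdout + result.stderr


# Example (only meaningful on a system where `module` exists):
#   print(run_module("avail", "samtools"))
```

Quoting the name with `shlex.quote` keeps an unexpected application name from being interpreted by the shell.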
Updated Application
25 Sep 2018 PEER updated to version 1.3
PEER stands for probabilistic estimation of expression residuals. It is a collection of Bayesian approaches to infer hidden determinants and their effects from gene expression profiles using factor analysis methods.
25 Sep 2018 cutadapt updated to version 1.18
cutadapt removes adapter sequences from DNA high-throughput sequencing data. This is usually necessary when the read length of the machine is longer than the molecule that is sequenced, such as in microRNA data.
24 Sep 2018 albacore updated to version 2.3.3
Basecaller for Oxford Nanopore Technologies (ONT) sequencing data.
24 Sep 2018 drompa updated to version 3.5.0
Peak-calling, Visualization, Normalization and QC for ChIP-seq analysis
24 Sep 2018 flashpca updated to version 2.0
FlashPCA performs fast principal component analysis (PCA) of single nucleotide polymorphism (SNP) data, similar to smartpca from EIGENSOFT (http://www.hsph.harvard.edu/alkes-price/software/) and shellfish (https://github.com/dandavison/shellfish). FlashPCA is based on the https://github.com/yixuan/spectra/ library.
24 Sep 2018 rmats updated to version 4.0.2
MATS is a computational tool to detect differential alternative splicing events from RNA-Seq data.
20 Sep 2018 PRSice updated to version 2.1.3.beta
PRSice is a Polygenic Risk Score software for calculating, applying, evaluating and plotting the results of polygenic risk scores (PRS) analyses.
19 Sep 2018 encode-atac-seq-pipeline updated to version 1.0
This pipeline is designed for automated end-to-end quality control and processing of ATAC-seq or DNase-seq data.
18 Sep 2018 cromwell updated to version 34
A Workflow Management System geared towards scientific workflows.
18 Sep 2018 Eagle updated to version 2.4
Eagle performs reference-based haplotype phasing. It attains high accuracy across a broad range of cohort sizes by efficiently leveraging information from large external reference panels (such as the Haplotype Reference Consortium; HRC) using a new data structure based on the positional Burrows-Wheeler transform.
18 Sep 2018 mriqc updated to version 0.14.2
MRIQC is an MRI quality control tool
17 Sep 2018 Julia updated to version 1.0.0
Julia is a high-level, dynamic language for technical computing.
13 Sep 2018 fastqc updated to version 0.11.6
FastQC provides quality control functions for next-generation sequencing data.
13 Sep 2018 AnnotSV updated to version 1.1.1
AnnotSV is a program designed for annotating Structural Variations (SV). This tool compiles functionally, regulatory and clinically relevant information and aims at providing annotations useful to i) interpret the potential pathogenicity of SVs and ii) filter out potential false-positive SVs.
12 Sep 2018 nodejs updated to version 8.12.0
Node.js is a JavaScript runtime built on Chrome's V8 JavaScript engine. module name: nodejs
7 Sep 2018 Canvas updated to version 1.38
Canvas is a tool for calling copy number variants (CNVs) from human DNA sequencing data.
7 Sep 2018 rockhopper updated to version 2.0.3
Rockhopper is a comprehensive and user-friendly system for computational analysis of bacterial RNA-seq data. As input, Rockhopper takes RNA sequencing reads output by high-throughput sequencing technology (FASTQ, QSEQ, FASTA, SAM, or BAM files).
7 Sep 2018 HLA-PRG-LA updated to version 0.85.45c4fea
Stands for HLA PRG, linear approximation. The basic idea is to seed graph alignments with linear alignments to the sequences that the graph consists of.
7 Sep 2018 RapMap updated to version 0.5.0
RapMap is a tool for rapid, sensitive and accurate read mapping via quasi-mapping. It is capable of mapping sequencing reads to a target transcriptome substantially faster than existing alignment tools.
5 Sep 2018 eager updated to version 1.92
EAGER: efficient ancient genome reconstruction
5 Sep 2018 h5utils updated to version 1.13.1
h5utils is a set of utilities for visualization and conversion of scientific data in the free, portable HDF5 format. Type 'module load h5utils' to access the executables (e.g. h5topng)
5 Sep 2018 dotnet-sdk updated to version 2.1.301
Microsoft .NET SDK and runtime
29 Aug 2018 U-Net updated to version 20180704
U-Net is an image segmentation tool. It relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization.
29 Aug 2018 DeepLab updated to version 20180816
DeepLab is a semantic image segmentation tool. It makes use of Deep Convolutional Networks, Atrous (or Dilated) Convolution, and Fully Connected Conditional Random Fields.
28 Aug 2018 DanQ updated to version 20180828
DanQ is a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences
24 Aug 2018 DNAnexus updated to version 0.260.0
DNAnexus is a cloud-based commercial solution for next-generation sequence analysis and visualization. It has a command-line interface (CLI) which can be used to log in to the DNAnexus platform, upload and navigate data, and launch analyses.
23 Aug 2018 cmake updated to version 3.12.1
CMake is a family of tools designed to build, test and package software.
20 Aug 2018 MALDER updated to version 1.0
MALDER is a version of ALDER modified to allow multiple admixture events. ALDER computes the weighted linkage disequilibrium (LD) statistic for making inferences about population admixture, as described in: Loh P-R, Lipson M, Patterson N, Moorjani P, Pickrell JK, Reich D, and Berger B. Inferring Admixture Histories of Human Populations Using Linkage Disequilibrium. Genetics, 2013.
20 Aug 2018 STREAM updated to version 20180816
STREAM stands for Single-cell Trajectories Reconstruction, Exploration And Mapping of omics data. It is an interactive pipeline capable of disentangling and visualizing complex branching trajectories from both single-cell transcriptomic and epigenomic data.
15 Aug 2018 blobtools updated to version 1.0
A modular command-line solution for visualisation, quality control and taxonomic partitioning of genome datasets.
14 Aug 2018 WISExome updated to version 20180814
WISExome is a tool that implements a within-sample comparison approach to CNV detection. It correctly identifies known pathogenic CNVs.
14 Aug 2018 spruce updated to version 20180606
SPRUCE (Somatic Phylogeny Reconstruction using Combinatorial Enumeration) is an algorithm for inferring the clonal evolution of single-nucleotide and copy-number variants given multi-sample bulk tumor sequencing data.
14 Aug 2018 minimac updated to version 4 (1.0.1)
minimac is a low memory, computationally efficient implementation of the MaCH algorithm for genotype imputation. It is designed to work on phased genotypes and can handle very large reference panels with hundreds or thousands of haplotypes. 'mini' refers to the low amount of computational resources it needs.
14 Aug 2018 mtoolbox updated to version 1.1
A bioinformatics pipeline aimed at the analysis of mitochondrial DNA (mtDNA) in high throughput sequencing studies.
13 Aug 2018 mango updated to version 4.0.1
Mango (Multi-image Analysis GUI) is a viewer for medical research images. It provides analysis tools and a user interface to navigate image volumes.
13 Aug 2018 fmriprep updated to version 1.1.4
A Robust Preprocessing Pipeline for fMRI Data
10 Aug 2018 vsearch updated to version 2.8.1
VSEARCH supports de novo and reference based chimera detection, clustering, full-length and prefix dereplication, rereplication, reverse complementation, masking, all-vs-all pairwise global alignment, exact and global alignment searching, shuffling, subsampling and sorting. It also supports FASTQ file analysis, filtering, conversion and merging of paired-end reads.
8 Aug 2018 cellranger updated to version 2.2.0
Cell Ranger is a set of analysis pipelines that processes Chromium single cell 3’ RNA-seq output to align reads, generate gene-cell matrices and perform clustering and gene expression analysis.
8 Aug 2018 breseq updated to version 0.33.0
breseq is a computational pipeline for finding mutations relative to a reference sequence in short-read DNA re-sequencing data. It is intended for haploid microbial genomes (<20 Mb).
6 Aug 2018 king updated to version 2.1.4
KING is a toolset to explore genotype data from a genome-wide association study (GWAS) or a sequencing project. KING can be used to check family relationship and flag pedigree errors by estimating kinship coefficients and inferring IBD segments for all pairwise relationships.
6 Aug 2018 genomestrip updated to version 2.00.1833
Genome STRiP (Genome STRucture In Populations) is a suite of tools for discovering and genotyping structural variations using sequencing data. The methods are designed to detect shared variation using data from multiple individuals.
6 Aug 2018 VCF-kit updated to version 0.1.6
VCF-kit is a collection of utility tools for processing and analyzing VCF (variant call format) files, including primer generation for variant validation, dendrogram production, genotype imputation from sequence data in linkage studies, and additional tools to be used by statistical and population geneticists.
3 Aug 2018 usearch updated to version 11.0.667
USEARCH is a unique sequence analysis tool with thousands of users world-wide. USEARCH offers search and clustering algorithms that are often orders of magnitude faster than BLAST.
2 Aug 2018 VEP updated to version 93
VEP (Variant Effect Predictor) determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.
1 Aug 2018 samtools updated to version 1.9
The samtools package now provides samtools, bcftools, tabix, and the underlying htslib library.
31 Jul 2018 SAMsrcV3 updated to version 20180713-c5e1042
Synthetic Aperture Magnetometry: the SAMsrcV3 suite implements the latest advances in MEG source localization.
27 Jul 2018 Blast updated to version 2.8.0+alpha
NCBI's famous sequence database searching program which compares a nucleotide or protein query sequence against all sequences in a database.
27 Jul 2018 IMOD updated to version 4.10.10
IMOD is a set of image processing, modeling and display programs used for tomographic reconstruction and for 3D reconstruction of EM serial sections and optical sections.
25 Jul 2018 CTF updated to version 6.1.14-beta
The CTF MEG software has two main roles: to provide a human-machine interface to the CTF MEG electronics to collect MEG and/or EEG data, and to provide a tool for reviewing and (to a limited extent) analyzing the MEG and/or EEG data acquired by the CTF MEG system.
25 Jul 2018 Few-Shot-ssl updated to version 20180723
The Few-Shot semi-supervised learning (few-shot-ssl) package implements learning algorithms that specifically allow for better generalization on problems with small labeled training sets.
25 Jul 2018 xenome updated to version 1.0.1
xenome is a tool for classifying reads from xenograft sources.
25 Jul 2018 Accurity updated to version 20180724
Accurity is a tool for inference of tumor purity, tumor cell ploidy and absolute allelic copy numbers from tumor-normal WGS data.
24 Jul 2018 Solar updated to version 8.4.1
SOLAR is a program for multipoint, oligogenic, variance component linkage analysis in pedigrees of arbitrary size and complexity (Almasy L; Blangero J, 1998).
19 Jul 2018 bamcmp updated to version 20180719
bamcmp is a tool for deconvolving host and graft reads. It allows an accurate identification of the contaminating host reads when analyzing DNA-Seq and RNA-Seq data from patient-derived xenograft and circulating tumor cell–derived explant models.
19 Jul 2018 homer updated to version 4.10.1
HOMER (Hypergeometric Optimization of Motif EnRichment) is a suite of tools for Motif Discovery and ChIP-Seq analysis.
19 Jul 2018 salmon updated to version 0.11.0
Salmon is a tool for quantifying the expression of transcripts using RNA-seq data.
19 Jul 2018 purge_haplotigs updated to version 0~20180710.f4fd019
purge_haplotigs is a pipeline to help with curating heterozygous diploid genome assemblies.
19 Jul 2018 c3d updated to version 1.1.0
C3D is a command-line tool for converting 3D images between common file formats. The tool also includes a growing list of commands for image manipulation, such as thresholding and resampling.
19 Jul 2018 PolyRNN++ updated to version 20180718
Manually labeling datasets with object masks is extremely time consuming. PolyRNN++ produces polygonal annotations of objects interactively using humans-in-the-loop. It employs a Convolutional Neural Network encoder trained with Reinforcement Learning.
18 Jul 2018 svclone updated to version 0.2.2-13-ge402c3f
A computational method for inferring the cancer cell fraction of tumour structural variation from whole-genome sequencing data.
17 Jul 2018 DEXTR-PyTorch updated to version 20180710
DEXTR-PyTorch implements a new approach (Deep Extreme Cut) to image labeling where extreme points in an object (left-most, right-most, top, bottom pixels) are used as input to obtain precise object segmentation for images and videos. This is done by adding an extra channel to the image in the input of a convolutional neural network (CNN), which contains a Gaussian centered in each of the extreme points. The CNN learns to transform this information into a segmentation of an object that matches those extreme points.
16 Jul 2018 Connectome Workbench updated to version 1.3.1
Tools to browse, download, explore, and analyze data from the Human Connectome Project (HCP). Allows users to compare their own data to that of the HCP.
10 Jul 2018 gnuplot updated to version 5.2.2
Gnuplot is a portable command-line driven graphing utility to visualize mathematical functions and data interactively, and can support many non-interactive uses such as web scripting.
Type 'gnuplot' to run, or 'module avail gnuplot' to see other available versions.
10 Jul 2018 scanpy updated to version 1.2.2
Scanpy is a scalable toolkit for analyzing single-cell gene expression data. It includes preprocessing, visualization, clustering, pseudotime and trajectory inference and differential expression testing. The Python-based implementation efficiently deals with datasets of more than one million cells.
10 Jul 2018 boost updated to version 1.67
Boost provides free peer-reviewed portable C++ source libraries. Boost libraries are intended to be widely useful, and usable across a broad spectrum of applications.
10 Jul 2018 mothur updated to version 1.40.5
mothur is a tool for analyzing 16S rRNA gene sequences generated on multiple platforms as part of microbial ecology projects.
6 Jul 2018 cmtk updated to version 3.3.1
CMTK is a software toolkit for computational morphometry of biomedical images. CMTK provides a set of command line tools for processing and I/O.
5 Jul 2018 BRASS updated to version 6.1.2
BRASS analyses one or more related BAM files of paired-end sequencing to determine potential rearrangement breakpoints.
5 Jul 2018 BEAST updated to version 1.10.0, 2.4.7
BEAST (Bayesian Evolutionary Analysis Sampling Trees) is a cross-platform program for Bayesian MCMC analysis of molecular sequences.
5 Jul 2018 jo updated to version 1.1
A small utility to create JSON objects from command line arguments.
3 Jul 2018 tailseeker updated to version 3.1.7-6-g34b5ba9
Tailseeker is the official pipeline for TAIL-seq, which measures poly(A) tail lengths and 3′-end modifications with Illumina SBS sequencers.
3 Jul 2018 singularity updated to version 2.5.2
Singularity is a container platform focused on supporting "Mobility of Compute". It allows users to emulate and share custom Linux environments, allowing for the creation of self-contained development stacks.
29 Jun 2018 seqoutbias updated to version 1.1.3
Correct aligned HTS read counts for enzyme bias and mappability.
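Each entry in the list above starts with a line of the form "date name updated to version X". The following is a minimal sketch of how such lines could be parsed into structured records. The function name `parse_update` and the returned dictionary keys are illustrative choices, and the date parsing assumes an English locale for the month abbreviations.

```python
import re
from datetime import datetime

# Matches lines such as "25 Sep 2018 cutadapt updated to version 1.18".
ENTRY = re.compile(r"^(\d{1,2} \w{3} \d{4}) (.+) updated to version (.+)$")


def parse_update(line):
    """Return a dict for an update line, or None for a description line."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    # %b is locale-dependent; an English locale is assumed here.
    updated = datetime.strptime(match.group(1), "%d %b %Y").date()
    return {
        "date": updated,
        "application": match.group(2),
        "version": match.group(3),
    }


record = parse_update("25 Sep 2018 cutadapt updated to version 1.18")
print(record["application"], record["version"], record["date"].isoformat())
```

Description lines do not match the pattern and yield `None`, so a caller can pair each parsed record with the description line that follows it.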
Scientific Databases updated in the last 3 months
For a full list of scientific databases available on the NIH HPC systems, see this page

Updated      Database                            Format    Location
25 Sep 2018  Protein Data Bank                   Fasta     /fdb/fastadb/pdb.nt.fas
25 Sep 2018  NCBI nt                             Fasta     /fdb/fastadb/nt.fas
25 Sep 2018  Mito                                Fasta     /fdb/fastadb/mito.nt.fas
25 Sep 2018  SwissProt                           Fasta     /fdb/fastadb/swissprot.aa.fas
25 Sep 2018  Protein Data Bank                   Fasta     /fdb/fastadb/pdb.aa.fas
25 Sep 2018  Mito                                Fasta     /fdb/fastadb/mito.aa.fas
25 Sep 2018  NCBI nr                             Fasta     /fdb/fastadb/nr.aa.fas
25 Sep 2018  Mito                                Blast     /fdb/blastdb/mito.aa
25 Sep 2018  NCBI Taxonomy                       taxonomy  /fdb/taxonomy
24 Sep 2018  Viral                               Blast     /fdb/blastdb/viral
24 Sep 2018  Protein Data Bank                   Blast     /fdb/blastdb/pdbaa
24 Sep 2018  SwissProt                           Blast     /fdb/blastdb/swissprot
24 Sep 2018  Protein Data Bank                   Blast     /fdb/blastdb/pdbnt
23 Sep 2018  16S Microbial                       Blast     /fdb/blastdb/16SMicrobial
22 Sep 2018  Protein Data Bank                   PDB       /pdb/pdb
19 Sep 2018  Viral                               Blast     /fdb/blastdb/viral
16 Sep 2018  Rat Genome (Rattus norvegicus) rn4  MySQL     NIH mirror of UCSC Genome Browser
12 Sep 2018  EST - others                        Blast     /fdb/blastdb/est_others
10 Sep 2018  HTGs                                Blast     /fdb/blastdb/htgs
07 Sep 2018  Mouse Genome (Mus musculus) mm8     MySQL     NIH mirror of UCSC Genome Browser
03 Sep 2018  NCBI nr                             Blast     /fdb/blastdb/nr
31 Aug 2018  Human Genome hg18                   MySQL     NIH mirror of UCSC Genome Browser
08 Aug 2018  NCBI nt                             Blast     /fdb/blastdb/nt
31 Jul 2018  ANNOVAR                             ANNOVAR   /fdb/annovar/current
31 Jul 2018  PFAM                                PFAM      /fdb/fastadb/pfam
24 Jul 2018  Refseq Other Genomic                Fasta     /fdb/fastadb/ref.other.genomic.fas
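
To show how the Updated, Database, Format and Location columns might be used together, here is a small sketch that looks up locations by format. The `DatabaseUpdate` type and the `locations_by_format` helper are hypothetical names introduced for illustration, and only a handful of rows from the listing above are copied in by hand.

```python
from typing import NamedTuple


class DatabaseUpdate(NamedTuple):
    updated: str
    database: str
    fmt: str
    location: str


# A few rows copied from the listing above (not the complete table).
ROWS = [
    DatabaseUpdate("25 Sep 2018", "NCBI nt", "Fasta", "/fdb/fastadb/nt.fas"),
    DatabaseUpdate("24 Sep 2018", "SwissProt", "Blast", "/fdb/blastdb/swissprot"),
    DatabaseUpdate("03 Sep 2018", "NCBI nr", "Blast", "/fdb/blastdb/nr"),
    DatabaseUpdate("31 Aug 2018", "Human Genome hg18", "MySQL",
                   "NIH mirror of UCSC Genome Browser"),
]


def locations_by_format(rows, fmt):
    """Map database name to location for rows in the given format."""
    wanted = fmt.lower()
    return {row.database: row.location for row in rows if row.fmt.lower() == wanted}


print(locations_by_format(ROWS, "Blast"))
```

The format comparison is case-insensitive because the listing mixes capitalized and lowercase format names (for example "Blast" and "taxonomy").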