Biowulf High Performance Computing at the NIH
Application updates in the last 3 months
To see all versions available for any application, use module avail application_name
All centrally-installed applications are listed on the Applications page
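As a rough illustration of the module commands mentioned above, the following sketch lists the installed versions of one application and then loads a specific version. The application name and version are taken from the samtools entry on this page. The `module_spec` helper and the `name/version` naming pattern are assumptions made for illustration, not something this page specifies, so check the actual module names with `module avail` first.

```shell
# Hypothetical helper (not part of the module system): build a
# "name/version" string, assuming modules follow that naming pattern.
module_spec() {
    printf '%s/%s\n' "$1" "$2"
}

app="samtools"    # application name, as listed in the 14 May 2020 entry
version="1.10"    # version taken from the same entry

if command -v module >/dev/null 2>&1; then
    # List every installed version of the application.
    # Module systems often print this listing to stderr, so merge the streams.
    module avail "$app" 2>&1
    # Load one specific version rather than the default.
    module load "$(module_spec "$app" "$version")"
else
    echo "module command not available in this shell"
fi
```

Running `module load` with only the application name would normally pick the site default version; pinning the version, as sketched here, keeps a workflow reproducible when the default changes after an update.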
Updated Application
2 Jun 2020 xcpengine updated to version 1.2.1
xcpEngine performs denoising and estimation of functional connectivity on fMRI datasets.
2 Jun 2020 rilseq updated to version 0.75
RILseq computational protocol
2 Jun 2020 PRSice updated to version 2.3.1
PRSice is a Polygenic Risk Score software for calculating, applying, evaluating and plotting the results of polygenic risk scores (PRS) analyses.
1 Jun 2020 fusioninspector updated to version 2.3.0
In silico Validation of Fusion Transcript Predictions
1 Jun 2020 HMMRATAC updated to version 1.2.10
HMMRATAC peak caller for ATAC-seq data
1 Jun 2020 Hail updated to version 0.2.43
Hail is an open-source, scalable framework for exploring and analyzing genomic data.
1 Jun 2020 stringtie updated to version 2.0.3b
StringTie is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts. It is primarily a genome-guided transcriptome assembler, although it can borrow algorithmic techniques from de novo genome assembly to help with transcript assembly.
1 Jun 2020 vscode updated to version 1.45.1
Free source code editor with many utilities for Python, Julia and others.
1 Jun 2020 cutadapt updated to version 2.10
cutadapt removes adapter sequences from DNA high-throughput sequencing data. This is usually necessary when the read length of the machine is longer than the molecule that is sequenced, such as in microRNA data.
1 Jun 2020 nvchecker updated to version 1.6
nvchecker (short for new version checker) is for checking if a new version of some software has been released.
31 May 2020 nodejs updated to version 12.17.0
Node.js is a JavaScript runtime built on Chrome's V8 JavaScript engine. module name: nodejs
28 May 2020 mrtrix updated to version 3.0.0
MRtrix provides a large suite of tools for image processing, analysis and visualisation, with a focus on the analysis of white matter using diffusion-weighted MRI.
28 May 2020 VEP updated to version 100.2
VEP (Variant Effect Predictor) determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.
28 May 2020 ChIPseeqer updated to version 2.1
ChIPseeqer is an integrative, comprehensive, fast and user-friendly computational framework for in-depth analysis of ChIP-seq datasets. It combines several computational tools in order to create easily customized workflows that can be adapted to the user’s needs and objectives.
28 May 2020 TelomereHunter updated to version 1.1.0
TelomereHunter is a software for the detailed characterization of telomere maintenance mechanism footprints in the genome. The tool is implemented for the analysis of large cancer genome cohorts and provides a variety of diagnostic diagrams as well as machine-readable output for subsequent analysis.
28 May 2020 sratoolkit updated to version 2.10.7
The NCBI SRA Toolkit enables reading ("dumping") of sequencing files from the SRA database and writing ("loading") files into the .sra format.
26 May 2020 rscape updated to version 1.5.2
R-scape (RNA Significant Covariation Above Phylogenetic Expectation) is a program that, given a multiple sequence alignment of RNA sequences, looks for statistically significant covariation.
26 May 2020 boost updated to version 1.73
Boost provides free peer-reviewed portable C++ source libraries. Boost libraries are intended to be widely useful, and usable across a broad spectrum of applications.
22 May 2020 Freesurfer updated to version 7.1.0
Freesurfer is a set of automated tools for reconstruction of the brain's cortical surface from structural MRI data, and overlay of functional MRI data onto the reconstructed surface.
21 May 2020 EVcouplings updated to version 0.0.5
Helps to predict protein structure, function and mutations using evolutionary sequence covariation.
21 May 2020 intarna updated to version 3.2.0
IntaRNA is a program for the fast and accurate prediction of interactions between two RNA molecules.
21 May 2020 rnaview updated to version current
The RNAView program generates 2-dimensional displays of RNA/DNA secondary structures with tertiary interactions.
21 May 2020 rnastructure updated to version 6.2
RNAstructure is a complete package for RNA and DNA secondary structure prediction and analysis. It includes algorithms for secondary structure prediction, including facility to predict base pairing probabilities. It also can be used to predict bimolecular structures and can predict the equilibrium binding affinity of an oligonucleotide to a structured RNA target.
20 May 2020 vcflib updated to version 1.0.1
A simple C++ library for parsing and manipulating VCF files, plus many command-line utilities.
20 May 2020 slamdunk updated to version 0.4.3
SlamDunk is a novel, fully automated software tool for robust, scalable and reproducible SLAMseq data analysis.
18 May 2020 PyCharm updated to version 2018.3.5
A Python IDE
18 May 2020 nanopolish updated to version 0.13.2
nanopolish is a software package for signal-level analysis of Oxford Nanopore sequencing data. Nanopolish can calculate an improved consensus sequence for a draft genome assembly, detect base modifications, call SNPs and indels with respect to a reference genome, and more.
15 May 2020 gen3-client updated to version 2020.05
The gen3-client provides an easy-to-use, command-line interface for uploading and downloading data files to and from a Gen3 data commons from the terminal or command prompt.
14 May 2020 fusioncatcher updated to version 1.20
FusionCatcher searches for novel/known somatic fusion genes, translocations, and chimeras in RNA-seq data (paired-end or single-end reads from Illumina NGS platforms like Solexa/HiSeq/NextSeq/MiSeq) from diseased samples.
14 May 2020 samtools updated to version 1.10
The samtools package now provides samtools, bcftools, tabix, and the underlying htslib library.
14 May 2020 crystfel updated to version 0.9.0.5ae3043d
CrystFEL is a suite of programs for processing diffraction data acquired serially in a snapshot manner, such as when using the technique of Serial Femtosecond Crystallography (SFX) with a free-electron laser source.
14 May 2020 PartekFlow updated to version 9.0.20.0510
Web interface designed specifically for the analysis needs of next generation sequencing applications including RNA, small RNA, and DNA sequencing.
13 May 2020 tvb updated to version 1.5.8
The Virtual Brain (TVB) scientific library has the purpose of offering modern tools to the Neurosciences community, for computing, simulating and analyzing functional and structural data of human brains
13 May 2020 GATK updated to version 4.1.7.0
GATK, from the Broad Institute, is first a structured software library that makes it easy to write efficient analysis tools for next-generation sequencing data, and second a suite of tools for working with human medical resequencing projects such as 1000 Genomes and The Cancer Genome Atlas. These tools include depth of coverage analyzers, a quality score recalibrator, a SNP/indel caller and a local realigner.
13 May 2020 AMON updated to version 1.0.0
AMON (Annotation of Metabolite Origins via Networks) is an open-source bioinformatics application that can be used to (1) annotate which compounds in the metabolome could have been produced by the bacteria present or by the host, (2) evaluate the pathway enrichment of host versus microbial metabolites, and (3) visualize which compounds may have been produced by host versus microbial enzymes in KEGG pathway maps.
11 May 2020 spm12 updated to version r7771
The (S)tatistical (P)ara(M)etric application analyzes brain imaging data.
11 May 2020 Rosetta updated to version 2020.11
The Rosetta++ software suite can perform de novo protein structure predictions, identify low free energy sequences for target protein backbones, predict the structure of a protein-protein complex from the individual structures of the monomer components, incorporate NMR data into the basic Rosetta protocol to accelerate the process of NMR structure prediction, and more...
8 May 2020 QTLtools updated to version 1.2
A tool set for molecular QTL discovery and analysis. It lets users go from raw sequence data to a collection of molecular quantitative trait loci (QTLs) in a few easy-to-perform steps.
6 May 2020 lafter updated to version 1.1
LAFTER is a local filter for single particle TEM reconstructions.
5 May 2020 Phenix updated to version 1.18-3855
PHENIX is a software suite for the automated determination of macromolecular structures using X-ray crystallography and other methods.
5 May 2020 MySQL updated to version 8.0.20
MySQL is an open-source relational database management system.
5 May 2020 MIPAV updated to version 10.0.0
The MIPAV (Medical Image Processing, Analysis, and Visualization) application enables quantitative analysis and visualization of medical images of numerous modalities such as PET, MRI, CT, or microscopy.
5 May 2020 m2clust updated to version 0.0.8
m2clust provides an elegant clustering approach to find clusters in data sets with different density and resolution.
4 May 2020 netpbm updated to version 10.86.12
Netpbm is a toolkit for manipulation of graphic images, including conversion of images between a variety of different formats. There are over 300 separate tools in the package including converters for about 100 graphics formats. Examples of the sort of image manipulation we're talking about are: Shrinking an image by 10%; Cutting the top half off of an image; Making a mirror image; Creating a sequence of images that fade from one image to another.
1 May 2020 taiji updated to version v1.2.0
The Taiji software is a versatile genomics data analysis pipeline. It can be used to analyze ATAC-seq, RNA-seq, single cell ATAC-seq and Drop-seq data.
1 May 2020 GEMMA updated to version 0.98.1
GEMMA is the software implementing the Genome-wide Efficient Mixed Model Association algorithm for a standard linear mixed model and some of its close relatives for genome-wide association studies (GWAS).
30 Apr 2020 bbtools updated to version 38.82
An extensive set of bioinformatics tools including bbmap (short read aligner), bbnorm (kmer based normalization), dedupe (deduplication and clustering of unaligned reads), reformat (formatting and trimming reads) and many more.
30 Apr 2020 angsd updated to version 0.930
ANGSD is a software for analyzing next generation sequencing data. The software can handle a number of different input types from mapped reads to imputed genotype probabilities. Most methods take genotype uncertainty into account instead of basing the analysis on called genotypes. This is especially useful for low and medium depth data. The software is written in C++ and has been used on large sample sizes.
29 Apr 2020 hyphy updated to version 2.5.11
HyPhy (Hypothesis Testing using Phylogenies) is an open-source software package for the analysis of genetic sequences (in particular the inference of natural selection) using techniques in phylogenetics, molecular evolution, and machine learning.
29 Apr 2020 abyss updated to version 2.2.4
ABySS stands for Assembly By Short Sequences, a de novo, parallel, paired-end sequence assembler. The parallel version is implemented using MPI and is capable of assembling larger genomes.
29 Apr 2020 globus_sdk updated to version 1.9.0
Pythonic interface to Globus REST APIs, including the Transfer API and the Globus Auth API
29 Apr 2020 MotionCor2 updated to version 1.3.1
MotionCor2 is a multi-GPU accelerated program that provides iterative, patch-based motion detection combining spatial and temporal constraints and dose weighting for both single particle and tomographic cryo-electron microscopy images.
29 Apr 2020 mothur updated to version 1.44.1
mothur is a tool for analyzing 16S rRNA gene sequences generated on multiple platforms as part of microbial ecology projects.
28 Apr 2020 BRASS updated to version 6.3.1
BRASS analyses one or more related BAM files of paired-end sequencing to determine potential rearrangement breakpoints.
28 Apr 2020 cmdstan updated to version 2.23.0
Command line interface to stan
27 Apr 2020 diamond updated to version 0.9.32
DIAMOND is a new high-throughput program for aligning DNA reads or protein sequences against a protein reference database such as NR, at up to 20,000 times the speed of BLAST, with high sensitivity.
27 Apr 2020 infernal updated to version 1.1.3
Package for searching DNA sequence databases for RNA structure and sequence similarities
24 Apr 2020 chipseq_pipeline updated to version 1.4.0.1
AQUAS Transcription Factor and Histone ChIP-Seq processing pipeline. The AQUAS pipeline is based off the ENCODE (phase-3) transcription factor and histone ChIP-seq pipeline specifications (by Anshul Kundaje)
24 Apr 2020 Scramble updated to version 0.0.20190211.82c78b9
Scramble is a mobile element insertion (MEI) detection tool. It identifies clusters of soft clipped reads in a BAM file, builds consensus sequences, aligns to representative L1Ta, AluYa5, and SVA-E sequences, and outputs MEI calls.
23 Apr 2020 epic2 updated to version 0.0.41
Chip-Seq broad peak/domain finder based on SICER
22 Apr 2020 vireosnp updated to version 0.3.2
Demultiplexing pooled scRNA-seq data without genotype reference
22 Apr 2020 seqkit updated to version 0.12.1
A cross-platform toolkit for FASTA/Q file manipulation
22 Apr 2020 cellsnp updated to version 0.1.7
Pileup biallelic SNPs from single-cell and bulk RNA-seq data
22 Apr 2020 salmon updated to version 1.2.0
a tool for quantifying the expression of transcripts using RNA-seq data.
21 Apr 2020 igblast updated to version 1.16.0
IgBlast is a sequence analysis tool for immunoglobulin variable domains.
21 Apr 2020 IGVTools updated to version 2.8.2
IGVTools provides utilities for working with ASCII file formats used by the Integrative Genomics Viewer. The files can be sorted, tiled, indexed, and counted.
21 Apr 2020 sickle updated to version 1.33
A windowed adaptive trimming tool for FASTQ files using quality
21 Apr 2020 git updated to version 2.26.2
Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
20 Apr 2020 vcf2maf updated to version 1.6.18
A smarter, more reproducible, and more configurable tool for converting a VCF to a MAF.
17 Apr 2020 kaiju updated to version 1.7.3
Kaiju is a program for the taxonomic classification of high-throughput sequencing reads, e.g., Illumina or Roche/454, from whole-genome sequencing of metagenomic DNA.
16 Apr 2020 deepvariant updated to version 0.10.0
DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
16 Apr 2020 rmblast updated to version 2.10.0
RMBlast is a RepeatMasker-compatible version of the standard NCBI blastn program. RMBlast supports RepeatMasker searches by adding a few necessary features to the stock NCBI blastn program.
16 Apr 2020 IGV updated to version 2.8.2
The Integrative Genomics Viewer is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets.
16 Apr 2020 mixcr updated to version 3.0.13
MiXCR is a universal software for fast and accurate analysis of T- and B- cell receptor repertoire sequencing data.
16 Apr 2020 medaka updated to version 0.12.1
medaka is a tool to create a consensus sequence from nanopore sequencing data. This task is performed using neural networks applied from a pileup of individual sequencing reads against a draft assembly.
15 Apr 2020 google-cloud-sdk updated to version 289.0.0
Google Cloud SDK is a set of tools that you can use to manage resources and applications hosted on Google Cloud Platform. These include the gcloud, gsutil, and bq command line tools. See docs at https://cloud.google.com/sdk/docs/how-to.
Type 'module load google-cloud-sdk' to use on Biowulf.
14 Apr 2020 fastq_screen updated to version 0.14.0
FastQ Screen allows you to screen a library of sequences in FastQ format against a set of sequence databases so you can see if the composition of the library matches with what you expect.
14 Apr 2020 csvkit updated to version 1.0.5
csvkit is a suite of command-line tools for converting to and working with CSV, the king of tabular file formats.
14 Apr 2020 TaxonKit updated to version 0.5.0
A Cross-platform and Efficient NCBI Taxonomy Toolkit
14 Apr 2020 tantan updated to version 22
A tool to mask low complexity and short period tandem repeats
14 Apr 2020 Genome Browser updated to version 396
The Genome Browser Mirror Fragments is a mirror of the UCSC Genome Browser. The URL is https://hpcnihapps.cit.nih.gov/genome. Users can also directly access the MySQL databases, the supporting files, and a large number of associated executables.
14 Apr 2020 Schrodinger updated to version 2020.1
A limited number of Schrödinger applications are available on the Biowulf cluster through the Molecular Modeling Interest Group. Most are available through the Maestro GUI.
13 Apr 2020 novocraft updated to version 4.02.02
Package includes aligner for single-ended and paired-end reads from the Illumina Genome Analyser. Novoalign finds global optimum alignments using full Needleman-Wunsch algorithm with affine gap penalties.
13 Apr 2020 tandemtools updated to version current
Tool for assessing/improving assembly quality in extra-long tandem repeats
13 Apr 2020 parabricks updated to version 2.5.0
The Parabricks Genomics Analysis Toolkit provides GPU-accelerated genomic analysis (GPU-accelerated GATK).
10 Apr 2020 deepmedic updated to version 0.8.0
This project aims to offer easy access to Deep Learning for segmentation of structures of interest in biomedical 3D scans. It is a system that allows the easy creation of a 3D Convolutional Neural Network, which can be trained to detect and segment structures if corresponding ground truth labels are provided for training. The system processes NIFTI images, making its use straightforward for many biomedical tasks.
10 Apr 2020 tetoolkit updated to version 2.1.4
A package for including transposable elements in differential enrichment analysis of sequencing datasets.
10 Apr 2020 LocScale updated to version 0.1
LocScale is a reference-based local amplitude scaling tool using prior model information to improve contrast of cryo-EM density maps. It can be helpful in the common case of resolution variation in the 3D reconstruction and it can be used as an alternative to other commonly applied map sharpening methods.
9 Apr 2020 macs updated to version 2.2.6
Model-based Analysis of ChIP-Seq (MACS) for data from short-read sequencers such as the Genome Analyzer (Illumina / Solexa). MACS empirically models the length of the sequenced ChIP fragments, which tend to be shorter than sonication or library construction size estimates, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome sequence, allowing for more sensitive and robust prediction.
9 Apr 2020 bedops updated to version 2.4.39
BEDOPS is a suite of tools to address common questions raised in genomic studies, mostly with regard to overlap and proximity relationships between data sets. BEDOPS aims to be scalable and flexible, facilitating the efficient and accurate analysis and management of large-scale genomic data.
9 Apr 2020 homer updated to version 4.11.1
HOMER (Hypergeometric Optimization of Motif EnRichment) is a suite of tools for Motif Discovery and ChIP-Seq analysis.
9 Apr 2020 interproscan updated to version 5.42-78.0
InterProScan is the software package that allows sequences (protein and nucleic) to be scanned against InterPro's signatures. Signatures are predictive models, provided by several different databases, that make up the InterPro consortium.
8 Apr 2020 flappie updated to version 2.1.3
Basecall Fast5 reads using flip-flop basecalling.
8 Apr 2020 PhIP-Stat updated to version 0.3.0
PhIP-Stat is a set of analysis tools for PhIP-seq experiments. It allows for processing of PhIP-seq raw data.
8 Apr 2020 deeptools updated to version 3.4.2
deepTools is a suite of user-friendly tools for the visualization, quality control and normalization of data from deep-sequencing DNA sequencing experiments.
8 Apr 2020 prokka updated to version 1.14.6
Prokka is a software tool for the rapid annotation of prokaryotic genomes.
7 Apr 2020 MonoVar updated to version 20200403
Monovar is a statistical method for detecting and genotyping single-nucleotide variants in single-cell data. It takes into account allelic dropout, false-positive errors and nonuniformity of coverage.
6 Apr 2020 mriqc updated to version 0.15.2
MRIQC is an MRI quality control tool
3 Apr 2020 roary updated to version 3.13.0
Roary is a high speed stand alone pan genome pipeline, which takes annotated assemblies in GFF3 format (produced by Prokka) and calculates the pan genome.
3 Apr 2020 hicpro updated to version 2.11.4
HiC-Pro: An optimized and flexible pipeline for Hi-C data processing
3 Apr 2020 delly updated to version 0.8.3
DELLY is an integrated structural variant prediction method that can detect deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read massively parallel sequencing data. It uses paired-ends and split-reads to sensitively and accurately delineate genomic rearrangements throughout the genome.
2 Apr 2020 ldsc updated to version 3d0c4464
ldsc is a command line tool for estimating heritability and genetic correlation from GWAS summary statistics. ldsc also computes LD Scores.
2 Apr 2020 fastp updated to version 0.20.1
A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance.
31 Mar 2020 lefse updated to version 1.0.8
LEfSe (Linear discriminant analysis Effect Size) determines the features (organisms, clades, operational taxonomic units, genes, or functions) most likely to explain differences between classes by coupling standard tests for statistical significance with additional tests encoding biological consistency and effect relevance.
31 Mar 2020 snakemake updated to version 5.13.0
Snakemake aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern domain specific specification language (DSL) in python style. It is well suited for bioinformatic workflows.
31 Mar 2020 Mathematica updated to version 12.1
Mathematica is an interactive system for doing mathematical computation. It performs numerical, symbolic and graphical computations, and incorporates a high-level programming language.
31 Mar 2020 Matlab updated to version 2020a
MATLAB is an interactive software package for scientific and engineering numeric computation. MATLAB integrates numerical analysis, matrix computation, signal processing, and graphics in an environment where problems and solutions are expressed just as they are written mathematically.
31 Mar 2020 cpdf updated to version 2.3.1
Coherent PDF tools
31 Mar 2020 pandoc updated to version 2.9.2.1
Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library.
31 Mar 2020 picard updated to version 2.22.2
Picard comprises Java-based command-line utilities that manipulate SAM files, and a Java API (SAM-JDK) for creating new programs that read and write SAM files. Both SAM text format and SAM binary (BAM) format are supported.
30 Mar 2020 R updated to version 3.6.3
R (the R Project) is a language and environment for statistical computing and graphics. R is similar to S, and provides a wide variety of statistical and graphical techniques (linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering, ...).
30 Mar 2020 breseq updated to version 0.35.1
breseq is a computational pipeline for finding mutations relative to a reference sequence in short-read DNA re-sequencing data. It is intended for haploid microbial genomes (<20 Mb).
30 Mar 2020 htseq updated to version 0.11.4
HTSeq is a Python package that provides infrastructure to process data from high-throughput sequencing assays.
30 Mar 2020 humann2 updated to version 2.8.1
HUMAnN is a pipeline for efficiently and accurately profiling the presence/absence and abundance of microbial pathways in a community from metagenomic or metatranscriptomic sequencing data (typically millions of short DNA/RNA reads).
27 Mar 2020 screen updated to version 4.01
Screen is a full-screen window manager that multiplexes a physical terminal between several processes, typically interactive shells.
27 Mar 2020 vg updated to version 1.21.0
Tools for working with genome variation graphs
27 Mar 2020 ricopili updated to version 2019_Jun_25.001
RICOPILI stands for Rapid Imputation and COmputational PIpeLIne for GWAS.
27 Mar 2020 peddy updated to version 0.4.6
peddy is used to compare sex and familial relationships given in a PED file with those inferred from a VCF file
26 Mar 2020 pychopper updated to version 2.3.1
Pychopper v2 is a tool to identify, orient and trim full-length Nanopore cDNA reads. The tool is also able to rescue fused reads.
26 Mar 2020 binlorry updated to version 1.3.1
BinLorry is a tool for binning and filtering sequencing reads into distinct files. Reads can be binned and filtered by any attributes encoded in their headers, documented in a CSV file or by length.
26 Mar 2020 cellxgene updated to version 0.15.0
cellxgene (pronounced "cell-by-gene") is an interactive data explorer for single-cell transcriptomics datasets, such as those coming from the Human Cell Atlas.
26 Mar 2020 MAJIQ updated to version 2.1-patched
Modeling Alternative Junction Inclusion Quantification. MAJIQ and Voila are two software packages that together define, quantify, and visualize local splicing variations (LSV) from RNA-Seq data.
26 Mar 2020 globus-cli updated to version 1.12.0
Globus command line interface
26 Mar 2020 golang updated to version 1.14.1
The Go programming language
26 Mar 2020 nextflow updated to version 20.01.0
Data-driven computational pipelines
26 Mar 2020 Julia updated to version 1.4.0
high level, dynamic language for technical computing
25 Mar 2020 gvcfgenotyper updated to version 2019.02.26
A utility for merging and genotyping Illumina-style GVCFs.
24 Mar 2020 bali-phy updated to version 3.5
BAli-Phy is MCMC software developed by Ben Redelings with Marc Suchard for simultaneous Bayesian estimation of alignment and phylogeny (and other parameters). It handles generic Bayesian modeling via probabilistic programming.
24 Mar 2020 rseqc updated to version 3.0.1
RSeQC comprehensively evaluates RNA-seq datasets generated from clinical tissues or other well annotated organisms such as mouse, fly and yeast.
24 Mar 2020 fastqc updated to version 0.11.9
FastQC provides quality control functions for next-generation sequencing data.
24 Mar 2020 subread updated to version 2.0.0
High-performance read alignment, quantification and mutation discovery
24 Mar 2020 VarScan updated to version 2.4.3
A platform-independent, technology-independent software tool for identifying SNPs and indels in massively parallel sequencing of individual and pooled samples.
24 Mar 2020 viennarna updated to version 2.4.14
RNA Secondary Structure Prediction and Comparison
23 Mar 2020 CSD updated to version 2020
The Cambridge Structural Database is the world repository of small molecule crystal structures.
23 Mar 2020 fmriprep updated to version 20.0.5
A Robust Preprocessing Pipeline for fMRI Data
20 Mar 2020 flye updated to version 2.7
Fast and accurate de novo assembler for single molecule sequencing reads
20 Mar 2020 bowtie2 updated to version 2.4.1
A version of bowtie that's particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes
20 Mar 2020 MCL updated to version 14-137
MCL implements the Markov cluster algorithm. Among its applications is the assignment of proteins into families based on precomputed sequence similarity information. This approach does not suffer from the problems that normally hinder other protein sequence clustering algorithms, such as the presence of multi-domain proteins, promiscuous domains and fragmented proteins.
20 Mar 2020 SOAPdenovo-Trans updated to version 1.04
SOAPdenovo-Trans is a de novo transcriptome assembler designed specifically for RNA-Seq. Evaluated on transcriptome datasets from rice and mouse, it provides higher contiguity, lower redundancy and faster execution than other popular transcriptome assemblers.
20 Mar 2020 kallisto updated to version 0.46.2
kallisto is a program for quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads. It is based on the novel idea of pseudoalignment for rapidly determining the compatibility of reads with targets, without the need for alignment.
19 Mar 2020 LongRanger updated to version 2.2.2
Long Ranger is a set of analysis pipelines that processes GemCode sequencing output to align reads and call and phase SNPs, indels, and structural variants. Loupe is a genome browser designed to visualize the Linked-Read data produced by the 10x Chromium Platform.
19 Mar 2020 genometools updated to version 1.6.1
collection of bioinformatic tools
19 Mar 2020 shapeit updated to version 4.1.3
SHAPEIT is a fast and accurate haplotype inference software
19 Mar 2020 QIIME updated to version 2-2020.2
QIIME is an open source software package for comparison and analysis of microbial communities, primarily based on high-throughput amplicon sequencing data (such as SSU rRNA) generated on a variety of platforms, but also supporting analysis of other types of data (such as shotgun metagenomic data).
19 Mar 2020 OpenSlide updated to version 3.4.1
OpenSlide is a C library for reading and manipulating digital slides of diverse vendor formats. It provides a simple interface to read whole-slide images (also known as virtual slides). OpenSlide has been used in digital pathology projects.
19 Mar 2020 BEAST updated to version 1.10.4, 2.6.2
BEAST (Bayesian Evolutionary Analysis Sampling Trees) is a cross-platform program for Bayesian MCMC analysis of molecular sequences.
19 Mar 2020 Huygens updated to version 19.10
Huygens is software for image restoration, deconvolution, resolution improvement and noise reduction. It can process images from all current optical microscopes, including wide-field, confocal, Nipkow (scanning disk confocal), multiple-photon, and 4Pi microscopes.
18 Mar 2020 Canu updated to version 2.0
Canu is a fork of the Celera Assembler designed for high-noise single-molecule sequencing (such as the PacBio RSII or Oxford Nanopore MinION). Canu will correct the reads, then trim suspicious regions (such as remaining SMRTbell adapter), then assemble the corrected and cleaned reads into unitigs.
18 Mar 2020 Intel Compiler Suite updated to version 2019.4.243
Intel Compiler Suite for Linux. Includes C/C++ and Fortran compilers. Also includes the Math Kernel Library, Integrated Performance Primitives and Thread Building Blocks.
18 Mar 2020 ChromHMM updated to version 1.20
ChromHMM is software for learning and characterizing chromatin states.
13 Mar 2020 PyMOL updated to version 2.3.0
A comprehensive molecular visualization product for rendering and animating 3D molecular structures.
13 Mar 2020 OpenBabel updated to version 3.0.0
Open Babel is a chemical toolbox designed to speak the many languages of chemical data.
10 Mar 2020 HTGTSrep updated to version 9fe74ff
A pipeline for comprehensive analysis of HTGTS-Rep-seq.
Scientific Databases updated in last 3 months
For a full list of scientific databases available on the NIH HPC systems, see this page

Updated      Database                            Format    Location
02 Jun 2020  NCBI Taxonomy                       taxonomy  /fdb/taxonomy
31 May 2020  Betacoronavirus                     Blast     /fdb/blastdb/Betacoronavirus
26 May 2020  NCBI nt                             Blast     /fdb/blastdb/nt
26 May 2020  NCBI nr                             Blast     /fdb/blastdb/nr
26 May 2020  Protein Data Bank                   Blast     /fdb/blastdb/pdbaa
26 May 2020  SwissProt                           Blast     /fdb/blastdb/swissprot
20 May 2020  Human Genome hg19                   Fasta     /fdb/genome/human-feb2009/
08 May 2020  COSMIC                              VCF       /fdb/COSMIC
05 May 2020  ANNOVAR                             ANNOVAR   /fdb/annovar/current
19 Apr 2020  Rat Genome (Rattus norvegicus) rn4  MySQL     NIH mirror of UCSC Genome Browser
30 Mar 2020  Cambridge Structural Database       CSD       /usr/local/apps/CSD