Biowulf High Performance Computing at the NIH
Application updates in the last 3 months
To see all versions available for any application, use module avail application_name
All centrally-installed applications are listed on the Applications page
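As a small illustration of the module avail command mentioned above, the sketch below wraps it in a shell function. The function name list_versions and the example application name snakemake are placeholders chosen for illustration, and the fallback branch assumes the script may be run on a machine where the module command is not defined.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list every installed version of one application.
# "list_versions" and the example name "snakemake" are illustrative only.
app="snakemake"

list_versions() {
    if type module >/dev/null 2>&1; then
        # On a system with environment modules, print the available versions.
        module avail "$1"
    else
        # Off the cluster the module command is usually not defined.
        echo "module command not found; run this on a system with environment modules" >&2
        return 127
    fi
}

# "|| true" keeps the script going when the module command is unavailable.
list_versions "$app" || true
```

Replace snakemake with any application name from the list below to see the versions installed for it.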
Updated Application
31 Mar 2020 lefse updated to version 1.0.8
LEfSe (Linear discriminant analysis Effect Size) determines the features (organisms, clades, operational taxonomic units, genes, or functions) most likely to explain differences between classes by coupling standard tests for statistical significance with additional tests encoding biological consistency and effect relevance.
31 Mar 2020 snakemake updated to version 5.13.0
Snakemake aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern domain-specific specification language (DSL) in Python style. It is well suited for bioinformatic workflows.
31 Mar 2020 salmon updated to version 1.1.0
Salmon is a tool for quantifying the expression of transcripts using RNA-seq data.
31 Mar 2020 Mathematica updated to version 12.1
Mathematica is an interactive system for doing mathematical computation. It performs numerical, symbolic and graphical computations, and incorporates a high-level programming language.
31 Mar 2020 Matlab updated to version 2020a
MATLAB is an interactive software package for scientific and engineering numeric computation. MATLAB integrates numerical analysis, matrix computation, signal processing, and graphics in an environment where problems and solutions are expressed just as they are written mathematically.
31 Mar 2020 cpdf updated to version 2.3.1
Coherent PDF tools
31 Mar 2020 pandoc updated to version 2.9.2.1
Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library.
31 Mar 2020 picard updated to version 2.22.2
Picard comprises Java-based command-line utilities that manipulate SAM files, and a Java API (SAM-JDK) for creating new programs that read and write SAM files. Both SAM text format and SAM binary (BAM) format are supported.
30 Mar 2020 R updated to version 3.6.3
R (the R Project) is a language and environment for statistical computing and graphics. R is similar to S, and provides a wide variety of statistical and graphical techniques (linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering, ...).
30 Mar 2020 breseq updated to version 0.35.1
breseq is a computational pipeline for finding mutations relative to a reference sequence in short-read DNA re-sequencing data. It is intended for haploid microbial genomes (<20 Mb).
30 Mar 2020 htseq updated to version 0.11.4
HTSeq is a Python package that provides infrastructure to process data from high-throughput sequencing assays.
30 Mar 2020 humann2 updated to version 2.8.1
HUMAnN is a pipeline for efficiently and accurately profiling the presence/absence and abundance of microbial pathways in a community from metagenomic or metatranscriptomic sequencing data (typically millions of short DNA/RNA reads).
27 Mar 2020 screen updated to version 4.01
Screen is a full-screen window manager that multiplexes a physical terminal between several processes, typically interactive shells.
27 Mar 2020 nanopolish updated to version 0.12.5
nanopolish is a software package for signal-level analysis of Oxford Nanopore sequencing data. Nanopolish can calculate an improved consensus sequence for a draft genome assembly, detect base modifications, call SNPs and indels with respect to a reference genome and more.
27 Mar 2020 vg updated to version 1.21.0
Tools for working with genome variation graphs
27 Mar 2020 ricopili updated to version 2019_Jun_25.001
RICOPILI stands for Rapid Imputation and COmputational PIpeLIne for GWAS.
27 Mar 2020 peddy updated to version 0.4.6
peddy is used to compare sex and familial relationships given in a PED file with those inferred from a VCF file
26 Mar 2020 pychopper updated to version 2.3.1
Pychopper v2 is a tool to identify, orient and trim full-length Nanopore cDNA reads. The tool is also able to rescue fused reads.
26 Mar 2020 binlorry updated to version 1.3.1
BinLorry is a tool for binning and filtering sequencing reads into distinct files. Reads can be binned and filtered by any attributes encoded in their headers, documented in a CSV file or by length.
26 Mar 2020 cellxgene updated to version 0.15.0
cellxgene (pronounced "cell-by-gene") is an interactive data explorer for single-cell transcriptomics datasets, such as those coming from the Human Cell Atlas.
26 Mar 2020 MAJIQ updated to version 2.1-patched
Modeling Alternative Junction Inclusion Quantification. MAJIQ and Voila are two software packages that together define, quantify, and visualize local splicing variations (LSV) from RNA-Seq data.
26 Mar 2020 cutadapt updated to version 2.9
cutadapt removes adapter sequences from DNA high-throughput sequencing data. This is usually necessary when the read length of the machine is longer than the molecule that is sequenced, such as in microRNA data.
26 Mar 2020 GATK updated to version 4.1.6.0
GATK, from the Broad Institute, is first a structured software library that makes it easy to write efficient analysis tools using next-generation sequencing data, and second a suite of tools for working with human medical resequencing projects such as 1000 Genomes and The Cancer Genome Atlas. These tools include depth-of-coverage analyzers, a quality score recalibrator, a SNP/indel caller and a local realigner.
26 Mar 2020 mothur updated to version 1.44.0
mothur is a tool for analyzing 16S rRNA gene sequences generated on multiple platforms as part of microbial ecology projects.
26 Mar 2020 globus-cli updated to version 1.12.0
Globus command line interface
26 Mar 2020 golang updated to version 1.14.1
The Go programming language
26 Mar 2020 nextflow updated to version 20.01.0
Data-driven computational pipelines
26 Mar 2020 Julia updated to version 1.4.0
A high-level, dynamic language for technical computing
25 Mar 2020 gvcfgenotyper updated to version 2019.02.26
A utility for merging and genotyping Illumina-style GVCFs.
24 Mar 2020 bali-phy updated to version 3.5
BAli-Phy is MCMC software developed by Ben Redelings with Marc Suchard for simultaneous Bayesian estimation of alignment and phylogeny (and other parameters). It handles generic Bayesian modeling via probabilistic programming.
24 Mar 2020 rseqc updated to version 3.0.1
RSeQC comprehensively evaluates RNA-seq datasets generated from clinical tissues or other well-annotated organisms such as mouse, fly and yeast.
24 Mar 2020 fastqc updated to version 0.11.9
FastQC provides quality control functions for next-generation sequencing data.
24 Mar 2020 subread updated to version 2.0.0
High-performance read alignment, quantification and mutation discovery
24 Mar 2020 VarScan updated to version 2.4.3
A platform-independent, technology-independent software tool for identifying SNPs and indels in massively parallel sequencing of individual and pooled samples.
24 Mar 2020 viennarna updated to version 2.4.14
RNA Secondary Structure Prediction and Comparison
23 Mar 2020 CSD updated to version 2020
The Cambridge Structural Database is the world repository of small molecule crystal structures.
23 Mar 2020 fmriprep updated to version 20.0.5
A Robust Preprocessing Pipeline for fMRI Data
23 Mar 2020 cmdstan updated to version 2.21.0
Command line interface to stan
20 Mar 2020 flye updated to version 2.7
Fast and accurate de novo assembler for single molecule sequencing reads
20 Mar 2020 bowtie2 updated to version 2.4.1
A version of bowtie that's particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes
20 Mar 2020 deepvariant updated to version 0.9.0
DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
20 Mar 2020 MCL updated to version 14-137
MCL implements the Markov cluster algorithm. Among its applications is the assignment of proteins into families based on precomputed sequence similarity information. This approach does not suffer from the problems that normally hinder other protein sequence clustering algorithms, such as the presence of multi-domain proteins, promiscuous domains and fragmented proteins.
20 Mar 2020 SOAPdenovo-Trans updated to version 1.04
SOAPdenovo-Trans is a de novo transcriptome assembler designed specifically for RNA-Seq. On transcriptome datasets from rice and mouse, it provides higher contiguity, lower redundancy and faster execution than other popular transcriptome assemblers.
20 Mar 2020 kallisto updated to version 0.46.2
kallisto is a program for quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads. It is based on the novel idea of pseudoalignment for rapidly determining the compatibility of reads with targets, without the need for alignment.
19 Mar 2020 rilseq updated to version 0.74
RILseq computational protocol
19 Mar 2020 LongRanger updated to version 2.2.2
Long Ranger is a set of analysis pipelines that processes GemCode sequencing output to align reads and call and phase SNPs, indels, and structural variants. Loupe is a genome browser designed to visualize the Linked-Read data produced by the 10x Chromium Platform.
19 Mar 2020 genometools updated to version 1.6.1
A collection of bioinformatic tools
19 Mar 2020 shapeit updated to version 4.1.3
SHAPEIT is a fast and accurate haplotype inference software
19 Mar 2020 QIIME updated to version 2-2020.2
QIIME is an open source software package for comparison and analysis of microbial communities, primarily based on high-throughput amplicon sequencing data (such as SSU rRNA) generated on a variety of platforms, but also supporting analysis of other types of data (such as shotgun metagenomic data).
19 Mar 2020 OpenSlide updated to version 3.4.1
OpenSlide is a C library for reading and manipulating digital slides of diverse vendor formats. It provides a simple interface to read whole-slide images (also known as virtual slides). OpenSlide has been used in digital pathology projects.
19 Mar 2020 BEAST updated to versions 1.10.4 and 2.6.2
BEAST (Bayesian Evolutionary Analysis Sampling Trees) is a cross-platform program for Bayesian MCMC analysis of molecular sequences.
19 Mar 2020 Huygens updated to version 19.10
Huygens is software for image restoration, deconvolution, resolution and noise reduction. It can process images from all current optical microscopes, including wide-field, confocal, Nipkow (scanning disk confocal), multiple-photon, and 4Pi microscopes.
18 Mar 2020 Canu updated to version 2.0
Canu is a fork of the Celera Assembler designed for high-noise single-molecule sequencing (such as the PacBio RSII or Oxford Nanopore MinION). Canu will correct the reads, then trim suspicious regions (such as remaining SMRTbell adapter), then assemble the corrected and cleaned reads into unitigs.
18 Mar 2020 ChromHMM updated to version 1.20
ChromHMM is software for learning and characterizing chromatin states.
17 Mar 2020 tandemtools updated to version current
Tool for assessing/improving assembly quality in extra-long tandem repeats
17 Mar 2020 deeptools updated to version 3.4.1
deepTools is a suite of user-friendly tools for the visualization, quality control and normalization of data from deep-sequencing experiments.
13 Mar 2020 PyMOL updated to version 2.3.0
A comprehensive molecular visualization product for rendering and animating 3D molecular structures.
13 Mar 2020 OpenBabel updated to version 3.0.0
Open Babel is a chemical toolbox designed to speak the many languages of chemical data.
10 Mar 2020 HTGTSrep updated to version 9fe74ff
A pipeline for comprehensive analysis of HTGTS-Rep-seq.
10 Mar 2020 m2clust updated to version 0.0.7
m2clust provides an elegant clustering approach to find clusters in data sets with different density and resolution.
6 Mar 2020 git updated to version 2.25.1
Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
4 Mar 2020 GimmeMotifs updated to version 0.14.3
GimmeMotifs is a pipeline for transcription factor motif analysis written in Python. It incorporates an ensemble of computational tools to predict motifs de novo from ChIP-sequencing data. Similar redundant motifs are compared using the weighted information content similarity score and clustered using an iterative procedure. A comprehensive output report is generated with several different evaluation metrics to compare and evaluate the results.
3 Mar 2020 varsim updated to version 0.8.5
A high-fidelity simulation validation framework for high-throughput genome sequencing with cancer applications
2 Mar 2020 raxml-ng updated to version 0.9.0
RAxML-NG is a phylogenetic tree inference tool which uses the maximum-likelihood (ML) optimality criterion. Its search heuristic is based on iteratively performing a series of Subtree Pruning and Regrafting (SPR) moves, which allows it to quickly navigate to the best-known ML tree. Successor to raxml.
2 Mar 2020 plink updated to version 2.3-alpha
PLINK is a whole-genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.
26 Feb 2020 hisat updated to version 2.2.1.0-sra2.10.4
HISAT is a fast and sensitive spliced alignment program which uses Hierarchical Indexing for Spliced Alignment of Transcripts.
26 Feb 2020 sratoolkit updated to version 2.10.4
The NCBI SRA Toolkit enables reading ("dumping") of sequencing files from the SRA database and writing ("loading") files into the .sra format.
26 Feb 2020 ncbi-vdb updated to version 2.10.4
The SRA Toolkit and SDK from NCBI is a collection of tools and libraries for using data in the INSDC Sequence Read Archives.
26 Feb 2020 ncbi-ngs updated to version 2.10.4
NCBI's NGS is a new, domain-specific API for accessing reads, alignments and pileups produced from Next Generation Sequencing
26 Feb 2020 vasttools updated to version 2.3.0
A toolset for profiling alternative splicing events in RNA-Seq data.
25 Feb 2020 SimNIBS updated to version 3.1.1
SimNIBS is a free software package for the Simulation of Non-invasive Brain Stimulation. It allows for realistic calculations of the electric field induced by transcranial magnetic stimulation (TMS) and transcranial direct current stimulation (tDCS).
25 Feb 2020 DeepLabCut updated to version 2.1
DeepLabCut is an open source toolbox that builds on a state-of-the-art human pose estimation algorithm. It allows training of a deep neural network by using limited training data to precisely track user-defined features, so that the human labeling accuracy will be matched.
25 Feb 2020 htgts updated to version 2
High-Throughput Genome-Wide Translocation Sequencing pipeline
25 Feb 2020 uropa updated to version 3.5.0
UROPA is a command line based tool for genomic region annotation
24 Feb 2020 sysbench updated to version 1.0.11
sysbench is a scriptable multi-threaded benchmark tool based on LuaJIT. It is most frequently used for database benchmarks, but can also be used to create arbitrarily complex workloads that do not involve a database server.
24 Feb 2020 PRSice updated to version 2.2.12
PRSice is a Polygenic Risk Score software for calculating, applying, evaluating and plotting the results of polygenic risk scores (PRS) analyses.
24 Feb 2020 TORTOISE updated to version 3.2.0
TORTOISE (Tolerably Obsessive Registration and Tensor Optimization Indolent Software Ensemble) is a software package for processing diffusion MRI data.
24 Feb 2020 boost updated to version 1.72
Boost provides free peer-reviewed portable C++ source libraries. Boost libraries are intended to be widely useful, and usable across a broad spectrum of applications.
24 Feb 2020 parallel updated to version 20200222
GNU parallel is a shell tool for executing jobs in parallel using one or more computers.
20 Feb 2020 DNAnexus updated to version 0.290.1
DNAnexus is a cloud-based commercial solution for next-generation sequence analysis and visualization. It has a command-line interface (CLI) which can be used to log in to the DNAnexus platform, upload and navigate data, and launch analyses.
20 Feb 2020 YOLO updated to version 20200211
YOLO is a new approach to object detection. Prior work on object detection repurposed classifiers to perform detection. Instead, YOLO frames object detection as a regression problem to spatially separated bounding boxes and associated class probabilities.
20 Feb 2020 AdmixTools updated to version 6.0
ADMIXTOOLS is a software package that supports formal tests of whether admixture occurred, and makes it possible to infer admixture proportions and dates.
20 Feb 2020 OpenCRAVAT updated to version 1.7.0
OpenCRAVAT is a new open source, scalable decision support system for variant and gene prioritization. It includes a modular resource catalog to maximize community and developer involvement, and as a result the catalog is being actively developed and growing every month. Resources made available via the store are well-suited for analysis of cancer, as well as Mendelian and complex diseases.
19 Feb 2020 singularity updated to version 3.5.3
Singularity is a container platform focused on supporting "Mobility of Compute". It allows users to emulate and share custom Linux environments, allowing for the creation of self-contained development stacks.
19 Feb 2020 WISExome updated to version 20180814
WISExome is a tool that implements a within-sample comparison approach to CNV detection. It correctly identifies known pathogenic CNVs.
18 Feb 2020 fusioninspector updated to version 2.2.1
In silico Validation of Fusion Transcript Predictions
15 Feb 2020 nodejs updated to version 12.16.0
Node.js is a JavaScript runtime built on Chrome's V8 JavaScript engine. Module name: nodejs.
13 Feb 2020 Comsol updated to version 55
The COMSOL Multiphysics engineering simulation software environment facilitates all steps in the modeling process: defining your geometry, meshing, specifying your physics, solving, and then visualizing your results.
13 Feb 2020 VEP updated to version 99
VEP (Variant Effect Predictor) determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.
12 Feb 2020 mocat2 updated to version current
A package for analyzing metagenomics datasets
12 Feb 2020 Genome Browser updated to version 393
The Genome Browser Mirror Fragments is a mirror of the UCSC Genome Browser. The URL is https://hpcnihapps.cit.nih.gov/genome. Users can also directly access the MySQL databases, supporting files, and a large number of associated executables.
11 Feb 2020 ORCA updated to version 4.2.1
ORCA is an ab initio, DFT, and semi-empirical SCF-MO package.
11 Feb 2020 encode-atac-seq-pipeline updated to version 1.6.1
This pipeline is designed for automated end-to-end quality control and processing of ATAC-seq or DNase-seq data.
10 Feb 2020 vt updated to version 0.57721
vt is a variant tool set that discovers short variants from Next Generation Sequencing data.
6 Feb 2020 diamond updated to version 0.9.30
DIAMOND is a new high-throughput program for aligning DNA reads or protein sequences against a protein reference database such as NR, at up to 20,000 times the speed of BLAST, with high sensitivity.
6 Feb 2020 crystfel updated to version 0.9.0
CrystFEL is a suite of programs for processing diffraction data acquired serially in a snapshot manner, such as when using the technique of Serial Femtosecond Crystallography (SFX) with a free-electron laser source.
6 Feb 2020 dashing updated to version 0.4.2
Fast and accurate genomic distances using HyperLogLog
6 Feb 2020 cmake updated to version 3.16.4
CMake is a family of tools designed to build, test and package software.
6 Feb 2020 hint updated to version 2.27
A computational method to detect CNVs and translocations from Hi-C data.
5 Feb 2020 pdf2svg updated to version 0.2.3
A simple PDF to SVG converter using the Poppler and Cairo libraries.
4 Feb 2020 biom-format updated to version 2.1.8
A tool (and library) to manipulate Biological Observation Matrix (BIOM) Format files
4 Feb 2020 baracus updated to version 1.1.4
Baracus predicts brain age, based on data from Freesurfer. It combines data from cortical thickness, cortical surface area, and subcortical information
3 Feb 2020 guppy updated to version 3.4.5
Local accelerated basecalling for Nanopore data
31 Jan 2020 busco updated to version 4.0.2
BUSCO completeness assessments employ sets of Benchmarking Universal Single-Copy Orthologs from OrthoDB (www.orthodb.org) to provide quantitative measures of the completeness of genome assemblies, annotated gene sets, and transcriptomes in terms of expected gene content.
31 Jan 2020 netpbm updated to version 10.86.8
Netpbm is a toolkit for manipulation of graphic images, including conversion of images between a variety of different formats. There are over 300 separate tools in the package including converters for about 100 graphics formats. Examples of the sort of image manipulation we're talking about are: Shrinking an image by 10%; Cutting the top half off of an image; Making a mirror image; Creating a sequence of images that fade from one image to another.
30 Jan 2020 Hail updated to version 0.2.31
Hail is an open-source, scalable framework for exploring and analyzing genomic data.
30 Jan 2020 vsearch updated to version 2.14.2
VSEARCH supports de novo and reference based chimera detection, clustering, full-length and prefix dereplication, rereplication, reverse complementation, masking, all-vs-all pairwise global alignment, exact and global alignment searching, shuffling, subsampling and sorting. It also supports FASTQ file analysis, filtering, conversion and merging of paired-end reads.
26 Jan 2020 BioGANs updated to version 20191230
BioGANs is a novel application of Generative Adversarial Networks (GAN) to the synthesis of cells imaged by fluorescence microscopy. It allows inferring the correlation between the spatial patterns of different fluorescent proteins that reflect important biological functions. The synthesized images capture these relationships, which are relevant for biological applications.
24 Jan 2020 deepsea updated to version 0.94c
DeepSEA is a deep learning-based algorithmic framework for predicting the chromatin effects of sequence alterations with single nucleotide sensitivity. DeepSEA can accurately predict the epigenetic state of a sequence, including transcription factor binding, DNase I sensitivities and histone marks in multiple cell types, and further utilizes this capability to predict the chromatin effects of sequence variants and prioritize regulatory variants.
23 Jan 2020 gurobi updated to version 9.0.0
Gurobi is a mathematical optimization solver. It is a commercial product developed by gurobi.com. On Biowulf, Gurobi is licensed for use by the members of the CDSL_Gurobi_users group only. It is installed in /data/CDSL_Gurobi_users and is not accessible by any other users. A token license server, running on Biowulf, manages the Gurobi license.
22 Jan 2020 repeatmodeler updated to version 2.0.1
RepeatModeler is a de novo transposable element (TE) family identification and modeling package. RepeatModeler assists in automating the runs of the various algorithms given a genomic database, clustering redundant results, refining and classifying the families and producing a high quality library of TE families suitable for use with RepeatMasker and ultimately for submission to the Dfam database (http://dfam.org).
22 Jan 2020 MAKER updated to version 2.31.10
MAKER is a portable and easily configurable genome annotation pipeline. Its purpose is to allow smaller eukaryotic and prokaryotic genome projects to independently annotate their genomes and to create genome databases. MAKER identifies repeats, aligns ESTs and proteins to a genome, produces ab-initio gene predictions and automatically synthesizes these data into gene annotations having evidence-based quality values.
21 Jan 2020 guidance updated to version 2.02
GUIDANCE is meant to be used for weighting, filtering or masking unreliably aligned positions in sequence alignments before subsequent analysis.
17 Jan 2020 Gaussian updated to version G16-C01
Gaussian is a connected system of programs for performing semiempirical and ab initio molecular orbital (MO) calculations.
16 Jan 2020 Xvfb updated to version 1.19.6
X virtual frame buffer.
15 Jan 2020 LDpred updated to version 1.0.11
LDpred is a Python based software package that adjusts GWAS summary statistics for the effects of linkage disequilibrium (LD).
15 Jan 2020 medaka updated to version 0.11.4
medaka is a tool to create a consensus sequence from nanopore sequencing data. This task is performed using neural networks applied to a pileup of individual sequencing reads against a draft assembly.
15 Jan 2020 cromwell updated to version 48
A Workflow Management System geared towards scientific workflows.
13 Jan 2020 trinity updated to version 2.9.0
Trinity, developed at the Broad Institute and the Hebrew University of Jerusalem, represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data.
10 Jan 2020 cgpBattenberg updated to version 3.5.3
Detect subclonality and copy number in matched NGS data
10 Jan 2020 freebayes updated to version 1.3.2
Bayesian haplotype-based polymorphism discovery and genotyping
10 Jan 2020 mosdepth updated to version 0.2.8
Fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing.
10 Jan 2020 scallop updated to version 0.10.4
Scallop is a reference-based transcript assembler.
10 Jan 2020 Connectome Workbench updated to version 1.4.2
Tools to browse, download, explore, and analyze data from the Human Connectome Project (HCP). Allows users to compare their own data to that of the HCP.
10 Jan 2020 bbtools updated to version 38.75
An extensive set of bioinformatics tools including bbmap (short read aligner), bbnorm (kmer based normalization), dedupe (deduplication and clustering of unaligned reads), reformat (formatting and trimming reads) and many more.
10 Jan 2020 atom updated to version 1.42.0
A hackable text editor for the 21st Century.
8 Jan 2020 Qt updated to version 5.14.0
Qt is a cross-platform application framework that is used for developing application software that can be run on various software and hardware platforms with little or no change in the underlying codebase, while still being a native application with native capabilities and speed.
8 Jan 2020 Schrodinger updated to version 2019.4
A limited number of Schrödinger applications are available on the Biowulf cluster through the Molecular Modeling Interest Group. Most are available through the Maestro GUI.
6 Jan 2020 PartekFlow updated to version 9.0.19.1222
Web interface designed specifically for the analysis needs of next generation sequencing applications including RNA, small RNA, and DNA sequencing.
3 Jan 2020 pigz updated to version 2.4
pigz (parallel implementation of gzip) is a fully functional replacement for gzip that exploits multiple processors and multiple cores to the hilt when compressing data.
3 Jan 2020 metal updated to version 2018-08-28
The METAL software is designed to facilitate meta-analysis of large datasets (such as several whole genome scans) in a convenient, rapid and memory efficient manner.
2 Jan 2020 racon updated to version 1.4.3
Ultrafast consensus module for raw de novo genome assembly of long uncorrected reads.
Scientific Databases updated in the last 3 months
For a full list of scientific databases available on the NIH HPC systems, see this page

Updated     | Database                         | Format    | Type | Location
31 Mar 2020 | NCBI Taxonomy                    | taxonomy  |      | /fdb/taxonomy
30 Mar 2020 | Cambridge Structural Database    | CSD       | 3-D  | /usr/local/apps/CSD
24 Mar 2020 | NCBI nt                          | Blast     | Nuc  | /fdb/blastdb/nt
24 Mar 2020 | NCBI nr                          | Blast     | Prot | /fdb/blastdb/nr
24 Mar 2020 | Protein Data Bank                | Blast     | Prot | /fdb/blastdb/pdbaa
24 Mar 2020 | SwissProt                        | Blast     | Prot | /fdb/blastdb/swissprot
16 Feb 2020 | Mouse Genome (Mus musculus) mm8  | MySQL     |      | NIH mirror of UCSC Genome Browser
12 Feb 2020 | Mouse Genome GRCm38.p6 proteins  | Blast     | Prot | /fdb/blastdb/GRCm38.p6.prot
12 Feb 2020 | Mouse Genome GRCm38.p6           | Blast     | Nuc  | /fdb/blastdb/GRCm38.p6
12 Feb 2020 | Human Genome GRCh38.p13 proteins | Blast     | Prot | /fdb/blastdb/GRCh38.p13.prot
12 Feb 2020 | Human Genome GRCh38.p13          | Blast     | Nuc  | /fdb/blastdb/GRCh38.p13
12 Feb 2020 | Mouse Genome GRCm38.p6 proteins  | Fasta     | Prot | /fdb/genome/GRCm38.p6
12 Feb 2020 | Mouse Genome GRCm38.p6           | Fasta     | Nuc  | /fdb/genome/GRCm38.p6
12 Feb 2020 | Human Genome hg19                | Fasta     | Nuc  | /fdb/genome/human-feb2009/
10 Feb 2020 | NCBI nr                          | Blast_v4  | Prot | /fdb/blastdb/v4/nr
29 Jan 2020 | NCBI nt                          | Blast_v4  | Nuc  | /fdb/blastdb/v4/nt
21 Jan 2020 | ANNOVAR                          | ANNOVAR   |      | /fdb/annovar/current