Biowulf High Performance Computing at the NIH
Scientific Reference Data

We provide a set of centrally maintained scientific reference databases for Biowulf users. You can search through this data here. To request a new database or an update to an existing one, please contact us at staff@hpc.nih.gov.


Search by keyword: searches through metadata using keywords
Search by filename: searches through filenames where available


Browse Common Databases

Recently Updated:

2022-07-05 | Betacoronavirus | Blast database of Betacoronavirus nucleotide sequences. (Blast database full path and name: /fdb/blastdb/Betacoronavirus)
2022-07-05 | NCBI nt Blast database | NCBI nonredundant comprehensive nucleotide database, compiled from GenBank, RefSeq, TPA and PDB. (Blast database full path and name: /fdb/blastdb/nt)
2022-07-05 | Patent nucleotide sequences Blast db | Patent nucleotide sequences. (Blast database full path and name: /fdb/blastdb/patnt)
2022-07-05 | PDB nucleotide sequences Blast db | Protein Data Bank nucleotide sequences. (Blast database full path and name: /fdb/blastdb/pdbnt)
2022-07-05 | taxonomy | The Taxonomy Database is a curated classification and nomenclature for all of the organisms in the public sequence databases.
2022-07-04 | NCBI nr Blast database | NCBI nonredundant comprehensive protein database, compiled from GenBank CDS translations, PDB, Swiss-Prot, PIR, and PRF. (Blast database full path and name: /fdb/blastdb/nr)
2022-07-02 | I-TASSER ITLIB | I-TASSER Template Library for Protein Structure and Function Prediction.
2022-07-02 | PDB protein sequences Blast db | Protein Data Bank sequences. (Blast database full path and name: /fdb/blastdb/pdbaa)
2022-07-02 | Swissprot Blast database | Curated, highly annotated protein sequence database. (Blast database full path and name: /fdb/blastdb/swissprot)
2022-06-23 | STAR indices | STAR indices built with various assemblies/annotations and overhangs for different versions of STAR. Updated as needed and on request.
2022-06-04 | Standard databases for foldseek | foldseek provides prebuilt databases for AlphafoldDB (Swiss-Prot and Proteome) as well as PDB.
2022-05-24 | T2T Genome Assembly | Genomic data released by the Telomere-to-Telomere Consortium.
2022-05-23 | ClinVar | ClinVar aggregates information about genomic variation and its relationship to human health. The latest monthly XML and weekly VCF files are made available on request.
2022-05-05 | gmap-gsnap_refdata-Mus_musculus/UCSC/mm10 | GMAP: A Genomic Mapping and Alignment Program for mRNA and EST Sequences, and GSNAP: Genomic Short-read Nucleotide Alignment Program
2022-05-05 | gmap-gsnap_refdata-Mus_musculus/UCSC/mm9 | GMAP: A Genomic Mapping and Alignment Program for mRNA and EST Sequences, and GSNAP: Genomic Short-read Nucleotide Alignment Program
2022-05-04 | NCBI SRA Refseq data | NCBI SRA Refseq data
2022-05-03 | Clairvoyante | Clairvoyante implements a multitask five-layer convolutional neural network model for predicting variant type (SNP or indel), zygosity, alternative allele and indel length from aligned reads.
2022-05-02 | biobakery_workflows | bioBakery is a metaomic analysis environment and collection of individual software tools with the capacity to process raw shotgun sequencing data into actionable microbial community feature profiles, summary reports, and publication-ready figures. It includes a collection of preconfigured analysis modules that are also joined into workflows for reproducibility. Each individual module has been developed to perform a particular task, e.g. quantitative taxonomic profiling or statistical analysis.
2022-04-26 | SignalP Model Weights | Signal peptide prediction model weights based on a BERT protein language model encoder and a conditional random field (CRF) decoder.
2022-04-22 | alphafold2 sequence databases and templates | AlphaFold draws on several data sources: BFD, UniRef, MGnify, PDB, etc. Databases are assembled according to the alphafold2 installation instructions and updated in place as newer versions of individual data sources become available. This directory also includes the model weights. Update details are available in our alphafold documentation.
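
The Blast entries in the listing above each give a full database path (for example /fdb/blastdb/nt). As a minimal sketch of how such a path might be handed to a search, the Python helper below assembles a blastn command line for the nucleotide databases. The function name build_blastn_command and the file names query.fa and hits.tsv are hypothetical and chosen only for illustration. -db, -query, -out and -outfmt are standard BLAST+ options, but check them against the BLAST+ version installed on the system before relying on this.

```python
import shlex

# Nucleotide database paths copied from the listing above.
# The protein databases (nr, pdbaa, swissprot) are left out because they
# need a protein search program such as blastp rather than blastn.
NUCLEOTIDE_DBS = {
    "Betacoronavirus": "/fdb/blastdb/Betacoronavirus",
    "nt": "/fdb/blastdb/nt",
    "patnt": "/fdb/blastdb/patnt",
    "pdbnt": "/fdb/blastdb/pdbnt",
}


def build_blastn_command(db_name, query_fasta, out_path):
    """Return an argv list for a blastn search (hypothetical helper)."""
    db_path = NUCLEOTIDE_DBS[db_name]  # raises KeyError for unlisted names
    return [
        "blastn",
        "-db", db_path,
        "-query", query_fasta,
        "-out", out_path,
        "-outfmt", "6",  # tabular output
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    cmd = build_blastn_command("nt", "query.fa", "hits.tsv")
    print(shlex.join(cmd))
```

The helper only builds the argument list. Actually running it (for instance with subprocess.run inside a batch or interactive job) is left out, since how searches should be scheduled is not covered by this listing.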