We provide a set of centrally maintained scientific reference databases for Biowulf users. You can search through this data here. To request a new database or an update to an existing one, please contact us at staff@hpc.nih.gov.
Search by keyword | Searches through metadata using keywords |
Search by filename | Searches through filenames where available |
2024-04-16 | taxonomy | The Taxonomy Database is a curated classification and nomenclature for all of the organisms in the public sequence databases. |
2024-04-02 | TomoTwin Models | Models for the particle picking software TomoTwin |
2024-03-21 | dorado models | Models for the dorado basecaller by ONT |
2024-03-15 | Reference data for the cellranger pipeline | References for the 10x Genomics cellranger pipeline |
2024-03-13 | VEP | VEP determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions. |
2024-03-03 | dfam | The Dfam database is an open collection of Transposable Element DNA sequence alignments, hidden Markov Models (HMMs), consensus sequences, and genome annotations. |
2024-02-19 | I-TASSER ITLIB | I-TASSER Template Library for Protein Structure and Function Prediction |
2024-02-03 | Betacoronavirus | Blast database of Betacoronavirus nucleotide sequences. (Blast database full path and name - /fdb/blastdb/Betacoronavirus) |
2024-02-03 | PDB protein sequences Blast db | Protein Data Bank protein sequences. (Blast database full path and name - /fdb/blastdb/pdbaa) |
2024-02-03 | Swissprot Blast database | Curated, highly-annotated protein sequence database. (Blast database full path and name - /fdb/blastdb/swissprot) |
2024-01-30 | NCBI nt Blast database | NCBI nonredundant comprehensive nucleotide database, compiled from Genbank, Refseq, TPA and PDB. (Blast database full path and name - /fdb/blastdb/nt) |
2024-01-29 | NCBI nr Blast database | NCBI nonredundant comprehensive protein database, compiled from GenBank CDS translations, PDB, Swiss-Prot, PIR, and PRF. (Blast database full path and name - /fdb/blastdb/nr) |
2024-01-27 | PDB nucleotide sequences Blast db | Protein Data Bank nucleotide sequences. (Blast database full path and name - /fdb/blastdb/pdbnt) |
2024-01-25 | ensembl | Ensembl is a genome browser for vertebrate genomes that supports research in comparative genomics, evolution, sequence variation and transcriptional regulation. |
2024-01-21 | colabfold sequence databases | Databases used by colabfold for MSA generation with mmseqs2. |
2024-01-16 | Patent nucleotide sequences Blast db | Patent nucleotide sequences. (Blast database full path and name - /fdb/blastdb/patnt) |
2024-01-12 | VICTOR Reference Data | Reference data for VICTOR application including GRCh37 and GRCh38. Also includes precomputed BayesDel meta-scores. |
2023-12-20 | Illumina iGenomes | Ready-To-Use Reference Sequences and Annotations from Illumina for commonly analyzed organisms. |
2023-12-11 | gnomAD | The Genome Aggregation Database (gnomAD) is a resource developed by an international coalition of investigators, with the goal of aggregating and harmonizing both exome and genome sequencing data from a wide variety of large-scale sequencing projects, and making summary data available for the wider scientific community. |
2023-09-07 | CheckM2 reference data | Data used by CheckM2 to assess the quality of genome assemblies |
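
The Blast entries in the table give a full database path, which is the value a BLAST+ program expects for its `-db` option. As a minimal sketch of how those paths might be used, the Python snippet below assembles a command line for a chosen database. The helper name `build_blast_command`, the file names `query.fasta` and `hits.tsv`, and the pairing of each database with `blastn` or `blastp` are illustrative assumptions (the pairing is inferred from each entry's nucleotide or protein description); the listing itself does not say how BLAST+ is made available on Biowulf, so the snippet only builds and prints the command rather than running it.

```python
import shlex

# Database paths copied from the listing above.
# ASSUMPTION: the program paired with each database is inferred from whether
# the entry is described as nucleotide (blastn) or protein (blastp).
BLAST_DBS = {
    "Betacoronavirus": ("/fdb/blastdb/Betacoronavirus", "blastn"),
    "pdbaa": ("/fdb/blastdb/pdbaa", "blastp"),
    "swissprot": ("/fdb/blastdb/swissprot", "blastp"),
    "nt": ("/fdb/blastdb/nt", "blastn"),
    "nr": ("/fdb/blastdb/nr", "blastp"),
    "pdbnt": ("/fdb/blastdb/pdbnt", "blastn"),
    "patnt": ("/fdb/blastdb/patnt", "blastn"),
}


def build_blast_command(db_name, query_fasta, out_path):
    """Return a BLAST+ argument list for one of the listed databases.

    Hypothetical helper for illustration: it only assembles arguments
    and does not check that the database or query file exists.
    """
    db_path, program = BLAST_DBS[db_name]
    return [
        program,
        "-db", db_path,         # full path and name from the listing
        "-query", query_fasta,  # FASTA file with the query sequences
        "-outfmt", "6",         # tabular output
        "-out", out_path,
    ]


if __name__ == "__main__":
    # Hypothetical file names; replace them with your own.
    cmd = build_blast_command("swissprot", "query.fasta", "hits.tsv")
    print(shlex.join(cmd))
```

The printed command can be pasted into a batch script or passed to `subprocess.run` once BLAST+ is on the `PATH`.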