Scientific Reference Data

We provide a set of centrally-maintained scientific reference databases for Biowulf users. You can search through this data here. To request a new database or an update, please contact us at staff@hpc.nih.gov.



Search by keyword: searches through metadata using keywords
Search by filename: searches through filenames where available


Browse Common Databases

Recently Updated:

2025-11-05 ensembl Ensembl is a genome browser for vertebrate genomes that supports research in comparative genomics, evolution, sequence variation and transcriptional regulation.
2025-08-23 Betacoronavirus Blast database of Betacoronavirus nucleotide sequences. (Blast database full path and name - /fdb/blastdb/Betacoronavirus)
2025-08-23 NCBI nr Blast database NCBI nonredundant comprehensive protein database, compiled from GenBank CDS translations, PDB, Swiss-Prot, PIR, and PRF. (Blast database full path and name - /fdb/blastdb/nr)
2025-08-23 NCBI nt Blast database NCBI nonredundant comprehensive nucleotide database, compiled from GenBank, RefSeq, TPA and PDB. (Blast database full path and name - /fdb/blastdb/nt)
2025-08-23 taxonomy The Taxonomy Database is a curated classification and nomenclature for all of the organisms in the public sequence databases.
2025-08-19 Patent nucleotide sequences Blast db Patent nucleotide sequences. (Blast database full path and name - /fdb/blastdb/patnt)
2025-08-19 PDB nucleotide sequences Blast db Protein Data Bank nucleotide sequences. (Blast database full path and name - /fdb/blastdb/pdbnt)
2025-08-19 PDB protein sequences Blast db Protein Data Bank sequences. (Blast database full path and name - /fdb/blastdb/pdbaa)
2025-08-19 Swissprot Blast database Curated, highly-annotated protein sequence database. (Blast database full path and name - /fdb/blastdb/swissprot)
2025-08-18 NCBI SRA Refseq data NCBI SRA Refseq data
2025-08-17 Protein Data Bank An archive of experimentally determined three-dimensional structures of biological macromolecules.
2025-08-14 Human Pangenome HPRC Assemblies Release 2 Assembly data from the Human Pangenome Reference Consortium, release version 2
2025-07-31 RepBase Repbase is the most commonly used database of repetitive DNA elements.
2025-06-29 UCSC goldenPath The UCSC Genomics Institute maintains a broad collection of vertebrate and model organism assemblies and annotations, along with a large suite of tools for viewing, analyzing and downloading data.
2025-06-24 GENCODE The goal of the GENCODE project is to identify and classify all gene features in the human and mouse genomes with high accuracy based on biological evidence, and to release these annotations for the benefit of biomedical research and genome interpretation.
2025-06-24 UCSC gbdb The UCSC Genomics Institute maintains a broad collection of vertebrate and model organism assemblies and annotations, along with a large suite of tools for viewing, analyzing and downloading data.
2025-05-09 annovar ANNOVAR is an efficient software tool that uses up-to-date information to functionally annotate genetic variants detected from diverse genomes.
2025-04-29 diamond databases Select reference databases for the diamond application
2025-04-24 dali The one million structures from AlphaFold2 are available in this release.
2025-04-17 dbNSFP dbNSFP is a database developed for functional prediction and annotation of all potential non-synonymous single-nucleotide variants (nsSNVs) in the human genome.
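
Several entries above give a full Blast database path (for example /fdb/blastdb/swissprot). As a rough illustration of how such a path might be passed to a search, the Python sketch below assembles a command line using the standard BLAST+ options -db, -query, -out and -outfmt. The dictionary labels, the function name blast_command, and the file names query.fasta and hits.tsv are hypothetical; the sketch only builds and prints the command, and assumes a BLAST+ program such as blastp is available in your environment if you choose to run it.

```python
import shlex

# Database paths copied from the list above.
# The short labels used as keys are hypothetical, chosen for this example only.
BLAST_DBS = {
    "nt": "/fdb/blastdb/nt",
    "nr": "/fdb/blastdb/nr",
    "swissprot": "/fdb/blastdb/swissprot",
    "pdbaa": "/fdb/blastdb/pdbaa",
}


def blast_command(program, db_label, query_fasta, out_path):
    """Return a BLAST+ argument list for the given program and database.

    program     -- a BLAST+ executable name such as "blastp" or "blastn"
    db_label    -- one of the keys of BLAST_DBS
    query_fasta -- path to the query sequences (hypothetical file name)
    out_path    -- where the tabular results should be written
    """
    db_path = BLAST_DBS[db_label]  # raises KeyError for an unknown label
    return [
        program,
        "-db", db_path,
        "-query", query_fasta,
        "-out", out_path,
        "-outfmt", "6",  # tabular output
    ]


if __name__ == "__main__":
    cmd = blast_command("blastp", "swissprot", "query.fasta", "hits.tsv")
    # Print the command for inspection; it is not executed here.
    print(shlex.join(cmd))
```

Match the program to the database type: protein databases such as nr, swissprot and pdbaa go with blastp, and nucleotide databases such as nt, patnt and pdbnt go with blastn.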