Nucleic Acids Res. 2025 Jan 6;53(D1):D748-D756.
doi: 10.1093/nar/gkae959.

BacDive in 2025: the core database for prokaryotic strain data

Isabel Schober et al. Nucleic Acids Res.

Abstract

In 2025, the bacterial diversity database BacDive is the leading database for strain-level bacterial and archaeal information. It has been selected as an ELIXIR Core Data Resource as well as a Global Core Biodata Resource. Since its initial release more than ten years ago, BacDive (https://bacdive.dsmz.de) has grown tremendously in content and functionality, and is a comprehensive resource covering the phenotypic diversity of prokaryotes with data on taxonomy, morphology, physiology, cultivation, and more. The current release (2023.2) contains 2.6 million data points on 97 334 strains, an increase of 52% since the previous publication in 2021. This growth can largely be attributed to the world's largest collection of Analytical Profile Index (API) test results, which is now fully integrated into the database and searchable. A novel BacDive knowledge graph provides powerful search options through a SPARQL endpoint, including the possibility of federated searches across multiple data sources. The high-quality data provided by BacDive are increasingly being used to train artificial intelligence models, and the resulting high-confidence genome-based predictions are now used to fill content gaps in the database.


Figures

Graphical Abstract
Figure 1.
New features on the BacDive strain detail page. (A) Analytical Profile Index (API) test results are now shown not only in specific API tables (bottom), but also integrated alongside manually curated data. (B) Geographic locations and types of very closely related samples with at least 99% 16S rRNA gene sequence identity are shown as provided by Microbeatlas. (C) The sidebar now links to related StrainInfo and PhageDive entries as well as BacDive Special Collections that the strain can be found in. (D) Genome-based predictions with over 90% confidence are integrated alongside experimental data in the respective sections, clearly marked with an AI icon. (E) All predicted data are listed in a new section along with the relevant information. (F) A new literature section lists publication metadata retrieved from PubMed and automatically matched to the strain via culture collection numbers and taxonomy.
Figure 2.
Special collection dashboard. A short description, list of strains and selected statistics give an overview of a special collection. The sidebar on the top right allows for easy switching between different collections.
Figure 3.
Interface to the SPARQL endpoint with an example of a federated SPARQL query executed on the BacDive database. The query dynamically constructs URLs for taxonomy IDs from BacDive to fetch relevant protein data, showcasing the integration of microbiological and protein sequence data through a federated query approach. The results display strains, their names, taxonomy IDs, associated proteins, and partial amino acid sequences, demonstrating how federated queries can combine data from disparate sources. The visible result is limited to 1000 entries, but the full set can be downloaded.
Figure 4.
Data points added by predictions. Number of BacDive strains with data for the traits oxygen tolerance, Gram stain, motility and spore formation. Dark blue sections show numbers of strains for which only experimental values are present in the database, striped sections visualize strains for which predictions were integrated next to existing experimental data and light blue sections represent strains for which integrated predictions added information where none was previously present.
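
The federated query described in the Figure 3 caption might look roughly like the following sketch. It only builds the query text and does not contact any server. Several parts are assumptions that do not come from the article: the `ex:` namespace and the predicates `ex:hasName` and `ex:hasTaxId` are hypothetical placeholders for the BacDive knowledge graph vocabulary, and UniProt's public SPARQL endpoint is assumed as the protein data source. The function name `build_federated_query` is likewise illustrative.

```python
from textwrap import dedent

# Assumption: UniProt's public endpoint stands in for the protein data source.
UNIPROT_ENDPOINT = "https://sparql.uniprot.org/sparql"
# Hypothetical namespace; the real BacDive vocabulary may differ.
HYPOTHETICAL_BACDIVE_NS = "http://example.org/bacdive-schema/"


def build_federated_query(limit: int = 1000) -> str:
    """Return the text of a federated SPARQL query joining strains to proteins."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return dedent(f"""\
        PREFIX ex: <{HYPOTHETICAL_BACDIVE_NS}>
        PREFIX up: <http://purl.uniprot.org/core/>
        PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

        SELECT ?strain ?name ?taxId ?protein ?sequence WHERE {{
          # Local part: strain, name and taxonomy ID (hypothetical predicates).
          ?strain ex:hasName ?name ;
                  ex:hasTaxId ?taxId .
          # Construct the taxonomy IRI from the numeric ID.
          BIND(IRI(CONCAT("http://purl.uniprot.org/taxonomy/", STR(?taxId))) AS ?taxon)
          # Remote part: evaluated by the second endpoint.
          SERVICE <{UNIPROT_ENDPOINT}> {{
            ?protein up:organism ?taxon ;
                     up:sequence ?seq .
            ?seq rdf:value ?sequence .
          }}
        }}
        LIMIT {limit}
        """)


if __name__ == "__main__":
    print(build_federated_query(limit=10))
```

The default `limit` of 1000 mirrors the visible result limit mentioned in the caption. The `BIND(IRI(CONCAT(...)))` step corresponds to the dynamic construction of taxonomy URLs, and the `SERVICE` clause is the standard SPARQL 1.1 mechanism for delegating part of a query to another endpoint.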

