Review
Nat Prod Rep. 2021 Jan 1;38(1):264-278.
doi: 10.1039/d0np00053a. Epub 2020 Aug 28.

Microbial natural product databases: moving forward in the multi-omics era

Jeffrey A van Santen et al. Nat Prod Rep. 2021.

Abstract

Covering: 2010-2020

The digital revolution is driving significant changes in how people store, distribute, and use information. With the advent of new technologies around linked data, machine learning and large-scale network inference, the natural products research field is beginning to embrace real-time sharing and large-scale analysis of digitized experimental data. Databases play a key role in this, as they allow systematic annotation and storage of data for both basic and advanced applications. The quality of the content, structure, and accessibility of these databases all contribute to their usefulness for the scientific community in practice. This review covers the development of databases relevant for microbial natural product discovery during the past decade (2010-2020), including repositories of chemical structures/properties, metabolomics, and genomic data (biosynthetic gene clusters). It provides an overview of the most important databases and their functionalities, highlights some early meta-analyses using such databases, and discusses basic principles to enable widespread interoperability between databases. Furthermore, it points out conceptual and practical challenges in the curation and usage of natural products databases. Finally, the review closes with a discussion of key action points required for the field moving forward, not only for database developers but for any scientist active in the field.

Conflict of interest statement

MHM is a co-founder of Design Pharmaceuticals and a member of the scientific advisory board of Hexagon Bio.

Figures

Fig. 1: Timeline of data distribution methods for natural products.
Fig. 2: A) Distribution of compound source types in selected natural products databases. B) Distribution of biosynthetic gene cluster source types in selected biosynthetic gene cluster databases. C) Overlap of microbial natural product InChIKey structure representations between open access databases. Microbial database overlap was calculated using the unique sets of the InChIKey connectivity hashes from each database. This decreases the compound count in each database because sets of configurational isomers are reduced to single flat structures: NP Atlas 25,523 to 23,927, NPASS 8,729 to 8,096, and StreptomeDB 7,125 to 6,283. The proportional Venn diagram was created using eulerAPE v3.
Fig. 3: Data types and their relative accessibility from published articles in the primary scientific literature.
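
The overlap calculation described in the Fig. 2C caption can be illustrated with a minimal Python sketch. It assumes each database is available as a plain list of standard InChIKeys, and that the "connectivity hash" is the first 14-character block of the key (the part before the first hyphen). The function names, the database names `db_a` and `db_b`, and the InChIKey-shaped strings are hypothetical placeholders for illustration only; they are not real compounds and this is not necessarily the procedure the authors used.

```python
from itertools import combinations


def connectivity_hash(inchikey: str) -> str:
    """Return the first block of an InChIKey, which encodes connectivity only."""
    return inchikey.strip().upper().split("-")[0]


def flat_set(inchikeys):
    """Collapse a list of InChIKeys to the unique set of connectivity hashes.

    Configurational isomers share the first block, so they merge into one entry.
    """
    return {connectivity_hash(key) for key in inchikeys}


def pairwise_overlap(databases):
    """Count shared flat structures for every pair of databases."""
    flat = {name: flat_set(keys) for name, keys in databases.items()}
    return {
        (a, b): len(flat[a] & flat[b])
        for a, b in combinations(sorted(flat), 2)
    }


# Placeholder InChIKey-shaped strings (not real compounds).
databases = {
    "db_a": [
        "AAAAAAAAAAAAAA-XXXXXXXXSA-N",  # isomer 1 of a hypothetical compound
        "AAAAAAAAAAAAAA-YYYYYYYYSA-N",  # isomer 2, same connectivity
        "BBBBBBBBBBBBBB-UHFFFAOYSA-N",
    ],
    "db_b": [
        "AAAAAAAAAAAAAA-UHFFFAOYSA-N",  # same connectivity, no stereo layer
        "CCCCCCCCCCCCCC-UHFFFAOYSA-N",
    ],
}

for name, keys in databases.items():
    print(name, len(keys), "entries ->", len(flat_set(keys)), "flat structures")
print(pairwise_overlap(databases))
```

Comparing on the first block alone is what makes each database's count drop, as the caption notes: entries that differ only in stereochemistry collapse to one flat structure before the sets are intersected.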
