Proc Natl Acad Sci U S A. 2017 May 30;114(22):5601-5606.
doi: 10.1073/pnas.1614680114. Epub 2017 May 1.

Retrospective analysis of natural products provides insights for future discovery trends

Cameron R Pye et al. Proc Natl Acad Sci U S A.

Abstract

Understanding of the capacity of the natural world to produce secondary metabolites is important to a broad range of fields, including drug discovery, ecology, biosynthesis, and chemical biology, among others. Both the absolute number and the rate of discovery of natural products have increased significantly in recent years. However, there is a perception and concern that the fundamental novelty of these discoveries is decreasing relative to previously known natural products. This study presents a quantitative examination of the field from the perspective of both number of compounds and compound novelty using a dataset of all published microbial and marine-derived natural products. This analysis aimed to explore a number of key questions, such as how the rate of discovery of new natural products has changed over the past decades, how the average natural product structural novelty has changed as a function of time, whether exploring novel taxonomic space affords an advantage in terms of novel compound discovery, and whether it is possible to estimate how close we are to having described all of the chemical space covered by natural products. Our analyses demonstrate that most natural products being published today bear structural similarity to previously published compounds, and that the range of scaffolds readily accessible from nature is limited. However, the analysis also shows that the field continues to discover appreciable numbers of natural products with no structural precedent. Together, these results suggest that the development of innovative discovery methods will continue to yield compounds with unique structural and biological properties.

Keywords: chemical diversity; chemoinformatics; drug discovery; natural products; structural similarity.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Examining structural diversity. (A) Number of compounds published per year and rate of novel compound isolation as a percentage of total natural product isolation. (B) Median maximum Tanimoto scores as a function of time. Median average deviation shown as shaded blue region. (C) Absolute number of low similarity compounds (T < 0.4) per year. NP, natural product.
Fig. 2.
Examining source diversity. (A) Plot of median maximum Tanimoto score by year for the full dataset (blue) and the intrasubgroup values for the cyanobacterial subgroup (red). (B) Plot of intrasubgroup median maximum Tanimoto scores by year for bacterial subgroups. (C) Plot of intrasubgroup median maximum Tanimoto scores by year for marine subgroups. (D) Plot of extrasubgroup median maximum Tanimoto scores by year for marine subgroups. (E) Violin plots for intrasubgroup median maximum Tanimoto scores for bacterial and marine subgroups. (F) Violin plots for extrasubgroup median maximum Tanimoto scores for bacterial and marine subgroups. Med., median.
Fig. 3.
Theoretical vs. actual structural diversity. (A) Examples of the four major classes of cyclic tetrapeptides found in nature. (B) Violin plots indicating the distribution of Tanimoto scores between all members of 65 randomly selected theoretical cyclic peptides (10 trials, lanes 1–10) and between all 65 cyclic tetrapeptides from our natural product dataset (lane REAL). Ala, alanine; Ile, isoleucine; Leu, leucine; Med., median; Val, valine.
Fig. S1.
Cyclic tetrapeptide structures from natural product dataset.
Fig. S2.
Cluster analysis for natural product diversity. (A) Network diagram displaying all molecules as clusters based on Tversky structural similarities. Compounds with no structural similarity partners appear as singletons in the bottom region of the figure. (B) Expansion of region of network diagram indicating erythromycin compound class. (C) Example structure from erythromycin cluster.
Fig. S3.
Examples of natural products with low (<0.4) Tanimoto scores, indicating compound name, source, year of discovery, and isolation method.
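
The captions above rely on three quantities: the Tanimoto score between two structures, the maximum Tanimoto score of a compound, and the yearly median of those maxima, with T < 0.4 marking low-similarity compounds. The following Python sketch illustrates how such quantities could be computed. It is not the authors' pipeline. It assumes each compound is represented by a fingerprint stored as a set of integer feature identifiers, and that each compound is compared only against compounds published in strictly earlier years; the fingerprint type, the function names, and the toy data are all hypothetical. A Tversky index, the measure named in the Fig. S2 caption, is included for comparison; its alpha and beta weights here are placeholders, not values from the paper.

```python
from collections import defaultdict
from statistics import median


def tanimoto(a, b):
    """Tanimoto coefficient |A & B| / |A | B| for two fingerprint sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def tversky(a, b, alpha=0.5, beta=0.5):
    """Tversky index; alpha = beta = 1 reduces to the Tanimoto coefficient."""
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    return common / denom if denom else 0.0


def max_tanimoto_vs_earlier(compounds):
    """Return (year, max score) for each compound that has predecessors.

    `compounds` is a list of (year, fingerprint_set) pairs. Each compound is
    scored against every compound from a strictly earlier year (an assumption
    of this sketch); compounds with no predecessors are skipped.
    """
    results = []
    for year, fp in compounds:
        earlier = [other for y, other in compounds if y < year]
        if earlier:
            results.append((year, max(tanimoto(fp, other) for other in earlier)))
    return results


def median_max_by_year(compounds):
    """Median of the per-compound maximum Tanimoto scores, grouped by year."""
    by_year = defaultdict(list)
    for year, score in max_tanimoto_vs_earlier(compounds):
        by_year[year].append(score)
    return {year: median(scores) for year, scores in sorted(by_year.items())}


def low_similarity_counts(compounds, threshold=0.4):
    """Number of compounds per year whose maximum score is below threshold."""
    counts = defaultdict(int)
    for year, score in max_tanimoto_vs_earlier(compounds):
        if score < threshold:
            counts[year] += 1
    return dict(counts)


# Hypothetical toy fingerprints, not real natural product data.
toy = [
    (1990, frozenset({1, 2, 3, 4})),
    (1995, frozenset({1, 2, 3, 5})),
    (1995, frozenset({7, 8, 9})),
    (2000, frozenset({1, 2, 3, 4, 5})),
]

if __name__ == "__main__":
    print(median_max_by_year(toy))
    print(low_similarity_counts(toy))
```

The pairwise loop is quadratic in the number of compounds, which is fine for a toy example but would need a faster similarity search for a dataset covering all published microbial and marine-derived natural products.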
