J Cheminform. 2023 Nov 10;15(1):107.
doi: 10.1186/s13321-023-00778-w.

Exploring the known chemical space of the plant kingdom: insights into taxonomic patterns, knowledge gaps, and bioactive regions

Daniel Domingo-Fernández et al. J Cheminform. 2023.

Abstract

Plants are one of the primary sources of natural products for drug development. However, despite centuries of research, only a limited region of the phytochemical space has been studied. To understand the scope of what is explored versus unexplored in the phytochemical space, we begin by reconstructing the known chemical space of the plant kingdom, mapping the distribution of secondary metabolites, chemical classes, and plants traditionally used for medicinal purposes (i.e., medicinal plants) across various levels of the taxonomy. We identify hotspot taxonomic clades occupied by a large proportion of medicinal plants and characterized secondary metabolites, as well as clades requiring further characterization with regard to their chemical composition. In a complementary analysis, we build a chemotaxonomy which has a high level of concordance with the taxonomy at the genus level, highlighting the close relationship between chemical profiles and evolutionary relationships within the plant kingdom. Next, we delve into regions of the phytochemical space with known bioactivity that have been used in modern drug discovery. While we find that the vast majority of approved drugs from phytochemicals are derived from known medicinal plants, we also show that medicinal and non-medicinal plants do not occupy distinct regions of the known phytochemical landscape and their phytochemicals exhibit properties similar to bioactive compounds. Moreover, we also reveal that only a few thousand phytochemicals have been screened for bioactivity and that there are hundreds of known bioactive compounds present in both medicinal and non-medicinal plants, suggesting that non-medicinal plants also have potential therapeutic applications. Overall, these results support the hypothesis that there are many plants with medicinal properties awaiting discovery.

Keywords: Chemotaxonomy; Drug discovery; Natural products; Phytochemistry.

Conflict of interest statement

All authors were employees of Enveda Biosciences Inc. during the course of this work and have real or potential ownership interest in the company.

Figures

Fig. 1
(A) Overview of the size and specificity of the chemical space across plant families. The blue column of the heatmap displays the normalized number of reported chemicals for each of the 513 families (i.e., leaf nodes in the phylogenetic tree). The red column represents the proportion of medicinal plants within the family. The green column highlights the proportion of phytochemicals that are unique to the family. Lastly, the orange column represents the average number of chemicals per species within the family. (B) Relative abundance of the 20 major secondary metabolite classes across plant families. As in (A), the leaf nodes in the phylogenetic tree correspond to plant families. The heatmap indicates the relative abundance of each secondary metabolite class as a percentage with respect to the 567 chemical classes from NPClassifier [23]. Since the phylogenetic tree cannot be plotted with a heatmap of 567 columns (the total number of chemical classes), we selected the 20 most abundant classes that were present in the majority of the plant families. Thus, only the 319 of the 513 families that contained chemicals in any of these 20 classes are depicted.
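The four per-family quantities described for panel A of Fig. 1 can be illustrated with a minimal sketch. The records below are invented toy data, and the function name `family_stats`, the record layout, and the use of a raw (not normalized) chemical count are assumptions made for illustration. This is not the authors' implementation.

```python
from collections import defaultdict

# Hypothetical toy records: (family, species, compound, species_is_medicinal).
RECORDS = [
    ("Fabaceae", "sp_a", "c1", True),
    ("Fabaceae", "sp_a", "c2", True),
    ("Fabaceae", "sp_b", "c2", False),
    ("Lamiaceae", "sp_c", "c2", True),
    ("Lamiaceae", "sp_c", "c3", True),
]


def family_stats(records):
    """Return per-family values analogous to the four heatmap columns."""
    chems = defaultdict(set)                               # family -> compounds
    species_chems = defaultdict(lambda: defaultdict(set))  # family -> species -> compounds
    medicinal = defaultdict(set)                           # family -> medicinal species
    for family, species, compound, is_medicinal in records:
        chems[family].add(compound)
        species_chems[family][species].add(compound)
        if is_medicinal:
            medicinal[family].add(species)

    stats = {}
    for family, compounds in chems.items():
        # Compounds reported in any other family.
        elsewhere = set()
        for other, other_compounds in chems.items():
            if other != family:
                elsewhere |= other_compounds
        per_species = species_chems[family]
        stats[family] = {
            # Raw count; the caption's normalization is not specified here.
            "n_chemicals": len(compounds),
            "prop_medicinal": len(medicinal[family]) / len(per_species),
            "prop_unique": len(compounds - elsewhere) / len(compounds),
            "avg_chemicals_per_species": sum(len(c) for c in per_species.values())
            / len(per_species),
        }
    return stats


if __name__ == "__main__":
    for family, values in family_stats(RECORDS).items():
        print(family, values)
```

In the toy data, compound `c2` occurs in both families, so it counts toward each family's size but not toward its unique fraction.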
Fig. 2
(A) Heatmap of the chemical similarity across the 24 largest genera based on number of plants and chemical information. The genus of each species is colored on the x and y axes. Note that the matrix displays the distance between pairs of species based on their chemical similarity. Details on the hierarchical clustering and on the definition of chemical similarity used to compute the distance between plants are described in the methods section. (B) Heatmap of chemical similarity focusing on a random subset of the 24 genera.
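A pairwise distance matrix of the kind shown in Fig. 2 could be built as in the sketch below. The paper's actual similarity definition is given in its methods section and is not reproduced on this page, so the Jaccard distance over sets of reported compounds is an assumption. The species names and compound identifiers are hypothetical.

```python
def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; two empty profiles are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(profiles):
    """Return (ordered species names, square matrix of pairwise distances)."""
    names = sorted(profiles)
    matrix = [
        [jaccard_distance(profiles[x], profiles[y]) for y in names]
        for x in names
    ]
    return names, matrix


# Hypothetical chemical profiles: species -> set of compound identifiers.
PROFILES = {
    "sp_a": {"c1", "c2", "c3"},
    "sp_b": {"c2", "c3", "c4"},
    "sp_c": {"c9"},
}

if __name__ == "__main__":
    names, matrix = distance_matrix(PROFILES)
    for name, row in zip(names, matrix):
        print(name, [round(d, 2) for d in row])
```

The resulting square matrix is the usual input for a hierarchical clustering routine, which would then reorder rows and columns so that chemically similar species sit next to each other.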
Fig. 3
Distribution of the molecular weights (MW) (A), LogP (B), topological polar surface area (TPSA) (C), and fraction of sp3 hybridized carbon atoms (Fsp3) (D) of compounds in medicinal and non-medicinal plants. Overlap of compounds (E) and Murcko scaffolds (F) between medicinal and non-medicinal plants
Fig. 4
(A) Overlap between ChEMBL compounds with bioassay data and known phytochemicals mapped to ChEMBL (19,137 out of 87,019). Bioassay data represents the set of chemicals in ChEMBL whose bioactivity (active or inactive) has been evaluated. (B) Number of bioactive and non-bioactive compounds (represented as ‘active’ and ‘inactive’, respectively) in medicinal and non-medicinal plants. (C) Overlap of all bioactive compounds derived from medicinal and non-medicinal plants based on their bioassay information in ChEMBL.
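The overlap panels (Fig. 3E, 3F and Fig. 4C) reduce to counting the three regions of a two-set Venn diagram. The sketch below shows that counting step only. The compound identifiers are invented, and the function name `venn_counts` is a hypothetical helper, not something taken from the paper.

```python
def venn_counts(medicinal, non_medicinal):
    """Return the sizes of the three regions of a two-set Venn diagram."""
    return {
        "medicinal_only": len(medicinal - non_medicinal),
        "shared": len(medicinal & non_medicinal),
        "non_medicinal_only": len(non_medicinal - medicinal),
    }


# Hypothetical bioactive compounds found in each group of plants.
MEDICINAL = {"c1", "c2", "c3", "c4"}
NON_MEDICINAL = {"c3", "c4", "c5"}

if __name__ == "__main__":
    print(venn_counts(MEDICINAL, NON_MEDICINAL))
```

The same function applies to scaffolds instead of compounds if each set holds scaffold identifiers.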

References

    1. Atanasov AG, Zotchev SB, Dirsch VM, Supuran CT. Natural products in drug discovery: advances and opportunities. Nat Rev Drug Discovery. 2021;20(3):200–216. doi: 10.1038/s41573-020-00114-z.
    2. Newman DJ, Cragg GM. Natural products as sources of new drugs over the nearly four decades from 01/1981 to 09/2019. J Nat Prod. 2020;83(3):770–803. doi: 10.1021/acs.jnatprod.9b01285.
    3. Atanasov AG, Waltenberger B, Pferschy-Wenzig EM, Linder T, Wawrosch C, Uhrin P, et al. Discovery and resupply of pharmacologically active plant-derived natural products: a review. Biotechnol Adv. 2015;33(8):1582–1614. doi: 10.1016/j.biotechadv.2015.08.001.
    4. Howes MJR, Quave CL, Collemare J, Tatsis EC, Twilley D, Lulekal E, et al. Molecules from nature: reconciling biodiversity conservation and global healthcare imperatives for sustainable use of medicinal plants and fungi. Plants People Planet. 2020;2(5):463–481. doi: 10.1002/ppp3.10138.
    5. Ncube B, Finnie JF, Van Staden J. Quality from the field: The impact of environmental factors as quality determinants in medicinal plants. S Afr J Bot. 2012;82:11–20. doi: 10.1016/j.sajb.2012.05.009.
