Review
Nat Prod Rep. 2022 Sep 21;39(9):1876-1896.
doi: 10.1039/d2np00032f.

Integrative omics approaches for biosynthetic pathway discovery in plants

Kumar Saurabh Singh et al. Nat Prod Rep.

Abstract

Covering: up to 2022

With the emergence of large amounts of omics data, computational approaches for the identification of plant natural product biosynthetic pathways and their genetic regulation have become increasingly important. While genomes provide clues regarding functional associations between genes based on gene clustering, metabolome mining provides a foundational technology to chart natural product structural diversity in plants, and transcriptomics has been successfully used to identify new members of their biosynthetic pathways based on coexpression. Thus far, most approaches utilizing transcriptomics and metabolomics have been targeted towards specific pathways and use one type of omics data at a time. Recent technological advances now provide new opportunities for integration of multiple omics types and untargeted pathway discovery. Here, we review advances in plant biosynthetic pathway discovery using genomics, transcriptomics, and metabolomics, as well as recent efforts towards omics integration. We highlight how transcriptomics and metabolomics provide complementary information to link genes to metabolites, by associating temporal and spatial gene expression levels with metabolite abundance levels across samples, and by matching mass-spectral features to enzyme families. Furthermore, we suggest that elucidation of gene regulatory networks using time-series data may prove useful for efforts to unwire the complexities of biosynthetic pathway components based on regulatory interactions and events.
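The idea of associating gene expression levels with metabolite abundance levels across samples can be illustrated with a minimal sketch. It assumes Pearson correlation as the association measure, which is only one possible choice and is not stated in the abstract; the gene names, sample values, and the helper `rank_genes_for_metabolite` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profile: correlation is undefined, treated as 0 here
    return cov / sqrt(var_x * var_y)


def rank_genes_for_metabolite(expression, abundance):
    """Return (gene, r) pairs sorted by decreasing correlation with the metabolite."""
    scores = [(gene, pearson(profile, abundance))
              for gene, profile in expression.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: five samples (for example tissues or time points).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [5.0, 4.0, 3.0, 2.0, 1.0],
    "geneC": [2.0, 2.1, 1.9, 2.0, 2.2],
}
metabolite_abundance = [10.0, 19.0, 31.0, 42.0, 50.0]

ranking = rank_genes_for_metabolite(expression, metabolite_abundance)
for gene, r in ranking:
    print(f"{gene}\t{r:.3f}")
```

In this toy setting, genes whose expression rises and falls with the metabolite rank first and would be the candidates to inspect further; real data would additionally need normalization and multiple-testing control.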

Conflict of interest statement

J. J. J. v. d. H. is a member of the Scientific Advisory Board of NAICONS Srl., Milano, Italy. M. H. M. is a member of the Scientific Advisory Board of Hexagon Bio and co-founder of Design Pharmaceuticals.

Figures

Fig. 1. Timeline of the identification of biosynthetic pathways in plants. The names of secondary metabolites and the associated species/genus are color coded based on the omics technology used in the identification process. An asterisk means the initial discovery of biosynthetic genes using genetics and/or biochemical-based approaches.
Fig. 2. Overview of omics experiment designs to elucidate secondary metabolic pathways. Top: single-omics designs and combinations of omics designs map individual genes, proteins or metabolites to a set of pathway components. Bottom: an integrative-omics approach combines knowledge from different layers of a biological system and can be used for generating an integrated knowledge network (IKN). The IKN enables the identification of hidden interactions between genomic features and unravels the regulation of genes across time points and different conditions. Integrative omics likely better predicts the different components of a biosynthetic pathway than single- or multi-omics. Dashed lines in the genetic architecture and pathway indicate missing/unknown components.
Fig. 3. Different strategies for transcriptomics-based analysis. (A) Experimental design with different conditions (C), without time-points (top). Coexpression networks constitute a useful method to identify genes with similar expression patterns, which may belong to a biosynthetic pathway. (B) (1) Experimental design with different conditions (C) and time points (T) (bottom). (2) Differentially expressed genes in response to a treatment are partitioned into clusters based on their coexpression. Each row represents a single gene and its expression at different time points. (3) Enriched cis-regulatory motif in gene coexpression modules. (4) Expression pattern of a single regulator at different time points under different conditions. (5) Comparing degree of overlap between different treatments. (6) Simplified DREM model annotated with TFs. Each path corresponds to a set of coexpressed genes. Green nodes are the bifurcation points where coexpressed genes diverge in expression.
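The coexpression-network strategy in panel (A) can be sketched as follows. This is a minimal illustration, assuming that genes are linked when their Pearson correlation across conditions exceeds a threshold and that connected components are read as candidate modules; the threshold of 0.9, the gene names, and the expression values are hypothetical, and dedicated tools use more elaborate network construction.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profile: treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def coexpression_modules(expression, threshold=0.9):
    """Link genes with r >= threshold and return connected components."""
    genes = sorted(expression)
    neighbours = {gene: set() for gene in genes}
    for g1, g2 in combinations(genes, 2):
        if pearson(expression[g1], expression[g2]) >= threshold:
            neighbours[g1].add(g2)
            neighbours[g2].add(g1)

    modules, seen = [], set()
    for gene in genes:
        if gene in seen:
            continue
        # Depth-first traversal collects every gene reachable from this one.
        stack, module = [gene], set()
        while stack:
            current = stack.pop()
            if current in module:
                continue
            module.add(current)
            stack.extend(neighbours[current] - module)
        seen |= module
        modules.append(sorted(module))
    return modules


# Hypothetical expression values across four conditions (C1..C4).
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [8.0, 6.0, 4.5, 2.0],
    "g5": [1.0, 3.0, 1.0, 3.0],
}

modules = coexpression_modules(expression)
print(modules)
```

Genes that end up in the same module share an expression pattern and, as the caption notes, may belong to the same biosynthetic pathway; a gene with no strong partner forms a module of its own.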
Fig. 4. Time-based metabolomics data analysis. Molecular networks are generated using spectral data from all the MS2 samples by classical or feature-based molecular networking implemented in GNPS. Additionally, spectral data are also subjected to substructure discovery using MS2LDA and NAP. Metabolite annotation is further extended by the in silico metabolization method implemented in the MetWork pipeline. In addition, MetWork also proposes CFM-ID-predicted MS/MS spectra of the derivatized substrates. The time-series design is then applied to all the samples to check the distribution of differentially abundant metabolites (DAMs) across time points and across different conditions to better predict biosynthetic pathways.
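Molecular networking connects MS2 spectra that look alike. The sketch below illustrates that idea with a plain cosine similarity on binned peaks; it is a simplification and not the scoring implemented in GNPS, which uses a modified cosine that also accounts for precursor mass shifts. The bin width, the peak lists, and the function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Cosine similarity between two spectra given as (m/z, intensity) lists."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(value * b[key] for key, value in a.items() if key in b)
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty spectrum is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical MS2 spectra as (m/z, intensity) pairs.
spectrum_a = [(100.1, 10.0), (150.2, 50.0), (200.3, 100.0)]
spectrum_b = [(100.1, 12.0), (150.2, 45.0), (200.3, 95.0)]
spectrum_c = [(80.0, 100.0), (120.5, 30.0)]

similar = cosine_similarity(spectrum_a, spectrum_b)
unrelated = cosine_similarity(spectrum_a, spectrum_c)
print(f"a vs b: {similar:.4f}")
print(f"a vs c: {unrelated:.4f}")
```

In a network built this way, spectra would be joined by an edge when their similarity exceeds a chosen cutoff, so structurally related metabolites tend to fall into the same cluster.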
Fig. 5. Overview of data integration possibilities to predict biosynthetic pathways. The figure is inspired by the MicroTom metabolic network where coexpression networks are correlated with metabolites to generate a knowledge network of flavonoid biosynthesis genes. Other genomic data like ChIP- or DAP-seq data can also be integrated with the transcriptomic data to obtain a holistic view of gene regulation of a biosynthetic pathway. Later, metabolomics data can be added to generate an integrated knowledge network (IKN). Reaction databases can be mapped to the IKN to predict biosynthetic pathways.
Kumar Saurabh Singh, Justin J. J. van der Hooft, Saskia C. M. van Wees, Marnix H. Medema
