Review
Nat Prod Rep. 2016 Aug 27;33(8):951-62.
doi: 10.1039/c6np00035e. Epub 2016 Jun 20.

Computational genomic identification and functional reconstitution of plant natural product biosynthetic pathways

Marnix H Medema et al. Nat Prod Rep.

Abstract

Covering: 2003 to 2016

The last decade has seen the first major discoveries regarding the genomic basis of plant natural product biosynthetic pathways. Four key computationally driven strategies have been developed to identify such pathways, which make use of physical clustering, co-expression, evolutionary co-occurrence and epigenomic co-regulation of the genes involved in producing a plant natural product. Here, we discuss how these approaches can be used for the discovery of plant biosynthetic pathways encoded by both chromosomally clustered and non-clustered genes. Additionally, we will discuss opportunities to prioritize plant gene clusters for experimental characterization, and end with a forward-looking perspective on how synthetic biology technologies will allow effective functional reconstitution of candidate pathways using a variety of genetic systems.

Figures

Fig. 1. Approaches for plant biosynthetic pathway discovery. Physical co-clustering, co-expression, evolutionary co-occurrence, and epigenetic co-regulation can all be used to identify candidate biosynthetic pathways. These approaches can also be combined, for example through network analysis, if sufficient data are available. Functionally cohesive modules can then be extracted from such a network and annotated for the presence of genes encoding biosynthesis-related protein families. Finally, modules that have a strong biosynthetic signature can be correlated to metabolite counts or molecular families derived from molecular networking of metabolite data.
Fig. 2. Features and statistics of 28 known plant biosynthetic gene clusters. The graphs show the distributions of compound classes produced from known enzymes encoded in plant biosynthetic gene clusters (green), the number of unique (broad) enzyme families per gene cluster (red) and the gene counts of enzyme families across all clusters (blue). The numbers for the latter two are based on automated annotation of broad enzyme families through the Pfam database; it should therefore be noted that any two enzymes from one Pfam protein family can still catalyze two significantly different chemical reactions. In all specific cases where only two enzyme classes are present in a cluster according to the figure, one of these comprises multiple distinct subclasses of cytochrome P450s belonging to at least two different P450 subfamilies.
Fig. 3. Some examples of plant metabolic gene clusters. Genes are indicated by arrows and gene(s) for the first committed pathway step are indicated in red. Gene names are indicated above the clusters and class of biosynthetic enzyme below. Abbreviations: OSC, oxidosqualene cyclase; IGPL, indole 3-glycerol phosphate lyase; AT (SCPL), SCPL-acyltransferase; AT (BAHD), BAHD-acyltransferase; MT, methyltransferase; UGT, UDP-dependent sugar transferase; DHO, dehydrogenase/reductase; CES, carboxylesterase; CYP, cytochrome P450. The oat avenacin cluster contains five genes for the synthesis, oxidation and acylation of the triterpene scaffold. Two other loci (Sad3 and Sad4) have been shown to be required for avenacin glucosylation but have not yet been cloned. Sad3 lies within 3.6 cM of the core cluster while Sad4 is unlinked. The maize DIMBOA pathway includes three genes that are not shown in the figure: Bx7, which is separated from the core cluster by an intervening region of 15 Mb; the sugar transferase gene Bx9, which is located on a different chromosome; and Bx6, whose genomic location has not yet been established. The noscapine cluster from poppy contains all of the pathway genes except the gene for tetrahydroprotoberberine cis-N-methyltransferase (TNMT), which catalyses the first committed pathway step.
Fig. 4. Co-expression techniques to identify biosynthetic pathway components. The simplest way to identify novel candidates for a pathway is to use a bait gene that is known to be involved in the pathway and to rank all other genes by correlation coefficient to the bait. In order to also visualize the interrelationships between all (relevant) genes, clustered heatmaps can be used. The same is true for coexpression networks, which have the added advantage that they can also be used in ‘untargeted’ approaches to identify candidate pathways by extracting modules out of the network without using a bait. Finally, cross-species co-expression networks can be used to identify orthologous groups of genes whose co-expression is conserved over longer evolutionary periods.
Fig. 5. Synthetic biology approaches to characterize plant biosynthetic pathways. For identified (candidate) pathways, a construct is synthesized and assembled that contains all genes needed to produce the end product of the pathway, as well as the required regulatory elements. The construct is then expressed in either yeast or tobacco, after which the metabolite is identified and further characterized.
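
The bait-gene ranking described in the caption of Fig. 4 can be illustrated with a minimal sketch. It assumes Pearson correlation as the correlation coefficient (the caption does not say which coefficient is used), and the gene names, expression values and the helper names `pearson` and `rank_by_bait` are all hypothetical, chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A flat profile has no defined correlation; treat it as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_by_bait(expression, bait):
    """Rank every gene except the bait by its correlation to the bait profile."""
    bait_profile = expression[bait]
    scores = [
        (gene, pearson(bait_profile, profile))
        for gene, profile in expression.items()
        if gene != bait
    ]
    # Highest correlation first: the strongest co-expression candidates.
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical expression matrix: gene -> values across four conditions.
expression = {
    "bait_gene": [1.0, 2.0, 3.0, 4.0],
    "gene_A": [2.0, 4.1, 5.9, 8.0],
    "gene_B": [4.0, 3.0, 2.0, 1.0],
    "gene_C": [1.0, 3.0, 2.0, 2.5],
}

ranking = rank_by_bait(expression, "bait_gene")
for gene, r in ranking:
    print(f"{gene}\t{r:.3f}")
```

In this toy data set `gene_A` tracks the bait closely and ranks first, while `gene_B` is anti-correlated and ranks last. Real analyses would typically work on many more conditions and may prefer a rank-based coefficient; that choice is left open here.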
Marnix Medema
Anne Osbourn
