Review
FEBS J. 2020 Apr;287(7):1262-1283.
doi: 10.1111/febs.15299.

Evolution of new enzymes by gene duplication and divergence

Shelley D Copley. FEBS J. 2020 Apr.

Abstract

Thousands of new metabolic and regulatory enzymes have evolved by gene duplication and divergence since the dawn of life. New enzyme activities often originate from promiscuous secondary activities that have become important for fitness due to a change in the environment or a mutation. Mutations that make a promiscuous activity physiologically relevant can occur in the gene encoding the promiscuous enzyme itself, but can also occur elsewhere, resulting in increased expression of the enzyme or decreased competition between the native and novel substrates for the active site. If a newly useful activity is inefficient, gene duplication/amplification will set the stage for divergence of a new enzyme. Even a few mutations can increase the efficiency of a new activity by orders of magnitude. As efficiency increases, amplified gene arrays will shrink to provide two alleles, one encoding the original enzyme and one encoding the new enzyme. Ultimately, genomic rearrangements eliminate co-amplified genes and move newly evolved paralogs to a distant region of the genome.

Keywords: Innovation-amplification-divergence model; directed evolution; enzyme evolution; gene duplication; promiscuity.

Conflict of interest statement

None.

Figures

Figure 1.
Two models for the evolution of new genes. In the Ohno model, duplication occurs before neofunctionalization. In the IAD model, neofunctionalization occurs before gene duplication. After neofunctionalization in either model, selection for increased gene dosage can lead to further amplification. Only duplications are shown for simplicity.
Figure 2.
Estimated rates of processes affecting a redundant gene copy. Acquisition of a new function is orders of magnitude less likely than loss of function via deletion, drift, mutations or gene conversion. Reprinted with permission from Proc Natl Acad Sci USA 104(43):17004–9. Copyright 2007 National Academy of Sciences.
Figure 3.
Number of redundant enzyme (grey) and transcription factor (red) sequences (i.e. paralogs) in 794 bacterial and archaeal genomes. Paralogs are defined as sequences with ≥ 30% sequence identity over ≥ 60% of the sequence, with an E-value of < 10e−5. Reprinted from PLoS One 8(7):e69707, 2013.
Figure 4.
The human protein kinome. AGC, contains PKA, PKG, and PKC families; CAMK, calcium/calmodulin-dependent protein kinase; CK1, casein kinase 1; CMGC, contains CDK, MAPK, GSK3, and CLK families; STE, homologs of yeast Sterile 7, Sterile 11, Sterile 20 kinases; TK, tyrosine kinases; TKL, tyrosine-kinase-like. Reprinted from [91].
Figure 5.
ProA and ArgC catalyze reduction of an acyl phosphate to an aldehyde in the pathways for synthesis of proline and arginine, respectively. ProA has an inefficient promiscuous activity with N-acetylglutamyl phosphate.
Figure 6.
A novel pathway patched together from promiscuous enzyme activities restores synthesis of PLP in ΔpdxB E. coli by bypassing the block in the pathway. Promiscuous enzymes that normally serve other functions are highlighted in red. SerA, 3-phosphoglycerate dehydrogenase; SerC, phosphoserine/phosphohydroxythreonine aminotransferase; ThrB, homoserine kinase.
Figure 7.
DNA sequences involved in recombination during duplication of regions surrounding the ben and cat genes in A. baylyi ADP1 observed after selection for growth on benzoate (Ben+) or anthranilate (Ant+). The top line in B is the parental downstream sequence, and the bottom line is the parental upstream sequence. The middle line is the sequence formed at the junction. Identical nucleotides are underlined. Part A reprinted with permission from Mol Microbiol. 83(3):520–35, 2012.
Figure 8.
Unequal crossing-over between homologous regions of duplicated segments during genome replication gives rise to daughter cells with either more or fewer copies of the duplicated segment.
Figure 9.
Fold increase in transcription for duplicated genes in 17 mutation-accumulation lines of C. elegans relative to lines in which the genes were not duplicated. FPKM, fragments per kilobase of exon model per million mapped reads. Reprinted with permission from Proc Natl Acad Sci USA 115(28):7386–91, 2018.
Figure 10.
Correlations between mRNA and protein levels and gene copy number for 52 proteins in 251 breast cancer cell lines. Proteins are divided into groups based on patterns of correlations. Only Group A proteins show significant correlations between copy number and protein levels. Red arrows indicate proteins for which correlations are improved after reducing the effect of samples with little variability in copy number. CN, copy number; GX, gene expression; PX, protein expression. Reprinted from Mol Oncol. 7(3):704–18, 2013. Influence of DNA copy number and mRNA levels on the expression of breast cancer related proteins. Myhre S, Lingjaerde OC, Hennessy BT, Aure MR, Carey MS, Alsner J, et al. Published under a Creative Commons Attribution (CC BY) License.
Figure 11.
The reactions catalyzed by melamine deaminase (TriA) and atrazine chlorohydrolase (AtzA). AtzA may have evolved from TriA, which has a weak promiscuous activity with atrazine.
Figure 12.
Reactions catalyzed by dihydrocoumarin hydrolase and methyl parathion hydrolase. The reconstructed ancestor of extant dihydrocoumarin hydrolases and methyl parathion hydrolases is an efficient dihydrocoumarin hydrolase with an inefficient promiscuous methylparathion hydrolase activity.
Figure 13.
Paraoxon docked into the active site of wild-type AiiA (left) and a mutant enzyme with six mutations, three of which (S20F, V69G and F64C) reshape the active site. The positions of the nucleophilic water and the two metal ions to which it is coordinated (gold spheres) are unchanged. In the wild-type enzyme, the substrate is not positioned correctly for in-line attack by water. Repositioning of Phe68 orients the substrate more appropriately. Reprinted with permission from Biochemistry 55(32):4583–93, 2016. Copyright 2016 American Chemical Society.
Figure 14.
A) Reactions catalyzed by members of the enolase superfamily. B) Active sites of enzymes in the enolase superfamily. The three residues that coordinate the active site Mg++ are highlighted in cyan. Catalytic residues are highlighted in green. Molecular graphics analysis was performed with UCSF Chimera v. 1.13, developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco, with support from NIH P41-GM103311 [92].
Figure 15.
A) HisA and TrpF catalyze Amadori rearrangements of structurally different substrates. B) The ProFAR substrate in the active site of the catalytically inactive D7N D176A HisA (PDB 5A5W). C) A TrpF product analog, rCdRP (reduced 1′-(2′-carboxyphenylamino)-1′-deoxyribulose 5′-phosphate), positioned in the active site of HisA(D7N/dup13–15/D10G) based upon its position in the active site of the ortholog PriA (PDB 2Y85). Reprinted with permission from Proc Natl Acad Sci USA 114(18):4727–32, 2017. Structural and functional innovations in the real-time evolution of new (betaalpha)8 barrel enzymes. Newton MS, Guo X, Soderholm A, Nasvall J, Lundstrom P, Andersson DI, et al.
Figure 16.
Sites in the P. aeruginosa aliphatic amidase AmiE at which mutations improved growth of E. coli in the presence of isobutyramide as a sole nitrogen source. The active site is marked by a space-filled ligand covalently attached to the active site Cys166. Colors indicate the number of substitutions that were found to be favorable at each position in the protein. Reproduced from Nat Commun. 8:15695, 2017 under a Creative Commons license (http://creativecommons.org/licenses/by/4.0/).
Figure 17.
The adaptive landscape between methyl parathion hydrolase (right) and an ancestor in which five key residues had been reverted to the ancestral state (left). The numbers under each node indicate the absence (0) or presence (1) of the derived residue at positions 73, 193, 258, 271 and 273. Colors indicate the mean methyl parathion hydrolase activity in lysates for three biological replicates. Dashed light grey lines indicate paths that are evolutionarily inaccessible because they go through an intermediate with decreased activity. Grey arrows indicate steps in which activity is increased. Reprinted by permission from Springer Nature: Nat Chem Biol. 15(11):1120–8, 2019. Higher-order epistasis shapes the fitness landscape of a xenobiotic-degrading enzyme, Yang G, Anderson DW, Baier F, Dohmen E, Hong N, Carr PD, et al. Copyright 2019.
Figure 18.
Tradeoffs between the original and new activities of an evolving enzyme. A) A strong tradeoff between TrpF (yellow) and HisA (blue) activities in HisA. An allele encoding HisA(dup13–15/D10G) evolved into a specialist TrpF in S. enterica in which this bifunctional enzyme supported synthesis of both histidine and tryptophan, but poorly. Reprinted with permission from Proc Natl Acad Sci USA 114(18):4727–32, 2017. Newton MS, Guo X, Soderholm A, Nasvall J, Lundstrom P, Andersson DI, et al. Structural and functional innovations in the real-time evolution of new (betaalpha)8 barrel enzymes. B) A weak tradeoff between the original homoserine lactonase activity and the promiscuous paraoxonase activity of AiiA through six rounds of directed evolution. Reprinted with permission from Biochemistry 55:4583–4593, 2016. Yang G, Hong N, Baier F, Jackson CJ, Tokuriki N. Conformational tinkering drives evolution of a promiscuous activity through indirect mutational effects. Copyright 2016 American Chemical Society.
Figure 19.
Divergent regulation of genes encoding E. coli maltodextrin phosphorylase (malP) and glycogen phosphorylase (glgP), paralogs with 48% sequence identity. glgP is transcribed from two different promoters. Green and red boxes indicate proteins that activate and repress transcription, respectively. The locations of the binding sites for the glycogen phosphorylase promoters are not known. Diagrams from Ecocyc Version 23.5 [93].
Figure 20.
A gene encoding ProA* (E383A ProA), which has weak ArgC activity, amplifies to 6 copies in ΔargC proA* E. coli prior to a mutation that changes Phe372 to Leu. Subsequently, proA**, which encodes E383A F372L ProA, deamplifies to three copies. Reprinted from eLife 8:e53535, 2019: Mutations that improve efficiency of a weak-link enzyme are rare compared to adaptive mutations elsewhere in the genome. Morgenthaler AB, Kinney WR, Ebmeier CC, Walsh CM, Snyder DJ, Cooper VS, et al.
Figure 21.
Remodeling within a segmental duplication removes some extraneous DNA. The yellow box indicates the gene under selection.
Figure 22.
A duplication block in the 2p11 region of human chromosome 2. Each segment is labelled according to the location of the presumed ancestral copy. The upper diagram shows computationally predicted ancestral loci of segments in the duplication block, and the lower diagram shows segments whose ancestral loci have been determined experimentally. (Some of the shortest segments have not been experimentally addressed.) Most of the duplicated segments come from other chromosomes. Reprinted by permission from Springer Nature: Nature Genetics, 39(11):1361–8. Ancestral reconstruction of segmental duplications reveals punctuated cores of human genome evolution, Z. Jiang et al. Copyright 2007.
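
The idea of evolutionarily accessible paths in the Figure 17 caption can be illustrated with a small sketch. It is not the analysis used in the cited study: the function name `accessible_paths`, the rule that a step is barred only when activity decreases, and the two-site activity values in `toy_activity` are all assumptions made for illustration. Genotypes are written as tuples of 0 (ancestral residue) and 1 (derived residue), as in the caption.

```python
from itertools import permutations


def accessible_paths(activity, n_sites):
    """Return the mutation orders along which activity never decreases.

    activity maps each genotype tuple (0 = ancestral, 1 = derived)
    to a measured activity. Each path starts at the all-ancestral
    genotype and adds one derived residue per step.
    """
    paths = []
    for order in permutations(range(n_sites)):
        genotype = [0] * n_sites
        previous = activity[tuple(genotype)]
        accessible = True
        for site in order:
            genotype[site] = 1
            current = activity[tuple(genotype)]
            if current < previous:  # assumed rule: a drop in activity blocks the path
                accessible = False
                break
            previous = current
        if accessible:
            paths.append(order)
    return paths


# Hypothetical two-site landscape (invented numbers, not data from the paper).
# The mutation at site 0 is harmful on its own but beneficial after site 1.
toy_activity = {
    (0, 0): 1.0,
    (1, 0): 0.5,
    (0, 1): 2.0,
    (1, 1): 10.0,
}

print(accessible_paths(toy_activity, 2))
```

In this toy landscape only the order that introduces the site 1 mutation first is accessible, because the other order passes through an intermediate with lower activity than the starting genotype. For the five positions in Figure 17 the same loop would check all 120 possible orders.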

References

    1. Kannan L, Li H, Rubinstein B & Mushegian A (2013) Models of gene gain and gene loss for probabilistic reconstruction of gene content in the last universal common ancestor of life, Biol Direct. 8, 32.
    2. Zhou Y, Minio A, Massonnet M, Solares E, Lv Y, Beridze T, Cantu D & Gaut BS (2019) The population genetics of structural variants in grapevine domestication, Nat Plants. 5, 965–979.
    3. Ohno S (1970) Evolution by gene duplication, Springer-Verlag, New York.
    4. Bergthorsson U, Andersson DI & Roth JR (2007) Ohno’s dilemma: evolution of new genes under continuous selection, Proc Natl Acad Sci U S A. 104, 17004–9.
    5. Wolfe KH & Shields DC (1997) Molecular evidence for an ancient duplication of the entire yeast genome, Nature. 387, 708–13.
