Proc Natl Acad Sci U S A. 2014 Feb 25;111(8):3044-9.
doi: 10.1073/pnas.1318797111. Epub 2014 Feb 10.

The predictability of molecular evolution during functional innovation

Diana Blank et al. Proc Natl Acad Sci U S A.

Abstract

Determining the molecular changes that give rise to functional innovations is a major unresolved problem in biology. The paucity of examples has served as a significant hindrance in furthering our understanding of this process. Here we used experimental evolution with the bacterium Escherichia coli to quantify the molecular changes underlying functional innovation in 68 independent instances ranging over 22 different metabolic functions. Using whole-genome sequencing, we show that the relative contribution of regulatory and structural mutations depends on the cellular context of the metabolic function. In addition, we find that regulatory mutations affect genes that act in pathways relevant to the novel function, whereas structural mutations affect genes that act in unrelated pathways. Finally, we use population genetic modeling to show that the relative contributions of regulatory and structural mutations during functional innovation may be affected by population size. These results provide a predictive framework for the molecular basis of evolutionary innovation, which is essential for anticipating future evolutionary trajectories in the face of rapid environmental change.

Keywords: adaptation; biosynthesis; compensatory mutation; transcription.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Sixty-eight out of 435 populations evolved the ability to compensate for the function of the deleted gene. (A) Growth recovery was not deterministic. For some deletion genotypes, five out of five replicates recovered growth; for the majority, between one and four replicates recovered growth. Three hundred sixty-seven populations went extinct during the evolutionary process; these are not shown. (B) Novel functions that were related to building block biosynthesis were more difficult to evolve. The white bars indicate those deletion genotypes in which novel functionality evolved; in gray are those for which no novel function was evolved. For all categories except building block biosynthesis, novel functions evolved that compensated for the majority of deleted functions.
Fig. 2.
Growth rates of recovered clones were more similar for lineages derived from the same deletion genotypes. Each point shows the mean estimated doublings per hour for each clone (±SEM); clones are grouped by genotype (x axis) and colored gray and white to emphasize the groupings. The black solid line indicates the growth rate of the ancestral BW25113 strain. Six recovered clones did not exhibit detectable growth as assayed by OD600 and are indicated as having growth rates of 0. For four deletion genotypes, no clones exhibit detectable growth as assayed by OD600; these deletion genotypes are not shown here. The number below each genotype indicates the probability of observing a set of clones with growth rates at least as clustered as those that we observe (SI Appendix). Cases in which only one lineage recovered for a genotype are indicated with NA, as no clustering probability could be calculated. Each point is based on three biological replicates, except for 11 cases in which one replicate was excluded due to no growth being observed (SI Appendix).
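The clustering probability shown under each genotype is defined in the SI Appendix, which is not reproduced here. As a rough illustration only, one way to ask whether clones from the same deletion genotype have more similar growth rates than expected by chance is a permutation test: compare the observed within-genotype spread with the spread obtained after shuffling clones across genotypes. The function name, the test statistic (mean within-group variance), and any input values are assumptions made for this sketch, not the authors' procedure or data.

```python
import random
from statistics import pvariance


def clustering_p_value(groups, n_perm=2000, seed=0):
    """Permutation estimate of how often shuffled clones are at least as clustered.

    groups: dict mapping a genotype label to a list of clone growth rates
            (each list should hold at least two clones).
    Statistic (an assumption, not the paper's): mean within-group variance.
    """
    rng = random.Random(seed)
    sizes = [len(values) for values in groups.values()]
    pooled = [x for values in groups.values() for x in values]

    def mean_within_variance(values):
        # Cut the flat list back into groups of the original sizes.
        total, start = 0.0, 0
        for size in sizes:
            total += pvariance(values[start:start + size])
            start += size
        return total / len(sizes)

    observed = mean_within_variance(pooled)
    hits = 0
    for _ in range(n_perm):
        shuffled = pooled[:]
        rng.shuffle(shuffled)
        if mean_within_variance(shuffled) <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

A small value means that few random regroupings are as tight as the observed one. Genotypes with a single recovered lineage carry no within-group information, which matches the NA entries in the figure.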
Fig. 3.
Mutational events that occurred during the evolution of novel functionality affected both protein structure and expression level. (A) Mutations classified by type. The overwhelming majority of changes were point mutations (Left), followed by insertion sequence (IS) element-mediated changes, small indels, large amplifications (larger than 200 bp), and large deletions (larger than 200 bp). (B) Mutations classified by functional effect (SI Appendix). We inferred that the majority of changes affected protein structure, although more than a fifth of these resulted in altered reading frames or the incorporation of premature stop codons. Almost 40% of changes were inferred to be regulatory, that is, as directly affecting protein expression (in contrast to indirect effects, which may occur via structural changes in transcription factors or other mechanisms). (C) Deletion genotypes in which few replicates recovered tended to contain clones with more mutations. Each box shows the numbers of mutations found within clones, classified by the number of replicate lineages that recovered (e.g., for three deletion genotypes, four replicate lineages recovered; the number of mutations in each of the 12 sequenced clones is shown). The boxplots indicate the median, first and third quartiles, and the extreme values within the category. (D) Mutations in evolved clones often increased predicted transcriptional output due to changes in σ70 binding. For each unique intergenic mutation, we predicted the transcriptional output for the ancestral sequence and the evolved sequence (SI Appendix, Materials and Methods). The dotted black line indicates unchanged transcriptional output. The annotated black points are the promoters shown in E. (E) Random mutations that result in increased transcriptional output are rare. We predicted transcription (σ70 binding) for all point mutations and 1-bp indels in the promoter region surrounding the intergenic mutations plotted in D.
Four examples are shown here. The predicted transcriptional output of the ancestor is shown as a red line; that of the same promoter region with the evolved mutation is shown as a green line. Most random mutations have little effect on transcription; however, in several of the evolved clones, the observed mutation was among those mutations with the largest possible predicted effect on transcription (SI Appendix, Fig. S4). Clockwise from the top left, the deletion genotypes and recruited genes are ∆argC and proB; ∆glyA and cycA; ∆pabA and pabB; and ∆ptsI and glk. The numbers in the top left of each panel indicate the fraction of all one-mutant neighboring promoters that have a predicted transcriptional output that is equal to or lower than the observed mutant. Note that both the x and y axes are on a log scale. (F) Mutations that affect translation both increase and decrease the predicted translation initiation rate. Translation initiation rates were predicted using a biophysical model (34) for the ancestral and derived alleles for all intergenic mutations, with the black dotted line indicating no change.
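The comparison in panel E rests on enumerating every one-mutant neighbor of a promoter (all point mutations and 1-bp indels) and asking what fraction scores at or below the observed evolved allele. The sketch below shows that enumeration in Python. The scoring function `toy_score` is a hypothetical stand-in that only counts matches to the canonical −10 hexamer TATAAT; it is not the σ70 binding model used in the paper, and the function names are illustrative.

```python
def one_mutant_neighbors(seq, alphabet="ACGT"):
    """Distinct sequences one substitution, 1-bp deletion, or 1-bp insertion away."""
    neighbors = set()
    for i in range(len(seq)):
        # Substitutions at position i.
        for base in alphabet:
            if base != seq[i]:
                neighbors.add(seq[:i] + base + seq[i + 1:])
        # Deletion of position i.
        neighbors.add(seq[:i] + seq[i + 1:])
    for i in range(len(seq) + 1):
        # Insertion in front of position i (or at the end when i == len(seq)).
        for base in alphabet:
            neighbors.add(seq[:i] + base + seq[i:])
    neighbors.discard(seq)
    return neighbors


def toy_score(seq, consensus="TATAAT"):
    """Hypothetical promoter score: best match count to a -10-like hexamer."""
    best = 0
    for i in range(len(seq) - len(consensus) + 1):
        window = seq[i:i + len(consensus)]
        best = max(best, sum(a == b for a, b in zip(window, consensus)))
    return best


def fraction_at_or_below(ancestor, evolved, score=toy_score):
    """Fraction of the ancestor's one-mutant neighbors scoring <= the evolved allele."""
    neighbors = one_mutant_neighbors(ancestor)
    target = score(evolved)
    return sum(score(n) <= target for n in neighbors) / len(neighbors)
```

A fraction close to 1 indicates that the observed mutation is among the strongest single changes available under the chosen scoring model, which is the sense of the numbers printed in each panel.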
Fig. 4.
Intergenic mutations confer only moderate changes in protein expression. Mean fluorescence levels (±SEM) conferred by chromosomal copies of intergenic regions containing the ancestral (gray points) and evolved alleles (white points). Each pair of alleles is annotated with the mutational change that occurred, with the number indicating the position, in base pairs, from the first base pair of the downstream ORF. The x axis is annotated with the recruited genes whose promoters were affected by the mutation (first row) and the deletion genotype in which the mutation arose (second row). The ORFs of both metJ and metB are downstream of a single intergenic region (in opposite directions). Thus, the sequence contained in these constructs is identical, but GFP expression is driven by promoters on opposite strands. The arrows emphasize the direction of expression change. We predicted significant expression changes in carB, panD A-12G, avtA, and glnL of 12.4-, 2.7-, 2.4-, and 0.64-fold, respectively. For all other genotypes, we predicted no significant changes based on changes in σ70 binding or changes in ribosome binding. Note that the sensitivity of the assay means that very low expression levels (i.e., the avtA and glnL alleles) cannot be accurately measured. Thus, the fold change in expression, particularly for these strains, is likely larger than what we measured.
Fig. 5.
Mechanisms promoting novel functionality are dependent on cellular context. (A) The relative enrichment of regulatory and structural mutations is dependent on cellular function. Mutations that contribute toward novel functionality related to building block biosynthesis are more enriched for regulatory mutations, with 48% of all mutations being regulatory. In contrast, in other pathways, only 17–23% are regulatory. Green bars indicate regulatory mutations; gray bars indicate structural mutations. The numbers above the bars indicate the number of deletion genotypes within the category. (B) Regulatory mutations recruit proteins that act in functions related to the missing function. We calculated the shortest network distance between pairs of genes from high-confidence links in the STRING database (38). Green points indicate the network distances between the deleted gene and the genes recruited for functional compensation. Black points indicate the expected network distance between the set of deleted genes and a randomly selected recruited gene based on 5,000 randomizations of protein pairs. The last bin includes all gene pairs with a distance of nine or more, or which are not connected in the network. Genes recruited via regulatory mutations are on average more than three network links closer than expected by chance (Wilcoxon rank-sum test between observed and randomized network distances; n = 34; P = 2.5e-15). (C) Structural mutations recruit proteins that act in functions unrelated to the missing function. Gray points indicate the network distances between the deleted gene and the genes recruited for functional compensation. Black points indicate the expected network distance between the set of deleted genes and a randomly selected recruited gene based on 5,000 randomizations of protein pairs; genes recruited via structural change mutations are on average only 0.6 network links closer than expected by chance (Wilcoxon rank-sum test; n = 85; P = 4.0e-5).
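The network distances in panels B and C are shortest path lengths between a deleted gene and a recruited gene, with everything at distance nine or more (or unconnected) pooled into one bin, compared against randomized pairs. A minimal sketch of that calculation follows, assuming the network is given as a plain adjacency dictionary. The function names and the random pairing scheme are assumptions for illustration; the sketch does not load STRING data and may differ from the authors' randomization.

```python
import random
from collections import deque


def network_distance(graph, source, target, cap=9):
    """Shortest path length by breadth-first search.

    Pairs that are unconnected, or at distance >= cap, return cap,
    mirroring the pooled last bin described in the caption.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist + 1 >= cap:
            break  # Any remaining target is at least cap links away.
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return cap


def mean_random_distance(graph, deleted, recruited, n_rand=5000, seed=0):
    """Mean distance over randomly drawn (deleted gene, recruited gene) pairs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_rand):
        total += network_distance(graph, rng.choice(deleted), rng.choice(recruited))
    return total / n_rand
```

The observed and randomized distance lists could then be compared with a rank-sum test, for example `scipy.stats.ranksums`, if SciPy is available.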
Fig. 6.
Population genetic modeling (SI Appendix, Materials and Methods) shows that the relative numbers of regulatory (green points) and structural mutations (gray points) that contribute to novel function can depend on demographic parameters. (A) When the proportion of structural and regulatory mutations is 0.85 and 0.15, respectively (similar to the ratio of nonsynonymous to intergenic sites in the E. coli genome), and the distribution of selective effect sizes is identical, the ratio of the average number of structural and regulatory mutations within an individual is approximately independent of population size (white points; Lower). (B) If the number of structural sites at which structural mutations are beneficial is halved, the ratio again remains independent of population size, but the fraction of regulatory mutations approximately doubles. (C) If the mean and variance of the effects of structural mutations on fitness are half that of regulatory mutations, the ratio is dependent on population size, with individuals in larger populations containing larger relative numbers of regulatory mutations. (Insets) The shape of the distribution of mutational effects for structural (black) and regulatory (green) mutations. In A, the two distributions are identical. The results shown here correspond to 150 generations of evolution. All points shown are the means of at least 50 independent simulations.
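The model itself is specified in the SI Appendix and is not reproduced here. To make the setup concrete, the following is a minimal Wright-Fisher-style sketch under stated assumptions: only beneficial mutations are modeled, each new mutation is structural with probability 0.85 and regulatory otherwise, selective effects are drawn from exponential distributions, and fitness is multiplicative. All parameter names and default values other than the 0.85/0.15 split and the 150 generations are hypothetical.

```python
import random


def simulate(pop_size, generations=150, mu=0.01, p_structural=0.85,
             mean_s_structural=0.05, mean_s_regulatory=0.05, seed=0):
    """Return the mean number of structural and regulatory mutations per individual."""
    rng = random.Random(seed)
    # Each individual is (structural count, regulatory count, fitness).
    pop = [(0, 0, 1.0)] * pop_size
    for _ in range(generations):
        # Selection: parents are sampled in proportion to fitness.
        weights = [ind[2] for ind in pop]
        parents = rng.choices(pop, weights=weights, k=pop_size)
        offspring = []
        for n_s, n_r, w in parents:
            # Mutation: at most one new beneficial mutation per offspring.
            if rng.random() < mu:
                if rng.random() < p_structural:
                    n_s += 1
                    w *= 1.0 + rng.expovariate(1.0 / mean_s_structural)
                else:
                    n_r += 1
                    w *= 1.0 + rng.expovariate(1.0 / mean_s_regulatory)
            offspring.append((n_s, n_r, w))
        pop = offspring
    mean_s = sum(ind[0] for ind in pop) / pop_size
    mean_r = sum(ind[1] for ind in pop) / pop_size
    return mean_s, mean_r
```

Running `simulate` over a range of `pop_size` values with `mean_s_structural` set below `mean_s_regulatory`, and averaging over many seeds, gives a setup loosely in the spirit of panel C. An exponential distribution cannot halve both its mean and its variance at once, so this only approximates that panel and should not be expected to reproduce the published curves.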

References

    1. Jones FC, et al.; Broad Institute Genome Sequencing Platform & Whole Genome Assembly Team. The genomic basis of adaptive evolution in threespine sticklebacks. Nature. 2012;484(7392):55–61.
    2. Zhang JZ, Zhang YP, Rosenberg HF. Adaptive evolution of a duplicated pancreatic ribonuclease gene in a leaf-eating monkey. Nat Genet. 2002;30(4):411–415.
    3. Ando H, Miyoshi-Akiyama T, Watanabe S, Kirikae T. A silent mutation in mabA confers isoniazid resistance on Mycobacterium tuberculosis. Mol Micro. 2014. doi: 10.1111/mmi.12476.
    4. Lieberman TD, et al. Parallel bacterial evolution within multiple patients identifies candidate pathogenicity genes. Nat Genet. 2011;43(12):1275–1280.
    5. Wray GA. The evolutionary significance of cis-regulatory mutations. Nat Rev Genet. 2007;8(3):206–216.
