PLoS Comput Biol. 2008 Oct;4(10):e1000202.
doi: 10.1371/journal.pcbi.1000202. Epub 2008 Oct 24.

An end to endless forms: epistasis, phenotype distribution bias, and nonuniform evolution

Elhanan Borenstein et al. PLoS Comput Biol. 2008 Oct.

Abstract

Studies of the evolution of development characterize the way in which gene regulatory dynamics during ontogeny construct and channel phenotypic variation. These studies have identified a number of evolutionary regularities: (1) phenotypes occupy only a small subspace of possible phenotypes, (2) the influence of mutation is not uniform and is often canalized, and (3) a great deal of morphological variation evolved early in the history of multicellular life. An important implication of these studies is that diversity is largely the outcome of the evolution of gene regulation rather than the emergence of new structural genes. Using a simple model that considers a generic property of developmental maps (the interaction between multiple genetic elements and the nonlinearity of gene interaction in shaping phenotypic traits), we are able to recover many of these empirical regularities. We show that visible phenotypes represent only a small fraction of possibilities. Epistasis ensures that phenotypes are highly clustered in morphospace and that the most frequent phenotypes are the most similar. We perform phylogenetic analyses on an evolving developmental model and find that species become more alike through time, whereas higher-level grades have a tendency to diverge. Ancestral phenotypes, produced by early developmental programs with a low level of gene interaction, are found to span a significantly greater volume of the total phenotypic space than derived taxa. We suggest that early and late evolution have a different character that we classify into micro- and macroevolutionary configurations. These findings complement the view of development as a key component in the production of endless forms and highlight the crucial role of development in constraining biotic diversity and evolutionary trajectories.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. An illustration of the developmental model.
The r transcription factors bind to the promoters of k structural genes with affinities given by D_ij. If the net activation of a promoter exceeds a threshold value (illustrated as a step function), the gene is expressed. The phenotype is described by the distribution of gene expression. This regulatory architecture corresponds to the single-layered plan; see also our analysis of a generalized, multilayered model.
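The threshold map described in this caption can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes binary genotypes of length r, a matrix D with entries in {-1, +1}, and a threshold of 0. The names `random_plan` and `develop` are hypothetical.

```python
import random


def random_plan(r, k, rng=None):
    """Hypothetical helper: an r-by-k matrix D with entries drawn from {-1, +1}."""
    rng = rng or random.Random(0)
    return [[rng.choice((-1, 1)) for _ in range(k)] for _ in range(r)]


def develop(genotype, plan, threshold=0):
    """Map a binary genotype (length r) to a binary phenotype (length k).

    Gene j is expressed when the summed input of the active regulatory
    elements exceeds the threshold (assumed to be 0 here).
    """
    k = len(plan[0])
    return tuple(
        1 if sum(g * row[j] for g, row in zip(genotype, plan)) > threshold else 0
        for j in range(k)
    )


if __name__ == "__main__":
    plan = random_plan(r=4, k=6)
    print(develop((1, 0, 1, 1), plan))
```

Because the step function discards the magnitude of the net activation, many genotypes can collapse onto the same phenotype, which is the source of the degeneracy discussed in the later figures.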
Figure 2. Potential and visible phenotypes as a function of the regulatory dimension, r.
The phenotypic dimension is set to k = 18. All curves represent the average of 1,000 different developmental matrices. (A) The number of potential phenotypes (2^r) and the number of distinct visible phenotypes as a function of the regulatory dimension. (B) The percentage of visible phenotypes out of the potential phenotypes, corresponding to a sigmoidal function. (C) The marginal contribution of each genetic element to the increase in the number of visible phenotypes. Formally, if V(r) denotes the number of visible phenotypes as a function of r, then the marginal contribution is defined as V(r)/V(r−1), and is evidently linear (with slope of −0.044; least squares regression).
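The quantities in this caption can be computed by brute-force enumeration for small r. The sketch below assumes binary genotypes, a threshold of 0, and that V(r−1) is obtained from the first r−1 rows of the same matrix; the caption does not state how the smaller plans were built, so that last choice is an assumption. All function names are hypothetical.

```python
from itertools import product


def develop(genotype, plan, threshold=0):
    """Threshold map from a binary genotype to a binary phenotype (assumed form)."""
    k = len(plan[0])
    return tuple(
        1 if sum(g * row[j] for g, row in zip(genotype, plan)) > threshold else 0
        for j in range(k)
    )


def visible_phenotypes(plan):
    """Set of distinct phenotypes reached by any of the 2^r genotypes."""
    r = len(plan)
    return {develop(g, plan) for g in product((0, 1), repeat=r)}


def visible_fraction(plan):
    """Visible phenotypes divided by the 2^r potential phenotypes."""
    return len(visible_phenotypes(plan)) / 2 ** len(plan)


def marginal_contributions(plan):
    """V(n) / V(n-1) for n = 2..r, using the first n rows of the plan (assumption)."""
    counts = [len(visible_phenotypes(plan[:n])) for n in range(1, len(plan) + 1)]
    return [later / earlier for earlier, later in zip(counts, counts[1:])]
```

Enumeration costs 2^r evaluations per plan, so this approach is only practical for the small regulatory dimensions used in the figures.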
Figure 3. Localization of the visible phenotypic subspace.
(A) A log-log plot of the distribution of degeneracy levels among visible phenotypes. Each point denotes the expected number of distinct phenotypes with a certain degeneracy level for a given developmental plan and is an average over 10,000 different plans. Note that the point associated with degeneracy level 0 (i.e., hidden phenotypes) is not included. These developmental plans frequently give rise to phenotypes with degeneracy levels higher than 10^3, and in rare cases, higher than 10^3.5. Given that the total number of genotypes is 2^14, a single phenotype can be produced by 6%–20% of genotypes. (B) A contour plot of the gain function induced by a given developmental plan (all developmental plans produce qualitatively similar results). The gain function, gain(dg,dp), denotes the probability that the Hamming distance between two phenotypes is dp, given that the distance between the two genotypes that produced them is dg. (C) The distribution of pairwise phenotypic Hamming distances among randomly selected phenotypes (not produced by a developmental plan), distinct visible phenotypes (considering every visible phenotype only once, regardless of frequency), and visible phenotypes including all occurrences of each phenotype. The pairwise Hamming distances between randomly selected phenotypes follow a binomial distribution, with mean distance 7 (for phenotypes of length 14). Distinct visible phenotypes are closer to one another, with a mean distance of 5.976. When weighting by the frequency of the visible phenotypes, the distance is reduced further, with a mean distance of 4.607.
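The degeneracy levels in panel A and the two mean distances in panel C can be illustrated with a short enumeration. This is a sketch under assumptions: binary genotypes, a threshold of 0, and a frequency-weighted mean taken over all unordered pairs of genotypes. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations, product


def develop(genotype, plan, threshold=0):
    """Threshold map from a binary genotype to a binary phenotype (assumed form)."""
    k = len(plan[0])
    return tuple(
        1 if sum(g * row[j] for g, row in zip(genotype, plan)) > threshold else 0
        for j in range(k)
    )


def degeneracy(plan):
    """Count how many genotypes produce each visible phenotype."""
    return Counter(develop(g, plan) for g in product((0, 1), repeat=len(plan)))


def hamming(a, b):
    """Number of positions at which two phenotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_distance_distinct(counts):
    """Mean Hamming distance, counting each visible phenotype once."""
    pairs = list(combinations(counts, 2))
    if not pairs:
        return 0.0
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def mean_distance_weighted(counts):
    """Mean Hamming distance over all genotype pairs (frequency weighted)."""
    n = sum(counts.values())
    if n < 2:
        return 0.0
    total = sum(counts[a] * counts[b] * hamming(a, b)
                for a, b in combinations(counts, 2))
    return total / (n * (n - 1) / 2)
```

Pairs of genotypes that share a phenotype contribute a distance of 0 to the weighted mean, which is why highly degenerate phenotypes pull it below the distinct-phenotype mean.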
Figure 4. The average distance between the most frequent phenotypes and the patchiness of the visible phenotypic subspace.
(A) The average Hamming distance among visible phenotypes as a function of their frequency (dots). Visible phenotypes are ranked according to their frequency level. For each rank, we calculate the average Hamming distance between all visible phenotypes with this or higher rank. The most abundant phenotypes are very similar. This similarity decreases as less frequent phenotypes are included in the analysis. We also calculate which fraction of all visible phenotypes are included in these phenotypes (solid line). The inset shows a zoom of the same plot, focusing only on the top 5% most frequent phenotypes. The phenotypes included in this small fraction of the distinct visible phenotypes are, on average, only 4 bits different, and still cover 50% of the phenotypes. (B) The one-mutant neighbor network of the visible phenotypes. The size of the node is proportional to the logarithm of its frequency. In this plot, r = k = 12.
Figure 5. Pr(p_j = 1) as a function of s_j, the number of +1 elements in [formula image].
The total number of elements in [formula image], r = 18.
Figure 6. The effect of multilayered developmental plans.
(A) The percentage of visible phenotypes out of the potential phenotypes as a function of the number of regulatory layers. The regulatory dimension, r, and the phenotypic dimension, k, are both set to 14. For a single regulatory layer, the visible phenotypes already constitute only 8.2% of the 2^14 potential phenotypes, in accordance with our results for the basic model. Introducing additional recurrent layers dramatically decreases the number of visible phenotypes (note the logarithmic scale), reaching 0.06% (approximately 10 phenotypes) with 50 layers. Furthermore, if each regulatory layer incorporates a different developmental plan, the reduction in the number of visible phenotypes as a function of the number of layers is even more extreme. (B) The distribution of the number of unique phenotypes that remain visible when the system reaches steady state.
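One way to read the recurrent multilayered plan is that the output of each layer becomes the regulatory input of the next, with the same square matrix reused at every layer. The sketch below follows that reading, which is an assumption, as are the threshold of 0 and the hypothetical function names.

```python
from itertools import product


def develop(state, plan, threshold=0):
    """Threshold map from a binary input vector to a binary output vector."""
    k = len(plan[0])
    return tuple(
        1 if sum(s * row[j] for s, row in zip(state, plan)) > threshold else 0
        for j in range(k)
    )


def visible_after_layers(plan, layers):
    """Phenotypes still reachable after feeding every genotype through
    `layers` successive applications of the same square plan (r == k)."""
    r = len(plan)
    if any(len(row) != r for row in plan):
        raise ValueError("a recurrent plan must be square (r == k)")
    states = set(product((0, 1), repeat=r))
    for _ in range(layers):
        # Applying the map to the set of surviving states is enough,
        # because identical states have identical futures.
        states = {develop(s, plan) for s in states}
    return states
```

Since each layer maps the surviving set into itself, the number of visible phenotypes can never increase with depth, which matches the monotone decline described in panel A.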
Figure 7. The effect of developmental plan density on phenotype distribution.
(A) The percentage of visible phenotypes out of the potential phenotypes as a function of the developmental plan density, c. The regulatory dimension, r, and the phenotypic dimension, k, are both set to 14. Each point represents the average of 1,000 different plans. For a given density value, c, each entry in the matrix is assigned a nonzero value (either +1 or −1) with probability c. (B) The number of variable traits, ν (i.e., phenotypic elements that are active in at least one phenotype), as a function of the developmental plan density, c. The experimental settings are identical to those described in Figure 7A. (C) The percentage of visible phenotypes out of the 2^ν achievable phenotypes as a function of the developmental plan density, c.
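The density experiment can be sketched as follows. Each matrix entry is nonzero with probability c, as the caption states; the equal split between +1 and −1, the threshold of 0, and the function names are assumptions made for illustration.

```python
import random
from itertools import product


def random_sparse_plan(r, k, density, rng):
    """r-by-k matrix whose entries are +1 or -1 with probability `density`, else 0."""
    return [
        [rng.choice((-1, 1)) if rng.random() < density else 0 for _ in range(k)]
        for _ in range(r)
    ]


def develop(genotype, plan, threshold=0):
    """Threshold map from a binary genotype to a binary phenotype (assumed form)."""
    k = len(plan[0])
    return tuple(
        1 if sum(g * row[j] for g, row in zip(genotype, plan)) > threshold else 0
        for j in range(k)
    )


def density_stats(plan):
    """Return (visible count, variable traits nu, visible / 2**nu) for one plan."""
    visible = {develop(g, plan) for g in product((0, 1), repeat=len(plan))}
    k = len(plan[0])
    # A trait is variable if it is active in at least one visible phenotype.
    nu = sum(1 for j in range(k) if any(p[j] for p in visible))
    return len(visible), nu, len(visible) / 2 ** nu


if __name__ == "__main__":
    rng = random.Random(1)
    for c in (0.1, 0.5, 1.0):
        print(c, density_stats(random_sparse_plan(8, 8, c, rng)))
```

Normalizing by 2^ν rather than 2^r separates two effects: sparse plans leave some traits permanently inactive, while dense plans restrict which combinations of the active traits can co-occur.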
Figure 8. The distribution of pairwise phenotypic Hamming distances among randomly selected phenotypes (not produced by a developmental plan) and visible phenotypes (including all occurrences of each phenotype) produced by developmental plans with varying levels of density, c.
Each curve represents the average of 100 different plans. Due to computational constraints, the regulatory dimension, r, and the phenotypic dimension, k, are both set to 10.
Figure 9. Simulating the evolutionary process forward through time.
Similar colors denote shared regulatory wiring.
Figure 10. Phenotype distribution in an ontogenetic-phylogenetic model.
(A) The average pairwise Hamming distance between visible phenotypes within and between phyla. Each phylum corresponds to a developmental plan, and the set of the most frequent visible phenotypes produced by this plan represents species. The ancestral phylum employs a developmental plan with r = 4 and k = 14. In each branching event, each of the two descendant phyla adds an additional regulatory element with random connectivities, preserving the ancestral component of the developmental plan (Figure 9). This branching process continues until we get the 1024 most recent phyla, each employing a developmental plan with r = 14 and k = 14. (B) A phylogenetic tree including phenotypes from derived and ancestral phyla. The tree is reconstructed by computing the pairwise Hamming distance matrix between all phenotypes and applying a neighbor-joining algorithm. Rectangular, triangular, and circular nodes represent phenotypes from the ancestral phylum, intermediate phyla, and derived phyla, respectively. Phyla within each phylogenetic level are illustrated with different colors. The small tree in the bottom left corner illustrates the phylogenetic tree of different developmental plans (using the same color coding as that used in the main tree). Phenotypes (or ‘species’) of different phyla differ only in the developmental plan and not in genotype, but the resulting tree successfully clusters the members of each phylum. Furthermore, the members of intermediate phyla are correctly clustered, spanning the same phylogenetic space as their descendants. Members of the ancestral phylum (represented by black rectangles) span similar regions to those covered by all derived phenotypes. (C) Representation of ancestral, intermediate, and derived phenotypes according to the first two principal components. Ellipses illustrate the mean and variance for each phylum. The color coding is identical to that used in the phylogenetic tree.
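The branching process in panel A can be sketched as a repeated doubling of developmental plans. Starting from r = 4, ten rounds of branching give 2^10 = 1024 phyla with r = 14, matching the numbers in the caption. The ±1 connectivities and the function names are assumptions; the tree reconstruction step is not shown.

```python
import random


def branch(plan, rng):
    """Return two descendant plans, each keeping every ancestral row and
    adding one new regulatory element with random +1/-1 connectivities."""
    k = len(plan[0])
    return [
        plan + [[rng.choice((-1, 1)) for _ in range(k)]]
        for _ in range(2)
    ]


def evolve_phyla(ancestor, rounds, rng):
    """Apply `rounds` successive branching events and return the most recent phyla."""
    level = [ancestor]
    for _ in range(rounds):
        level = [child for plan in level for child in branch(plan, rng)]
    return level


if __name__ == "__main__":
    rng = random.Random(0)
    ancestor = [[rng.choice((-1, 1)) for _ in range(14)] for _ in range(4)]
    phyla = evolve_phyla(ancestor, rounds=10, rng=rng)
    print(len(phyla), len(phyla[0]))
```

Sister phyla share all rows except the most recently added one, so relatedness between plans is encoded directly in the length of their common row prefix.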
