Evolution. 2018 Jun;72(6):1242-1260.
doi: 10.1111/evo.13487. Epub 2018 May 25.

The architecture of an empirical genotype-phenotype map

José Aguilar-Rodríguez et al. Evolution. 2018 Jun.

Abstract

Recent advances in high-throughput technologies are bringing the study of empirical genotype-phenotype (GP) maps to the fore. Here, we use data from protein-binding microarrays to study an empirical GP map of transcription factor (TF)-binding preferences. In this map, each genotype is a DNA sequence. The phenotype of this DNA sequence is its ability to bind one or more TFs. We study this GP map using genotype networks, in which nodes represent genotypes with the same phenotype, and edges connect nodes if their genotypes differ by a single small mutation. We describe the structure and arrangement of genotype networks within the space of all possible binding sites for 525 TFs from three eukaryotic species encompassing three kingdoms of life (animals, plants, and fungi). We thus provide a high-resolution depiction of the architecture of an empirical GP map. Among a number of findings, we show that these genotype networks are "small-world" and assortative, and that they ubiquitously overlap and interface with one another. We also use polymorphism data from Arabidopsis thaliana to show how genotype network structure influences the evolution of TF-binding sites in vivo. We discuss our findings in the context of regulatory evolution.
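
The genotype network construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that a "single small mutation" is a single point mutation between equal-length sequences (the paper's definition may be broader), and the function names and the toy site list are hypothetical. It builds the network, extracts the largest connected component (the "dominant" network), and computes its characteristic path length and diameter.

```python
from collections import deque
from itertools import combinations


def differs_by_one(a, b):
    """True if two equal-length sequences differ at exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def build_network(genotypes):
    """Adjacency map: genotypes sharing a phenotype, linked by one point mutation."""
    adj = {g: set() for g in genotypes}
    for a, b in combinations(adj, 2):
        if differs_by_one(a, b):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def bfs_distances(adj, source):
    """Shortest mutational distance from source to every reachable genotype."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def dominant_component(adj):
    """Largest connected component of the network."""
    seen, best = set(), set()
    for g in adj:
        if g not in seen:
            comp = set(bfs_distances(adj, g))
            seen |= comp
            if len(comp) > len(best):
                best = comp
    return best


def path_statistics(adj, component):
    """Return (characteristic path length, diameter) of a connected component."""
    lengths = []
    for g in component:
        dist = bfs_distances(adj, g)
        lengths.extend(dist[h] for h in component if h != g)
    if not lengths:  # a single-node component has no paths
        return 0.0, 0
    return sum(lengths) / len(lengths), max(lengths)


# Hypothetical toy data: sequences assumed to be bound by the same TF.
sites = ["AAAA", "AAAT", "AATT", "ATTT", "GGGG"]
network = build_network(sites)
dominant = dominant_component(network)
mean_path, diameter = path_statistics(network, dominant)
```

In this toy example the first four sequences form a chain and the fifth is isolated, so the dominant network has four nodes. Real binding-site data would come from the protein-binding microarray measurements, which the sketch does not model.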

Keywords: Transcription factors; evolvability; molecular evolution; mutations.

Figures

Figure 1
Intranetwork statistics for 190 TFs from M. musculus. The distributions of genotype network (A) diameter, (B) characteristic path length, (C) clustering coefficient, and (D) assortativity. (E) Assortativity (vertical axis) and its relationship to the number of genotypes in the dominant genotype network (horizontal axis). The horizontal dashed line indicates an uncorrelated (nonassortative) mixing pattern. (F) The distribution of the genotype network route factor.
Figure 2
The structural properties of genotype networks are indicative of binding site diversity in extant populations of A. thaliana. Shannon's diversity of a TF's polymorphic binding sites is shown in relation to (A) the number of nodes, (B) characteristic path length, and (C) route factor of its genotype network. The label of the y‐axis applies to all panels.
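
As an illustration of the diversity measure named in this caption, the sketch below computes the standard Shannon index, H = -Σ p_i ln p_i, over the observed frequencies of binding-site variants. It assumes the natural logarithm and this textbook form; the source does not state the exact formula or logarithm base the authors used, and the function name and sample data are hypothetical.

```python
import math
from collections import Counter


def shannon_diversity(observed_sites):
    """Shannon index H = -sum(p * ln p) over variant frequencies."""
    counts = Counter(observed_sites)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical sample: binding-site variants observed across accessions.
sample = ["ACGT", "ACGT", "ACGA", "ACGA"]
diversity = shannon_diversity(sample)  # two equally frequent variants
```

A population with a single variant has diversity zero, and the value grows as variants become more numerous and more evenly represented.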
Figure 3
Matrices of internetwork relationships for the genotype networks of TF binding sites from M. musculus. Heatmaps of log10‐transformed (A) overlap and (B) φqp, the probability of mutating from the genotype network of phenotype p to the genotype network of phenotype q. The rows and columns are grouped according to binding domain, which are ordered alphabetically on the horizontal axis: A, AP‐2; B, ARID/BRIGHT; C, AT hook; D, bHLH; E, bZIP; F, C2H2 ZF; G, CxxC; H, E2F; I, Ets; J, Forkhead; K, GATA; L, GCM; M, Homeodomain; N, Homeodomain + POU; O, IRF; P, MADS box; Q, Myb/SANT; R, Ndt80/PhoG; S, Nuclear receptor; T, RFX; U, SAND; V, SMAD; W, Sox; X, T‐Box; Y, TBP. Within the DNA‐binding domain groups, the rows and columns are ordered by the size of each TF's dominant genotype network, such that network size increases from top to bottom and from left to right. Labels on the vertical axis indicate the name of the TFs, which can be read on the computer by zooming in. Cells colored in gray indicate either N/A values (on the diagonal) or values equal to zero (off‐diagonal).
Figure 4
Phenotype space covering. (A) The proportion of phenotypes covered as a function of the mutational radius n from a given binding site, averaged across all binding sites of the murine TF Sp110. The maximum proportion of phenotypes covered plateaus at a much lower level when considering just neutral mutations than when considering non‐neutral mutations. Error bars are the standard deviations of the mean. (B) The maximum proportion of phenotypes covered by neutral mutations as a function of the number of binding sites in the dominant genotype network, for all 190 murine TFs. The black line shows the fitted linear regression to the data (R² = 0.516) and the shaded gray area denotes 95% confidence intervals. The figure also shows the Spearman's correlation and its associated P‐value.
Figure 5
Matrices of internetwork relationships for the genotype networks of binding domains from M. musculus. Heatmaps of log10‐transformed (A) overlap and (B) φqp, the probability of mutating from the genotype network of phenotype p to the genotype network of phenotype q. Each row and column represents a different genotype network. Domains are ordered alphabetically. Cells colored in gray indicate either N/A values (on the diagonal) or values equal to zero (off‐diagonal).
