Proc Natl Acad Sci U S A. 2007 Mar 27;104(13):5516-20.
doi: 10.1073/pnas.0609023104. Epub 2007 Mar 19.

Hierarchy and feedback in the evolution of the Escherichia coli transcription network

M Cosentino Lagomarsino et al. Proc Natl Acad Sci U S A.

Abstract

The Escherichia coli transcription network has an essentially feedforward structure, with abundant feedback at the level of self-regulations. Here, we investigate how these properties emerged during evolution. An assessment of the role of gene duplication based on protein domain architecture shows that (i) transcriptional autoregulators have mostly arisen through duplication, whereas (ii) the expected feedback loops stemming from their initial cross-regulation are strongly selected against. This requires a divergent coevolution of the transcription factor DNA-binding sites and their respective DNA cis-regulatory regions. Moreover, we find that the network tends to grow by expansion of the existing hierarchical layers of computation, rather than by addition of new layers. We also argue that rewiring of regulatory links due to mutation/selection of novel transcription factor/DNA binding interactions appears not to significantly affect the network global hierarchy, and that horizontally transferred genes are mainly added at the bottom, as new target nodes. These findings highlight the important evolutionary roles of both duplication and selective deletion of cross-talks between autoregulators in the emergence of the hierarchical transcription network of E. coli.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Feedback and hierarchy in the E. coli transcription network. (a) Scheme of the layer structure of the network. Direction of regulatory links is from top to bottom. Each line represents a layer, populated by TFs (blue, thick line) and TGs (black, thin line). Members of layer i are regulated at most by i − 1 nodes plus themselves. By definition, layer one is constituted entirely by TFs. Annotations on the right side of the layers specify their population of TGs, TFs, and ARs. (b) Evaluation of feedback with the leaf-removal algorithm. (Right) Illustration of the leaf-removal algorithm. Leaves are nodes that do not regulate any other node. Removal of one leaf and its regulatory links may create a new leaf. Iterative removal of leaves has to stop at a core of nodes that contains loops (blue, circled nodes, dashed links). The core might contain tree-like components upstream of the loops (black). (Left) Histogram of the number of nodes in the core, N_C, for randomized counterparts of E. coli (16). The data refer to 1.1 × 10⁶ accepted MCMC moves for randomization (see Methods and Note 1 in SI Appendix). (c) Histogram of the layer number in the randomized counterparts of the E. coli network. The average number of observed layers is ≈12, compared with the 5 of E. coli. The data correspond to an MCMC run where a total of 5.78 × 10⁸ matrices were generated (of which ≈1.23 × 10⁸ were tree-like). (d) The flagella-building subnetwork is the only example of a functional subnetwork that spans all five layers. Here, this subnetwork is constructed arbitrarily starting from a member of layer one and following the tree downstream.
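The leaf-removal procedure described in panel (b) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `leaf_removal_core` and the edge-list input are hypothetical, and self-links are ignored when deciding whether a node is a leaf, which is an assumption made here so that autoregulators do not count as feedback loops.

```python
from collections import defaultdict


def leaf_removal_core(edges):
    """Return the nodes that survive iterative leaf removal.

    edges: iterable of (regulator, target) pairs.
    A leaf is a node that regulates no other node. Self-links are
    ignored here (an assumption of this sketch, not stated in the source).
    """
    out_neighbors = defaultdict(set)
    in_neighbors = defaultdict(set)
    nodes = set()
    for src, dst in edges:
        nodes.update((src, dst))
        if src == dst:
            continue  # skip autoregulation
        out_neighbors[src].add(dst)
        in_neighbors[dst].add(src)

    out_degree = {n: len(out_neighbors[n]) for n in nodes}
    stack = [n for n in nodes if out_degree[n] == 0]
    removed = set()
    while stack:
        leaf = stack.pop()
        removed.add(leaf)
        # Removing a leaf may turn its regulators into new leaves.
        for regulator in in_neighbors[leaf]:
            out_degree[regulator] -= 1
            if out_degree[regulator] == 0:
                stack.append(regulator)
    # What remains contains the loops plus any tree-like parts upstream.
    return nodes - removed


# A purely feedforward network is removed entirely.
feedforward = [("A", "B"), ("A", "C"), ("B", "C")]
# B and C regulate each other; A and E sit upstream of that loop.
with_loop = [("E", "A"), ("A", "A"), ("A", "B"), ("B", "C"), ("C", "B"), ("C", "D")]
print(leaf_removal_core(feedforward))
print(sorted(leaf_removal_core(with_loop)))
```

In the second toy network only D is ever a leaf, so the core keeps the B/C loop together with the upstream nodes A and E, matching the caption's remark that the core may contain tree-like components upstream of the loops.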
Fig. 2.
Evaluation of different evolutionary drives (see also SI Table 1 in SI Appendix). (a) Duplicates of ARs tend to retain their self-links. This is quantified globally by the observables h_AR, the average fraction of ARs in classes with two or more ARs, and g_AR, measuring the spread in the AR population among classes that can be observed in Fig. 3a and SI Fig. 7 in SI Appendix. (b) Duplication and divergence preserve the layer structure. The first column indicates distance between layers (defined as the absolute difference in layer numbers), whereas the second and the third correspond to the population of duplicate genes (genes in the same homology class) at that distance, in 10⁵ instances with randomized domain associations (average values) and the E. coli domain association data set respectively. For example, the first row (pairs of genes at distance zero) concerns the number of duplicate genes which occupy the same layer (see Fig. 3b and Note 2 in SI Appendix). The sketch in the right panel illustrates the distribution of nodes belonging to the same class of TFs (cyan) or TGs (yellow) among the layers, and the definition of distance between layers. (c) Fate of gene gains from horizontal transfer. TFs are underrepresented both in the class of gene gains (columns 2 and 3) and in the class of gene gains that have at least a paralog in the homology classes constructed with domain associations (columns 5 and 6).
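The tally in panel (b) counts pairs of genes from the same homology class at each layer distance. A minimal sketch of that count follows, assuming each gene has a single layer number and a single homology class; the function name and the two input dictionaries are hypothetical, and genes that belong to several classes through multiple domains are not handled.

```python
from collections import Counter
from itertools import combinations


def duplicate_pairs_by_layer_distance(layer_of, class_of):
    """Count same-class gene pairs by absolute difference in layer number.

    layer_of: dict gene -> layer number (int)
    class_of: dict gene -> homology class label (one class per gene,
              a simplifying assumption of this sketch)
    """
    members_by_class = {}
    for gene, cls in class_of.items():
        members_by_class.setdefault(cls, []).append(gene)

    counts = Counter()
    for members in members_by_class.values():
        for g1, g2 in combinations(members, 2):
            counts[abs(layer_of[g1] - layer_of[g2])] += 1
    return counts


# Toy data: three duplicates in class X, one unrelated gene in class Y.
layers = {"a": 1, "b": 1, "c": 3, "d": 2}
classes = {"a": "X", "b": "X", "c": "X", "d": "Y"}
print(dict(duplicate_pairs_by_layer_distance(layers, classes)))
```

Running the same function on data with randomized class labels would give the comparison column of the table; the distance-zero entry is the number of duplicate pairs sharing a layer.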
Fig. 3.
Duplication of ARs in the E. coli transcription network. (a) ARs are propagated by duplication of the network (see also Fig. 2a) and need to develop specificity by coevolution. (Upper) The mechanism for duplication. A is an AR. In an initial stage, the original, A, and its copy, A′, are identical. This creates a circuit where both A and A′ are ARs, and there are mutual cross-talk links (light blue). Subsequent divergence can erase the links (see Note 3 in SI Appendix). (Lower Left) Population of ARs in the homology classes in the E. coli network with original vs. randomized domain associations. The x axis reports the size of each class of transcription factors, whereas the y axis indicates the fraction of autoregulators in the class. The dashed line corresponds to the expected value computed from the total fraction of ARs. Red dots are randomized instances. (Lower Right) Histogram of the AR population (number of ARs in the class) of the largest homology class (having 15 members) for 10⁵ randomizations of the superfamily structural domains of the TFs, compared with the observed quantity in E. coli (diamond). In most (95%) of the randomizations, the class with 15 members contains <11 ARs, indicating that duplication is likely. (b) Layers tend to be populated by members of the same homology class. Comparison with randomizations of the structural domain associations of all of the genes. The x axis reports the total number of gene pairs of the same homology class belonging to the same layer. The histogram represents the randomized case, whereas the diamond indicates the observed value in E. coli.
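The comparison in the lower right panel of (a) can be illustrated with a simple label-shuffling test. This is a sketch under a swapped-in null model: class labels are permuted among TFs with class sizes held fixed, which is only a stand-in for the randomization of structural domain associations described in the caption. All names below are hypothetical.

```python
import random


def ar_count_in_largest_class(class_of, is_ar):
    """Number of autoregulators (ARs) in the largest homology class."""
    sizes = {}
    for cls in class_of.values():
        sizes[cls] = sizes.get(cls, 0) + 1
    largest = max(sizes, key=sizes.get)
    return sum(1 for tf, cls in class_of.items() if cls == largest and is_ar[tf])


def randomized_ar_counts(class_of, is_ar, n_random, seed=0):
    """AR counts in the largest class after shuffling class labels.

    Shuffling keeps every class size fixed (an assumption of this
    sketch; the source randomizes domain associations instead).
    """
    rng = random.Random(seed)
    tfs = list(class_of)
    labels = [class_of[tf] for tf in tfs]
    counts = []
    for _ in range(n_random):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        counts.append(ar_count_in_largest_class(dict(zip(tfs, shuffled)), is_ar))
    return counts


# Toy data: class X has three members, all ARs; class Y has two non-ARs.
classes = {"t1": "X", "t2": "X", "t3": "X", "t4": "Y", "t5": "Y"}
ars = {"t1": True, "t2": True, "t3": True, "t4": False, "t5": False}
observed = ar_count_in_largest_class(classes, ars)
null = randomized_ar_counts(classes, ars, n_random=1000)
fraction_below = sum(c < observed for c in null) / len(null)
print(observed, fraction_below)
```

The fraction of randomized instances with fewer ARs than observed plays the role of the 95% figure quoted in the caption, although the numbers from this toy example carry no biological meaning.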

References

    1. Babu MM, Luscombe NM, Aravind L, Gerstein M, Teichmann SA. Curr Opin Struct Biol. 2004;14:283–291.
    2. Shen-Orr SS, Milo R, Mangan S, Alon U. Nat Genet. 2002;31:64–68.
    3. Salgado H, Santos-Zavaleta A, Gama-Castro S, Peralta-Gil M, Penaloza-Spinola MI, Martinez-Antonio A, Karp PD, Collado-Vides J. BMC Bioinformatics. 2006;7:5.
    4. Lee TI, Rinaldi NJ, Robert F, Odom DT, Bar-Joseph Z, Gerber GK, Hannett NM, Harbison CT, Thompson CM, Simon I, et al. Science. 2002;298:799–804.
    5. Harbison CT, Gordon DB, Lee TI, Rinaldi NJ, Macisaac KD, Danford TW, Hannett NM, Tagne JB, Reynolds DB, Yoo J, et al. Nature. 2004;431:99–104.