Proc Natl Acad Sci U S A. 2006 Oct 3;103(40):14724-31.
doi: 10.1073/pnas.0508637103. Epub 2006 Sep 26.

Genomic analysis of the hierarchical structure of regulatory networks

Haiyuan Yu et al. Proc Natl Acad Sci U S A. 2006.

Abstract

A fundamental question in biology is how the cell uses transcription factors (TFs) to coordinate the expression of thousands of genes in response to various stimuli. The relationships between TFs and their target genes can be modeled in terms of directed regulatory networks. These relationships, in turn, can be readily compared with commonplace "chain-of-command" structures in social networks, which have characteristic hierarchical layouts. Here, we develop algorithms for identifying generalized hierarchies (allowing for various loop structures) and use these approaches to illuminate extensive pyramid-shaped hierarchical structures existing in the regulatory networks of representative prokaryotes (Escherichia coli) and eukaryotes (Saccharomyces cerevisiae), with most TFs at the bottom levels and only a few master TFs on top. These masters are situated near the center of the protein-protein interaction network, a different type of network from the regulatory one, and they receive most of the input for the whole regulatory hierarchy through protein interactions. Moreover, they have maximal influence over other genes, in terms of affecting expression-level changes. Surprisingly, however, TFs at the bottom of the regulatory hierarchy are more essential to the viability of the cell. Finally, one might think master TFs achieve their wide influence through directly regulating many targets, but TFs with most direct targets are in the middle of the hierarchy. We find, in fact, that these midlevel TFs are "control bottlenecks" in the hierarchy, and this great degree of control for "middle managers" has parallels in efficient social structures in various corporate and governmental settings.

Conflict of interest statement

Conflict of interest statement: No conflicts declared.

Figures

Fig. 1.
Illustration of network motifs and the BFS-level method.
(A) Four common network motifs in social networks. Different colors represent different motifs. All four schematics came from real social networks shown in Fig. 17, which is published as supporting information on the PNAS web site.
(I) Single-input motifs (SIM). For example, node 1 is a professor or a director, and nodes 2 and 3 are his or her students or assistants, respectively. In the yeast regulatory network, node 1 is NDD1, and nodes 2 and 3 are STB5 and MCM21, whose only regulator is NDD1.
(II) Multi-input motifs (MIM). Nodes 1 and 2 can be professors, and nodes 3 and 4 can be two students that they coadvise. In Fig. 17B, nodes 1 and 2 are Senior Director and Executive Director, and nodes 3 and 4 are different departments that they cosupervise. In the yeast regulatory network, nodes 1 and 2 are FKH1 and FKH2. Together, they regulate node 3 (DBF2) and node 4 (HDR1).
(III) Feed-forward loop (FFL). For example, node 1 is the chairman of a department, node 2 is a professor in the department, and node 3 is a shared secretary. In the yeast regulatory network, node 1 (MBP1) regulates node 2 (SWI4). Then, they collectively regulate node 3 (SPT21).
(IV) Multicomponent loops (MCL). In Fig. 17D, node 1 is a chairman, node 2 is a director, node 3 is a coordinator, and node 4 is a scientist. Then some of the scientists form an advisory committee that oversees the chairman. In the yeast regulatory network, node 1 is REB1, node 2 is SIN3, node 3 is UME6, and node 4 is HSF1.
(B) Illustration of how to determine a generalized hierarchy using our BFS-level method.
(I) A toy example with all four motifs mentioned in A. Each color represents a motif (color coding is the same as in A).
(II) Finding all of the bottom (terminal) nodes in the network. A TF is a bottom node if and only if it does not regulate other TFs. TFs that regulate only themselves (i.e., autoregulation) are also considered bottom nodes. All bottom nodes in the network are colored red.
(III) Finding midlevel nodes. One performs a one-level-deep breadth-first search (BFS) starting at each of the bottom nodes to find what regulates them. Direct regulators of all bottom nodes are considered level-2 nodes, which are in green.
(IV) Finding topmost nodes. The procedure in the previous step (III) is repeated until all levels are determined. We call this overall process BFS-level. In this toy example, there are only three levels, and the node at the top level is in blue. However, in the yeast regulatory network, there are four levels.
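The BFS-level steps in panel B can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `bfs_levels`, the edge-list input format, and the toy node names are all hypothetical, and a TF is assigned the level at which the search first reaches it.

```python
from collections import deque

def bfs_levels(edges):
    """Assign a hierarchy level to each TF.

    edges: iterable of (regulator, target) pairs among TFs only.
    Returns a dict mapping TF -> level, with bottom nodes at level 1.
    """
    regulators = {}   # target -> set of its regulators (reversed edges)
    out_targets = {}  # regulator -> set of other TFs it regulates
    nodes = set()
    for reg, tgt in edges:
        nodes.update((reg, tgt))
        if reg == tgt:
            continue  # autoregulation does not count as regulating another TF
        out_targets.setdefault(reg, set()).add(tgt)
        regulators.setdefault(tgt, set()).add(reg)

    # Step II: bottom nodes regulate no other TF.
    level = {n: 1 for n in nodes if n not in out_targets}

    # Steps III and IV: walk upward one level at a time from the bottom nodes.
    queue = deque(sorted(level))
    while queue:
        node = queue.popleft()
        for reg in sorted(regulators.get(node, ())):
            if reg not in level:  # assumption: keep the first level reached
                level[reg] = level[node] + 1
                queue.append(reg)
    # TFs with no path down to any bottom node stay unassigned in this sketch.
    return level

# Hypothetical toy network: T is the master, M1 and M2 are midlevel,
# B1..B3 are bottom nodes, and B3 regulates only itself.
toy_edges = [
    ("T", "M1"), ("T", "M2"), ("M1", "M2"),
    ("M1", "B1"), ("M1", "B2"), ("M2", "B2"), ("M2", "B3"),
    ("B3", "B3"),
]
toy_levels = bfs_levels(toy_edges)
print(toy_levels)
```

In this toy network the regulation from M1 to M2 stays within level 2, which mirrors the same-level regulations that the generalized hierarchy allows.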
Fig. 2.
Common characteristics of the hierarchical structures of regulatory networks and the Macao governmental organization. (A) Illustration of the regulatory hierarchy in yeast (S. cerevisiae). The light blue arc arrows indicate the regulations between TFs at the same level. Many of these regulations are involved in loop structures (feed-forward and multicomponent loops). (B) Illustration of the Macao governmental hierarchy. The bottom layer consists of people who, based on the available information, do not manage anyone; they are similar to the non-TFs in yeast. Therefore, level 1 of the hierarchy consists of people managing those at the bottom. (C) Illustration of the regulatory hierarchy in E. coli. Average out-degree and total number of nodes at different levels are shown parallel to the hierarchies. P values in A and C were calculated by using Student's t tests to compare the average out-degree of level-1 TFs with that of the TFs at other levels. (D) Average betweenness at each level of the yeast hierarchy. P values were calculated by using Student's t tests to compare the average betweenness of the top and bottom TFs with that of the middle-level TFs. (E) Comparison between yeast regulatory and randomized networks.
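As a rough illustration of the per-level betweenness in panel D, the sketch below computes betweenness with Brandes' algorithm on an unweighted directed graph and averages it within each level. The caption does not say how betweenness was computed, so the directed, unnormalized definition used here is an assumption, and the names `betweenness`, `mean_by_level`, and the toy data are hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness (Brandes' algorithm), unweighted directed graph.

    adj: dict mapping node -> list of distinct successor nodes.
    """
    nodes = set(adj)
    for targets in adj.values():
        nodes.update(targets)
    score = {n: 0.0 for n in nodes}
    for s in nodes:
        order = []                        # nodes in order of discovery
        preds = {n: [] for n in nodes}    # shortest-path predecessors
        sigma = {n: 0 for n in nodes}     # number of shortest paths from s
        dist = {n: -1 for n in nodes}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj.get(v, ()):
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to s.
        delta = {n: 0.0 for n in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

def mean_by_level(values, level):
    """Average a per-node value within each hierarchy level."""
    grouped = {}
    for node, lvl in level.items():
        grouped.setdefault(lvl, []).append(values[node])
    return {lvl: sum(v) / len(v) for lvl, v in grouped.items()}

# Hypothetical toy hierarchy: every path from T to a bottom node passes through M.
toy_adj = {"T": ["M"], "M": ["B1", "B2"]}
toy_level = {"T": 3, "M": 2, "B1": 1, "B2": 1}
toy_betweenness = betweenness(toy_adj)
toy_means = mean_by_level(toy_betweenness, toy_level)
print(toy_means)
```

Here the midlevel node M lies on both shortest paths from T to the bottom nodes, so it alone has nonzero betweenness, the kind of pattern the "control bottleneck" description refers to.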
Fig. 3.
A biological example to illustrate the multistep cogitation processes in the regulatory hierarchy, showing aerobic growth mediated by Mot3. We divided the image into two parts, nucleus and cytoplasm, because TFs only function in the nucleus, whereas other proteins (such as the enzymes Put1, Put2, Uga1, Uga2, and Uga3) normally function in the cytoplasm.
Fig. 4.
Correlations between levels in the hierarchy and other topological and functional properties. (A and B) Average number of interaction partners (A) and average closeness (B) for TFs at each level. P values were calculated with Student's t tests to compare the top bar with the remaining bars combined. (C) Enrichment of functional categories relative to level 1. For each functional category in the Munich Information Center for Protein Sequences (MIPS) functional classification scheme, we calculated the percentage of interaction partners of TFs that have this function. The percentage of a certain category was then normalized against the corresponding one at level 1. Thus, all bars at level 1 have a value of 1. Because we were analyzing the transcriptional regulatory networks, we ignored the functional category “transcription.” P values were calculated with cumulative binomial distributions to compare the statistical significance of enrichment at level 4 to that of the sum of the other levels (see Supporting Text).
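The normalization described for panel C can be sketched as follows. This is an illustrative reading of the caption rather than the authors' code: the function names, the input layout (one set of category labels per interaction partner), and the toy categories are hypothetical, and "ignoring" the transcription category is interpreted here as simply not reporting it.

```python
def category_fractions(partners):
    """Fraction of interaction partners that carry each functional category.

    partners: list of sets of category labels, one set per partner.
    """
    counts = {}
    for categories in partners:
        for category in categories:
            counts[category] = counts.get(category, 0) + 1
    return {c: n / len(partners) for c, n in counts.items()}

def enrichment_vs_level1(partners_by_level, skip=frozenset({"transcription"})):
    """Normalize each level's category fractions against those at level 1."""
    base = category_fractions(partners_by_level[1])
    result = {}
    for level, partners in partners_by_level.items():
        fractions = category_fractions(partners)
        # Only categories present at level 1 can be normalized this way.
        result[level] = {
            c: fractions.get(c, 0.0) / base[c] for c in base if c not in skip
        }
    return result

# Hypothetical toy annotation of interaction partners at two levels.
toy_partners = {
    1: [{"metabolism"}, {"metabolism"}, {"signaling"}, {"transcription"}],
    4: [{"signaling"}, {"signaling"}, {"signaling", "metabolism"}, {"transcription"}],
}
toy_enrichment = enrichment_vs_level1(toy_partners)
print(toy_enrichment)
```

By construction every reported category has the value 1 at level 1, matching the statement in the caption.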
Fig. 5.
A biological example to illustrate that the top-level TFs receive internal and external signals through protein–protein interaction, showing unfolded protein response mediated by Ire1.
Fig. 6.
Correlations between levels in the hierarchy and other biological properties. (A) Deletion of TFs at higher levels disrupts the expression of more genes. A gene is defined as disrupted if P < 0.05, as determined by Rosetta knockout experiments (47). Because the knockout experiments were performed on only 41 TFs, t tests cannot be performed to examine the statistical significance of the differences between the average numbers of affected genes across different levels. Therefore, we performed a χ² test and found that deletion of TFs at higher levels disrupts the expression of more genes, which is statistically significant when compared with random expectation (P < 10⁻⁴⁵; see Supporting Text). (B) TFs at higher levels in the hierarchy have a strong tendency to have human homologs associated with cancer. P values measure the statistical significance of the difference between the fraction of human cancer gene homologs among TFs at a certain level and that at level 4. (C) TFs at the bottom of the yeast hierarchy have a strong tendency to be essential genes. P values measure the statistical significance of the difference between the fraction of essential genes among TFs at a certain level and that at level 2 and were calculated by using cumulative binomial distributions (see Supporting Text). (D) TFs at the bottom of the E. coli hierarchy have a strong tendency to be essential genes. All calculations are similar to those in C.
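For readers unfamiliar with the cumulative binomial P values mentioned in panels C and D, a minimal sketch follows. The exact null model and counts are given in the Supporting Text, which is not reproduced here, so the setup below is an assumption: it asks how likely it is to see at least `k` essential genes among `n` TFs at one level if each were essential with a reference probability `p`. The function name and the numbers in the usage lines are hypothetical.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical numbers: 2 of 2 TFs essential against a reference rate of 0.5.
p_all = binomial_upper_tail(2, 2, 0.5)
# At least 1 of 2 essential against the same reference rate.
p_any = binomial_upper_tail(1, 2, 0.5)
print(p_all, p_any)
```

A small tail probability indicates that the observed fraction is unlikely under the reference rate; the same function applies to an enrichment test of the kind described for Fig. 4C.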

References

    1. Lee TI, Rinaldi NJ, Robert F, Odom DT, Bar-Joseph Z, Gerber GK, Hannett NM, Harbison CT, Thompson CM, Simon I, et al. Science. 2002;298:799–804.
    2. Horak CE, Luscombe NM, Qian J, Bertone P, Piccirrillo S, Gerstein M, Snyder M. Genes Dev. 2002;16:3017–3033.
    3. Jeong H, Mason SP, Barabasi AL, Oltvai ZN. Nature. 2001;411:41–42.
    4. Qian J, Dolled-Filhart M, Lin J, Yu H, Gerstein M. J Mol Biol. 2001;314:1053–1066.
    5. Albert R, Barabasi AL. Rev Mod Phys. 2002;74:47–97.
