Review
Philos Trans R Soc Lond B Biol Sci. 2006 Mar 29;361(1467):507-17.
doi: 10.1098/rstb.2005.1807.

The origins and evolution of functional modules: lessons from protein complexes


Jose B Pereira-Leal et al. Philos Trans R Soc Lond B Biol Sci.

Abstract

Modularity is an attribute of a system that can be decomposed into a set of cohesive entities that are loosely coupled. Many cellular networks can be decomposed into functional modules, each functionally separable from the other modules. The protein complexes in physical protein interaction networks are a good example of this, and here we focus on their origins and evolution. We investigate the emergence of protein complexes and physical interactions between proteins by duplication, and review other mechanisms. We dissect the dataset of protein complexes of known three-dimensional structure, and show that roughly 90% of these complexes contain contacts between identical proteins within the same complex. Proteins that are shared across different complexes occur frequently, and they tend to be essential genes more often than members of a single protein complex. We also provide a perspective on the evolutionary mechanisms driving the growth of other modular cellular networks such as transcriptional regulatory and metabolic networks.


Figures

Figure 1
Hierarchical modularity in biology. (a) Modularity at the protein level—proteins consist of modules formed by domains. The S. cerevisiae uridylate kinase [1ukz] contains a single P-loop containing nucleotide triphosphate hydrolase domain as defined in the SCOP database (Murzin et al. 1995; Andreeva et al. 2004). A domain from the same superfamily is also found in the multi-domain protein EF-TU from T. thermophilus [1exm], also shown as a red cartoon. The two other domains belong to the SCOP superfamilies Translation proteins (pink) and EF-TU/eEF-1alpha/eIF2-gamma C-terminal domain (violet). (b) Modularity at the cellular level—most proteins work in a cooperative manner with other proteins and form functional modules. Here we show three distinct types of functional modules: (i) a protein complex—the B. taurus ATP synthase [1e79]. Here six protein chains (three red and three orange) that all contain the P-loop containing nucleotide triphosphate hydrolase domain (cartoon representation) as well as several other domains, assemble and form a ring; (ii) a signalling pathway—mating response MAPK pathway in yeast (Schwartz & Madhani 2004); (iii) a metabolic pathway—mevalonate pathway. (c) The diversity of cellular types is a consequence of the distinct arrangement of modules at lower levels, and cells are themselves modules from which tissues are built, such as the dendritic cells (grey) and lymphocytes (green) shown here (microscopy image courtesy of Milka Sarris).
Figure 2
DNA-directed RNA polymerase in the three major branches of the tree of life. The bacterial RNA polymerase structure is from T. aquaticus [1iw7]. The eukaryotic RNA polymerase is from S. cerevisiae [1i50]. No structure is available for the archaeal RNA polymerase, but its subunit composition is known to be similar to the eukaryotic enzyme (Bell & Jackson 2001), so it is displayed as a transparent version of the eukaryotic protein complex. The three RNA polymerases exhibit striking structural and functional similarities (Woychik & Reinberg 2001). Structurally and functionally equivalent protein chains are shown in the same colour.
Figure 3
Widespread presence of homomeric interactions in protein complexes. Homomers, protein complexes formed from multiple copies of the same protein, represent the majority (69.8%) of the protein complexes of known three-dimensional structure, as defined in the PQS database (Henrick & Thornton 1998). The remaining 30.2% are not pure homomers: about 20% of all complexes contain at least one interaction between identical chains, and only about 10% have no homomeric interactions at all. One structure exemplifies each category. From left to right: the carbonic anhydrase from M. thermophila [1qrf] is a homotrimer, the P. putida 2-oxoisovalerate dehydrogenase [1ps0] is a heterotetramer of two homodimers and the cap-binding protein from H. sapiens [1n52] is a heterodimer. The interactions in each complex are shown as red (homomeric) or green (heteromeric) lines between the nodes in a two-dimensional graph representation of each complex, where nodes of the same shape are identical proteins. The set of protein complexes was filtered according to this graph representation of complexes, so that identical or similar complex structures were not counted multiple times.
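The three categories in this caption can be illustrated with a minimal sketch. It assumes, hypothetically, that each complex is given as a mapping from chain identifier to protein label (identical proteins share a label) plus a list of contacting chain pairs. The function name, the input layout and the toy contact lists are illustrative assumptions; they are not the PQS format, the real contacts of the structures named above, or the authors' pipeline.

```python
from collections import Counter

def classify_complex(chain_proteins, contacts):
    """Assign one complex to a homomeric-content category.

    chain_proteins: dict of chain id -> protein label (hypothetical layout;
        identical proteins share a label).
    contacts: iterable of (chain_a, chain_b) pairs in physical contact.
    """
    labels = set(chain_proteins.values())
    # Pure homomer: several chains, all copies of the same protein.
    if len(chain_proteins) > 1 and len(labels) == 1:
        return "homomer"
    # Otherwise look for at least one contact between identical proteins.
    for a, b in contacts:
        if a != b and chain_proteins[a] == chain_proteins[b]:
            return "heteromer with homomeric contact"
    return "heteromer without homomeric contact"

# Toy complexes shaped like the three examples; contacts are invented.
toy_complexes = {
    "homotrimer": ({"A": "X", "B": "X", "C": "X"},
                   [("A", "B"), ("B", "C"), ("A", "C")]),
    "heterotetramer": ({"A": "X", "B": "X", "C": "Y", "D": "Y"},
                       [("A", "B"), ("C", "D"), ("A", "C"), ("B", "D")]),
    "heterodimer": ({"A": "X", "B": "Y"}, [("A", "B")]),
}

categories = {
    name: classify_complex(chains, contacts)
    for name, (chains, contacts) in toy_complexes.items()
}
# Fraction of complexes per category, analogous to the percentages quoted.
tally = Counter(categories.values())
fractions = {category: count / len(categories) for category, count in tally.items()}
```

With the caption's figures, the first two categories together (69.8% plus about 20%) give the roughly 90% of complexes with contacts between identical proteins quoted in the abstract.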
Figure 4
Duplication of protein complexes. The S. cerevisiae adaptin complexes AP1 and AP2 [1w63, 1gw5] have homologous protein chains and are known to have arisen by gene duplication events. Two other homologous complexes, AP3 and AP4 (not shown), have emerged by duplication in a similar manner. Using a conservative definition of homology between complexes, the extent of duplication in three different sets of yeast protein complexes is between 7 and 20%, as shown in the bar chart. MIPS: the manually curated catalogue of protein complexes in the MIPS Yeast Genome Database (Mewes et al. 2002). TAP: a set of protein complexes identified by high-throughput purification of tagged proteins and mass spectrometry (Gavin et al. 2002). HMS-PCI: another set of protein complexes identified by large-scale experiments in yeast (Ho et al. 2002). Circles with error bars represent the random expectation (Pereira-Leal & Teichmann 2005).
Figure 5
Proteins in multiple complexes are often essential. Using the three protein complex datasets in S. cerevisiae we determined the probability that a protein is essential (p(E)) if it is part of a single complex (white bars) or multiple complexes (black bars).
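One way to estimate p(E) for the two groups is sketched below. It assumes, hypothetically, a mapping from complex name to member proteins and a set of essential genes; the function name, the input layout and the toy data are illustrative only and do not reproduce the MIPS, TAP or HMS-PCI datasets or the authors' exact procedure.

```python
from collections import Counter

def essentiality_by_membership(complexes, essential):
    """Estimate p(E) for proteins in one complex versus several.

    complexes: dict of complex name -> iterable of protein ids (hypothetical).
    essential: set of protein ids annotated as essential (hypothetical).
    """
    # Count in how many complexes each protein appears.
    membership = Counter(
        protein for members in complexes.values() for protein in set(members)
    )
    single = [p for p, n in membership.items() if n == 1]
    multiple = [p for p, n in membership.items() if n > 1]

    def fraction_essential(proteins):
        # Empty groups have no defined probability.
        if not proteins:
            return float("nan")
        return sum(p in essential for p in proteins) / len(proteins)

    return {
        "single": fraction_essential(single),
        "multiple": fraction_essential(multiple),
    }

# Toy data: P2 and P3 are each shared between two complexes.
toy_complexes = {"c1": {"P1", "P2"}, "c2": {"P2", "P3"}, "c3": {"P3", "P4"}}
toy_essential = {"P2", "P3", "P4"}
p_essential = essentiality_by_membership(toy_complexes, toy_essential)
```

In this toy case the shared proteins are all essential and only half of the single-complex proteins are, which mirrors the direction of the trend described in the caption, not its magnitude.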

References

    1. Aharoni A, Gaidukov L, Khersonsky O, Mc Q.G.S, Roodveldt C, Tawfik D.S. The ‘evolvability’ of promiscuous protein functions. Nat. Genet. 2005;37:73–76.
    2. Amoutzias G.D, Robertson D.L, Oliver S.G, Bornberg-Bauer E. Convergent evolution of gene networks by single-gene duplications in higher eukaryotes. EMBO Rep. 2004;5:274–279. doi: 10.1038/sj.embor.7400096
    3. Amoutzias G.D, Weiner J, Bornberg-Bauer E. Phylogenetic profiling of protein interaction networks in eukaryotic transcription factors reveals focal proteins being ancestral to hubs. Gene. 2005;347:247–253. doi: 10.1016/j.gene.2004.12.031
    4. Andreeva A, Howorth D, Brenner S.E, Hubbard T.J, Chothia C, Murzin A.G. SCOP database in 2004: refinements integrate structure and sequence family data. Nucleic Acids Res. 2004;32:D226–D229. doi: 10.1093/nar/gkh039
    5. Babu M.M, Luscombe N.M, Aravind L, Gerstein M, Teichmann S.A. Structure and evolution of transcriptional regulatory networks. Curr. Opin. Struct. Biol. 2004;14:283–291. doi: 10.1016/j.sbi.2004.05.004
