PLoS Comput Biol. 2007 Dec;3(12):e239.
doi: 10.1371/journal.pcbi.0030239.

The modular organization of domain structures: insights into protein-protein binding

Antonio del Sol et al. PLoS Comput Biol. 2007 Dec.

Abstract

Domains are the building blocks of proteins and play a crucial role in protein-protein interactions. Here, we propose a new approach for the analysis and prediction of domain-domain interfaces. Our method, which relies on the representation of domains as residue-interacting networks, finds an optimal decomposition of domain structures into modules. The resulting modules comprise highly cooperative residues, which exhibit few connections with other modules. We found that non-overlapping binding sites in a domain, involved in different domain-domain interactions, are generally contained in different modules. This observation indicates that our modular decomposition is able to separate protein domains into regions with specialized functions. Our results show that modules with high modularity values identify binding site regions, demonstrating the predictive character of modularity. Furthermore, the combination of modularity with other characteristics, such as sequence conservation or surface patches, was found to improve our predictions. In an attempt to give a physical interpretation to the modular architecture of domains, we analyzed in detail six examples of protein domains with available experimental binding data. The modular configuration of the TEM1-beta-lactamase binding site illustrates the energetic independence of hotspots located in different modules and the cooperativity of those sited within the same modules. The energetic and structural cooperativity between intramodular residues is also clearly shown in the example of the chymotrypsin inhibitor, where non-binding site residues have a synergistic effect on binding. Interestingly, the binding site of the T cell receptor beta chain variable domain 2.1 is contained in one module, which includes structurally distant hot regions displaying positive cooperativity. These findings support the idea that modules possess certain functional and energetic independence. 
A modular organization of binding sites confers robustness and flexibility to the performance of the functional activity, and facilitates the evolution of protein interactions.
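The abstract describes modules as groups of residues that are densely connected internally and sparsely connected to other modules. A minimal sketch of how such a partition can be scored is given below, assuming the standard Newman-Girvan modularity Q = Σ_m [l_m/L − (d_m/2L)²]; whether the authors score modules with exactly this formula is not stated in this record, and the toy contact network and the function name `partition_modularity` are hypothetical.

```python
from collections import defaultdict


def partition_modularity(edges, module_of):
    """Newman-Girvan modularity Q = sum_m [ l_m / L - (d_m / (2L))^2 ].

    edges     -- list of (u, v) residue contacts (undirected, no duplicates)
    module_of -- dict mapping each residue to its module label
    """
    total_edges = len(edges)
    internal = defaultdict(int)  # l_m: contacts with both ends inside module m
    degree = defaultdict(int)    # d_m: summed degree of residues in module m
    for u, v in edges:
        degree[module_of[u]] += 1
        degree[module_of[v]] += 1
        if module_of[u] == module_of[v]:
            internal[module_of[u]] += 1
    return sum(
        internal[m] / total_edges - (degree[m] / (2 * total_edges)) ** 2
        for m in degree
    )


# Hypothetical toy network: two tightly knit residue triplets joined by one contact.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_modules = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_module = {node: "a" for node in range(6)}

q_split = partition_modularity(edges, two_modules)   # triplets kept separate
q_single = partition_modularity(edges, one_module)   # everything in one module
print(q_split, q_single)
```

In this toy case the two-module split scores higher than the single-module grouping, which is the sense in which a decomposition with few inter-module connections is preferred.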

Conflict of interest statement

Competing interests. The authors have declared that no competing interests exist.

Figures

Figure 1. Similarity in Modular Composition and Relative Interface Between Binding Sites
The Kringle domain (Pfam ID: PF00051; PDB ID: 1bht) has been chosen as an illustrative example of modular separation of binding sites. The two binding sites A (blue) and B (red) of this domain are represented in spacefill, with their interface residues depicted in balls and sticks. The interface between binding sites A (ten residues) and B (eight residues) involves four residues from A and three from B. The relative interface between these binding sites is C(A,B) = 0.39 (see Materials and Methods). The domain has been decomposed into five modules represented by the colored surfaces: 1 (green), 2 (yellow), 3 (olive), 4 (purple), and 5 (cyan). The modular compositions of binding sites A and B are (2,8,0,0,0) and (0,2,3,3,0), respectively. The similarity in modular composition of these binding sites is M(A,B) = 0.20.
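The definitions of C(A,B) and M(A,B) are given in the paper's Materials and Methods, which this record does not reproduce. As a hedged illustration only, the sketch below uses two simple formulas that happen to reproduce the numbers quoted in the caption: the fraction of all binding-site residues that lie at the interface, and the dot product of the per-module residue fractions. Both formulas and both function names are assumptions, not the authors' stated definitions.

```python
def relative_interface(size_a, size_b, interface_a, interface_b):
    """Assumed form: interface residues divided by all binding-site residues."""
    return (interface_a + interface_b) / (size_a + size_b)


def modular_similarity(composition_a, composition_b):
    """Assumed form: dot product of the two sites' per-module residue fractions."""
    total_a = sum(composition_a)
    total_b = sum(composition_b)
    return sum(
        (a / total_a) * (b / total_b)
        for a, b in zip(composition_a, composition_b)
    )


# Values taken from the Figure 1 caption for the Kringle domain.
c_ab = relative_interface(size_a=10, size_b=8, interface_a=4, interface_b=3)
m_ab = modular_similarity((2, 8, 0, 0, 0), (0, 2, 3, 3, 0))
print(round(c_ab, 2), round(m_ab, 2))
```

Under these assumed formulas, only module 2 (yellow) is shared between the two sites, so it alone contributes to the similarity.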
Figure 2. Relationship Between Relative Interface and Similarity in Modular Composition and its Statistical Validation
(A) Correlation between relative interface and similarity in modular composition between pairs of domain binding sites. The linear regression line corresponds to the correlation coefficient r = 0.86 with a statistically significant p = 5.89 × 10^−34. (B) Z-score frequency distribution of the correlation coefficient r for all pairs of randomly generated domain binding sites. The correlation coefficients are mainly distributed around 0, illustrating the independence between these two parameters for the random dataset, whereas the correlation for those pairs of domain binding sites in the analyzed dataset (indicated with the vertical arrow) has a statistically significant z-score = 6.0.
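The validation in panel (B) compares an observed correlation coefficient with the coefficients obtained from randomized binding sites. A minimal sketch of that kind of comparison follows, assuming a Pearson correlation and a z-score computed against the mean and population standard deviation of the random coefficients; the caption does not specify the exact estimator, and the function names are hypothetical.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


def z_score(observed, null_values):
    """Distance of `observed` from the null mean, in null standard deviations."""
    null_mean = statistics.fmean(null_values)
    null_sd = statistics.pstdev(null_values)  # population SD; an assumption
    return (observed - null_mean) / null_sd
```

In use, `observed` would be the correlation measured on the real dataset and `null_values` the correlations measured on each randomized dataset.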
Figure 3. Modularity Distribution of Functional Modules and the Signal-to-Noise Ratio
(A) Comparison between modularity distributions for functional modules (including at least 10% of binding site residues) in the analyzed dataset and in the set of randomly generated binding sites. In the analyzed dataset, a large percentage (72%) of modules exhibiting statistically significant values of modularity (z-score ≥ 2.0) correspond to functional modules, whereas this tendency is not observed in the random case. (B) Ratio between modularity distributions for the analyzed dataset and the random dataset. The ratio is significantly greater than one where z-score values are greater than or equal to 2.0.
Figure 4. Distribution of Binding Site Residues in Modules
Percentages of modules (y-axis) containing at least the fraction of binding site residues indicated on the x-axis. In the dataset, more than 75% of modules contain at least 10% of binding site residues.
Figure 5. Accuracy and Coverage for Different Methods
Accuracy and coverage values calculated for the functionally predicted modules based on modularity, sequence conservation, and surface patches. These values are also represented for the combination of modularity with sequence conservation and surface patches.
Figure 6. Examples of Modular Configuration of Domain Binding Sites
(A) Modular decomposition of the IL-4 domain binding site. The modular decomposition of the IL-4 domain is represented by the colored surface. The binding site of the interaction with its receptor subunit IL-4Rα is configured by three clusters that contribute independently to the binding free energy. The three clusters are located in three different modules: (1) cluster I is in the green module (I5, T6, E9, K12, T13); (2) cluster II is in the blue module (R53, Y56, R88); and (3) cluster III is in the olive module (Q78, R81, F82). Residues E9 and R88 are the two main hotspots of binding free energy. PDB ID: 2b8u, chain A.
(B) Modular decomposition of the TEM1 domain binding site. The ribbon representation is color-coded according to the modular decomposition of the TEM1 domain. The binding site of the interaction with its inhibitor BLIP contains two independent hot regions of binding free energy, which are located in two different modules: (1) red module (S130, K234, S235, R243); and (2) blue module (E104, Y105). PDB ID: 1jtg, chain A.
(C) Distant cooperative hot regions within the same module in TCR hVβ2.1. The surface of TCR hVβ2.1 is colored according to its modular decomposition. The two distant cooperative hot regions of binding free energy for the interactions with the superantigen TSST-1 are located in CDR2 (E51, S52a, K53) and FR3 (E61, K62). Both regions are located in the same module (green). PDB ID: 1ktk, chain E.
(D) Modular decomposition of hGHbp. Color-coded backbone representation of the modular decomposition of hGHbp. The three clusters in the hydrophilic periphery of the functional epitope, which contribute independently to the binding free energy, are located in three different modules: (1) E120, K121; (2) S98, S102; and (3) Q166, K167, V171. PDB ID: 3hhr, chain B.
(E) Modular decomposition of CI2. Representation of the CI2–subtilisin Novo complex. The modular decomposition of CI2 is depicted by color-coded ribbons. Residues R65, R67, T58, E60, and Y61, which display structural and energetic cooperativity, are located within the red module. PDB ID: 2sni, chain I.
(F) Modular decomposition of RI. The modular decomposition of RI is represented by the colored surface. Cooperative residues W261, W263, and W318 of the Trp-rich area are contained in the green module, whereas W375, whose contribution to the binding energy is additive with respect to the other tryptophans, belongs to the yellow module. The hotspot region 434–438 is located within the cyan module. PDB ID: 1a4y, chain D.
