PLoS Comput Biol. 2006 Sep 29;2(9):e124.
doi: 10.1371/journal.pcbi.0020124. Epub 2006 Jul 31.

The many faces of protein-protein interactions: A compendium of interface geometry

Wan Kyu Kim et al. PLoS Comput Biol. 2006.

Abstract

A systematic classification of protein-protein interfaces is a valuable resource for understanding the principles of molecular recognition and for modelling protein complexes. Here, we present a classification of domain interfaces according to their geometry. Our new algorithm uses a hybrid approach of both sequential and structural features. The accuracy is evaluated on a hand-curated dataset of 416 interfaces. Our hybrid procedure achieves 83% precision and 95% recall, which improves on the earlier sequence-based method by 5% on both measures. We classify virtually all domain interfaces of known structure, which results in nearly 6,000 distinct types of interfaces. In 40% of the cases, the interacting domain families associate in multiple orientations, suggesting that all the possible binding orientations need to be explored for modelling multidomain proteins and protein complexes. In general, hub proteins are shown to use distinct surface regions (multiple faces) for interactions with different partners. Our classification provides a convenient framework to query genuine gene fusion, which conserves binding orientation in both fused and separate forms. The result suggests that the binding orientations are not conserved in at least one-third of the gene fusion cases detected by a conventional sequence similarity search. We show that any evolutionary analysis on interfaces can be skewed by multiple binding orientations and multiple interaction partners. The taxonomic distribution of interface types suggests that ancient interfaces common to the three major kingdoms of life are enriched in symmetric homodimers. The classification results are online at http://www.scoppi.org.
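The precision and recall figures quoted above compare a computed classification against a hand-curated one. The abstract does not say how the two measures are counted, so the following is only a minimal sketch under one common assumption: that both are computed over pairs of interfaces placed in the same class. The function name `pairwise_precision_recall` and the toy labels are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def pairwise_precision_recall(predicted, reference):
    """Compare two clusterings by the item pairs each one groups together.

    predicted, reference: dicts mapping an interface id to a cluster label.
    Assumption (not from the source): a "positive" is a pair of interfaces
    assigned to the same cluster.
    """
    tp = fp = fn = 0
    for a, b in combinations(sorted(predicted), 2):
        same_pred = predicted[a] == predicted[b]
        same_ref = reference[a] == reference[b]
        if same_pred and same_ref:
            tp += 1          # pair grouped together in both clusterings
        elif same_pred:
            fp += 1          # grouped by the method but not by the curators
        elif same_ref:
            fn += 1          # grouped by the curators but missed by the method
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


# Toy labels, invented for illustration only.
reference = {"i1": "A", "i2": "A", "i3": "A", "i4": "B", "i5": "B"}
predicted = {"i1": 0, "i2": 0, "i3": 1, "i4": 1, "i5": 1}
precision, recall = pairwise_precision_recall(predicted, reference)
print(precision, recall)
```

Under this pair-counting assumption, splitting a true class lowers recall and merging two true classes lowers precision, which is the trade-off a clustering cut-off has to balance.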


Conflict of interest statement

Competing interests. The authors have declared that no competing interests exist.

Figures

Figure 1. Three Different Features Measuring the Similarity between Two Faces
(A) Two faces in the I set domain family (green and magenta) interacting with fibroblast growth factor (gray) in different binding orientations. The faces of the I set domains are shown as spheres. (B) IFT: on the aligned sequences, the interface residues (uppercase) are mapped to ones and all other residues (lowercase) to zeroes. The common patterns of interface residues are outlined with red boxes. The IFTs shown are simplified to illustrate only the minimal characteristics; in reality, an IFT has the same length as its aligned sequence. (C) Face overlap: the interface atoms at the intersection of the two faces are highlighted after superposition of the two I set domains. (D) Face angle: the angle between the centres of the two faces, measured at the common centre of the superposed I set domains (see Materials and Methods for details).
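Two of the three features in this caption can be illustrated with a short sketch. The IFT encoding follows the caption directly (uppercase to one, everything else to zero). The Tanimoto similarity used to compare two IFTs is an assumption made here for illustration, since this page does not state which measure the authors use. The face angle is computed as the angle at the common domain centre between the two face centres. All function names and example inputs are hypothetical.

```python
import math


def ift(aligned_seq):
    """Binary vector: 1 for interface residues (uppercase), 0 otherwise.

    Lowercase residues and gap characters both map to 0, so the vector
    has the same length as the aligned sequence.
    """
    return [1 if ch.isupper() else 0 for ch in aligned_seq]


def tanimoto(u, v):
    """Shared ones divided by positions set in either vector.

    Assumed similarity measure; the source does not specify one here.
    """
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return both / either if either else 0.0


def face_angle(face1_centre, face2_centre, domain_centre):
    """Angle in degrees, at domain_centre, between the two face centres."""
    v1 = [a - c for a, c in zip(face1_centre, domain_centre)]
    v2 = [a - c for a, c in zip(face2_centre, domain_centre)]
    norm = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(b * b for b in v2))
    if norm == 0:
        raise ValueError("a face centre coincides with the domain centre")
    cos = sum(a * b for a, b in zip(v1, v2)) / norm
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


# Invented aligned sequences and coordinates, for illustration only.
tag1 = ift("acDEFgh-k")
tag2 = ift("aCDEfgh-K")
print(tag1, tag2, tanimoto(tag1, tag2))
print(face_angle((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 0.0)))
```

A small angle together with a high IFT similarity would indicate that two domains of the same family use the same face; a large angle points to distinct faces, as in panel (A).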
Figure 2. Receiver Operating Characteristic Diagrams of Interface Classification by Hierarchical Clustering Using Different Linkage Methods
Single linkage, red empty rectangle; average linkage, green empty triangle; complete linkage, blue cross. The recall and the precision by the IFT clustering method (filled rectangle) and by the hybrid method (filled circle) are shown together for comparison. (A) DA-based classification. (B) DO-based classification.
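The three linkage methods compared in this figure differ only in how the distance between two clusters is derived from the distances between their members. The sketch below is a generic, pure-Python agglomerative clustering written for illustration; it is not the authors' implementation, and the function name `agglomerative`, the distance matrix, and the cut-off value are all hypothetical.

```python
def agglomerative(dist, threshold, linkage="average"):
    """Merge clusters bottom-up until the closest pair exceeds threshold.

    dist: symmetric matrix (list of lists) of pairwise distances.
    linkage: "single" (minimum), "complete" (maximum) or "average" (mean)
             of the member-to-member distances between two clusters.
    """
    combine = {
        "single": min,
        "complete": max,
        "average": lambda ds: sum(ds) / len(ds),
    }[linkage]
    clusters = [[i] for i in range(len(dist))]

    def cluster_distance(a, b):
        return combine([dist[i][j] for i in a for j in b])

    while len(clusters) > 1:
        # Closest pair of clusters under the chosen linkage.
        d, x, y = min(
            (cluster_distance(a, b), x, y)
            for x, a in enumerate(clusters)
            for y, b in enumerate(clusters)
            if x < y
        )
        if d > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Four invented items evenly spaced on a line; distances are illustrative.
points = [0.0, 1.0, 2.0, 3.0]
dist = [[abs(p - q) for q in points] for p in points]
for method in ("single", "average", "complete"):
    print(method, agglomerative(dist, threshold=1.5, linkage=method))
```

On this toy input, single linkage chains all four items into one cluster, while average and complete linkage stop at two clusters. That chaining behaviour is one reason the choice of linkage can shift the balance between recall and precision.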
Figure 3. Diverse Modes of Binding Orientations between Interacting Families
The domains of one family are superposed at the centre. Some binding orientations are omitted for a clear view. (A) Long-chain cytokines (centre) and fibronectin type III (peripheral). (B) Extended AAA–ATPase domain (centre) and DNA polymerase III clamp loader subunits, C-terminal domain (peripheral).
Figure 4. The Number of Different Interface Types between a Pair of Families
Figure 5. The Growth of Structures, SCOP Family, Family Pairs, and Interface Types
Figure 6. The Relationship between the Number of Partner Families and the Number of Faces per Family
The datapoints are jittered slightly to show the points of the same value.
Figure 7. A Schematic Diagram of Genuine and Nongenuine Fusions, Where Two Domains Exist Both in a Fused Form and as Separate Proteins
(A) Fused form. (B,C) Separate proteins. A genuine gene fusion conserves the binding orientation (A and C) but nongenuine fusion does not (A and B). A spurious gene fusion case can be found in a homodimer of a multidomain protein, where P1Q1 and P2Q2 are the same protein of identical sequence (D).
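The distinction drawn in this caption can be written as a simple rule. The sketch below assumes each observed interface already carries an interface-type label from the classification and a flag marking the homodimer-of-a-multidomain-protein case (D). The function name, the labels, and the flag are hypothetical, introduced only to illustrate the rule.

```python
def classify_fusion(fused_type, separate_observations):
    """Label a gene fusion case by comparing interface types.

    fused_type: interface type of the domain pair in the fused form (A).
    separate_observations: iterable of (interface_type, is_multidomain_homodimer)
        for the same domain pair seen between separate chains. The flag marks
        case (D), where both chains are copies of the same multidomain protein.
    """
    usable = [t for t, is_homodimer in separate_observations if not is_homodimer]
    if not usable:
        return "spurious"        # only case (D) evidence is available
    if fused_type in usable:
        return "genuine"         # binding orientation conserved (A and C)
    return "nongenuine"          # orientation differs (A and B)


# Invented type labels, for illustration only.
print(classify_fusion("type1", [("type1", False)]))
print(classify_fusion("type1", [("type2", False)]))
print(classify_fusion("type1", [("type1", True)]))
```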
Figure 8. Examples of Genuine Gene Fusion
(A) Heterodimeric interfaces between the CO dehydrogenase ISP C-domain–like family (gray) and the molybdenum cofactor-binding domain family (rainbow). The two families show a conserved binding orientation. Left: fused domain pair from aldehyde oxidoreductase of Desulfovibrio desulfuricans. Centre: two separate molecules in CO dehydrogenase of Hydrogenophaga pseudoflava. Right: two CO dehydrogenase ISP C-domains and one molybdenum cofactor-binding domain in CO dehydrogenase from Oligotropha carboxidovorans, showing one conserved and one variable interface type. (B) Homodimeric interfaces between two alpha-D-mannose–specific plant lectin families. Left: fused domain pair of Scilla campanulata agglutinin. Right: two separate molecules in Allium sativum lectin.
Figure 9. Examples of Nongenuine Gene Fusion, Where a Domain Pair Associates in Different Binding Orientations between Intra- and Inter-Types, Where an Additional Interaction Partner Occupies the Same Face of One Domain in Inter-Type
Intra-types are shown on the left, and inter-types are shown on the right. The domains in red are all in parallel orientations. (A) Homodimers of ricin B-like domains (red and green) and a DNase I-like domain (yellow) in the haemagglutinin component (HA1) of the progenitor toxin from Clostridium botulinum (left), and in Haemophilus ducreyi cytolethal distending toxin (right). Whereas the ricin-like domains of the haemagglutinin component (HA1) of Clostridium botulinum progenitor toxin (left) function by binding carbohydrates [61], a different association of these domains in the Haemophilus ducreyi holotoxin (right) gives rise to a completely different function. Here, both domains contribute to the formation of a groove that acts as a potential peptide binding site to initiate endocytosis of the holotoxin complex [62]. (B) Homodimers of the extended AAA-ATPase domain family (red and green) and a DNA polymerase III clamp loader subunits, C-terminal domain (yellow). The AAA-ATPase domains are known to couple ATP binding/hydrolysis to protein assembly/disassembly [63]. AAA-ATPase domains associate in different orientations in ClpB protein, a molecular chaperone that disaggregates stress-damaged proteins (left) [64], and in a DNA clamp loader complex (right). An additional domain, the DNA polymerase III clamp loader subunits, C-terminal domain (yellow), is present only in clamp loaders (right) [65].
Figure 10. The Conservation of Residues on the Surface of Ran GTPase
The conservation score is derived from Consurf–HSSP and is color-coded, with blue being most variable and red most conserved [42]. (A,B) Front and back of the less-conserved interfaces between Ran GTPase and Ran-binding protein (RBP, gray) in the Ran-binding domain complexed with Ran bound to a GTP analogue. (C,D) Front and back of the same Ran GTPase interacting with the regulator of chromosome condensation (RCC1, gray), which catalyses guanine nucleotide exchange on Ran. The highly conserved, prominent bulge protrudes into the cleft between the homodimer of RCC1 proteins. The GTP-binding pocket of Ran GTPase is also well conserved.
Figure 11. The Distribution of 127 Interfaces and Their Categories from 23 Family Pairs Common to All Three Kingdoms and Having Ten or More Species Diversity
The interfaces are categorized as homo or hetero. A sym-homo (symmetric homodimer) interface associates using faces of the same type, and an asym-homo (asymmetric homodimer) interface using faces of different types. The 20 common, or ancient, interfaces are mostly symmetric homodimers.

References

    1. Inbar Y, Benyamini H, Nussinov R, Wolfson HJ. Protein structure prediction via combinatorial assembly of sub-structural units. Bioinformatics. 2003;19(Supplement 1):i158–168.
    2. Lu L, Arakaki AK, Lu H, Skolnick J. Multimeric threading-based prediction of protein–protein interactions on a genomic scale: Application to the Saccharomyces cerevisiae proteome. Genome Res. 2003;13:1146–1154.
    3. Aloy P, Ceulemans H, Stark A, Russell RB. The relationship between sequence and interaction divergence in proteins. J Mol Biol. 2003;332:989–998.
    4. Aloy P, Bottcher B, Ceulemans H, Leutwein C, Mellwig C, et al. Structure-based assembly of protein complexes in yeast. Science. 2004;303:2026–2029.
    5. Prabu MM, Suguna K, Vijayan M. Variability in quaternary association of proteins with the same tertiary fold: A case study and rationalization involving legume lectins. Proteins. 1999;35:58–69.
