PLoS One. 2011 May 12;6(5):e19554.
doi: 10.1371/journal.pone.0019554.

Structural similarity and classification of protein interaction interfaces

Nan Zhao et al. PLoS One.

Abstract

Interactions between proteins play a key role in many cellular processes. Studying protein-protein interactions that share similar interaction interfaces may shed light on their evolution and could be helpful in elucidating the mechanisms behind stability and dynamics of the protein complexes. When two complexes share structurally similar subunits, the similarity of the interaction interfaces can be found through a structural superposition of the subunits. However, an accurate detection of similarity between the protein complexes containing subunits of unrelated structure remains an open problem. Here, we present an alignment-free machine learning approach to measure interface similarity. The approach relies on the feature-based representation of protein interfaces and does not depend on the superposition of the interacting subunit pairs. Specifically, we develop an SVM classifier of similar and dissimilar interfaces and derive a feature-based interface similarity measure. Next, the similarity measure is applied to a set of 2,806×2,806 binary complex pairs to build a hierarchical classification of protein-protein interactions. Finally, we explore case studies of similar interfaces from each level of the hierarchy, considering cases when the subunits forming interactions are either homologous or structurally unrelated. The analysis has suggested that the positions of charged residues in the homologous interfaces are not necessarily conserved and may exhibit more complex conservation patterns.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. A protocol for obtaining a reliable set of similar and dissimilar interface pairs.
First, two structure-based similarity measures, iiRMSD and siRMSD, are evaluated on a dataset collected from the 3D Complex database. Second, a non-redundant domain-domain interaction dataset is obtained from PDB, SCOP, and CATH. Third, iiRMSD is used to assign pairs of interaction interface structures to the positive (similar) and negative (dissimilar) training sets.
Figure 2. An overview of the machine learning approach used to derive the interface similarity measure.
First, interface structures are extracted from the training sets of similar and dissimilar interaction interfaces. Second, a 106-dimensional feature vector is calculated for each pair of interfaces. Third, a Support Vector Machine classifier is trained and evaluated using the above datasets. Last, a protein interface similarity measure δ(I1, I2) is defined for two interfaces, I1 and I2, as the distance between the corresponding 106-dimensional feature vector and the separating hyperplane.
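The last step of the caption can be illustrated with a minimal sketch. It assumes a linear separating hyperplane of the form w·x + b = 0, which the caption does not state (the classifier may well use a kernel), so the function name `hyperplane_distance`, the weight vector `w`, and the offset `b` are all hypothetical. The toy vectors are 2-dimensional rather than 106-dimensional.

```python
import math


def hyperplane_distance(x, w, b):
    """Signed distance from feature vector x to the hyperplane w.x + b = 0.

    Assumption: a linear decision boundary. The sign tells which side of
    the hyperplane x falls on; the magnitude is the geometric distance.
    """
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm == 0.0:
        raise ValueError("w must be non-zero")
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


# Hypothetical toy hyperplane, not taken from the paper.
w = [3.0, 4.0]
b = -5.0
print(hyperplane_distance([3.0, 4.0], w, b))  # 4.0
print(hyperplane_distance([0.0, 0.0], w, b))  # -1.0
```

Under this assumption, a pair of interfaces whose feature vector lies far on the "similar" side of the hyperplane gets a large positive value, and a pair far on the "dissimilar" side gets a large negative one.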
Figure 3. Hierarchical classification of interaction interfaces.
Similar shapes correspond to homologous proteins. Three levels of structurally similar interaction interfaces are defined. A single cluster at the H-level, C-level, and A-level can include homologous, common-partner analogous, and analogous interfaces, respectively.
Figure 4. Histograms of the distributions of (A) iiRMSD and (B) siRMSD values on the datasets of similar and dissimilar interfaces.
Both datasets are obtained from the 3D Complex database. On average, the dissimilar interface pairs had larger iiRMSD and siRMSD values (mean values are 20.6 and 15.8, respectively) than the similar pairs (mean values are 14.8 and 14.7). In addition, the difference in mean values between the similar and dissimilar interfaces was larger for the iiRMSD measure (Δμ is 4.7 for iiRMSD and 1.1 for siRMSD).
Figure 5. Distribution of SCOP class ID pairs from the training dataset of protein-protein interactions.
The dataset covers all SCOP class IDs, while the uneven distribution of the pairs is consistent with the unevenness in the overall distribution of protein structures across the SCOP classes.
Figure 6. Average silhouette value plotted against the number of clusters (K).
The clear knee point at K = 140 is selected as the number of clusters.
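For readers unfamiliar with the quantity on the vertical axis, the following minimal sketch computes the average silhouette value of one clustering from a precomputed distance matrix, using the standard silhouette definition. The function name `mean_silhouette` and the toy data are hypothetical, and the sketch does not reproduce the paper's clustering procedure or its knee-point selection.

```python
def mean_silhouette(dist, labels):
    """Average silhouette value for a clustering.

    dist   -- symmetric matrix (list of lists) of pairwise distances
    labels -- cluster label for each item, in the same order as dist
    """
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton's silhouette is defined as 0
        # a: mean distance to the other members of the item's own cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0.0:
            total += (b - a) / denom
    return total / len(labels)


# Toy data: four points on a line, forming two obvious groups.
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(p - q) for q in points] for p in points]
good = mean_silhouette(dist, [0, 0, 1, 1])  # groups match the data
bad = mean_silhouette(dist, [0, 1, 0, 1])   # groups cut across the data
print(good > bad)  # True
```

Repeating this calculation for clusterings obtained with different values of K gives a curve of the kind shown in the figure.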
Figure 7. Case studies of similar interactions.
(A) H-level interactions (iiRMSD = 2.93 Å), (B) C-level interactions (iiRMSD = 6.12 Å), and (C) A-level interactions (iiRMSD = 6.19 Å). Subunits from the first interaction, together with the corresponding interface and binding sites, are colored gold and light yellow. Subunits from the second interaction (and their interfaces and binding sites) are colored dark and light grey. Positively and negatively charged residues are colored blue and red, respectively, in the first interaction, and cyan and magenta, respectively, in the second interaction. Superposition refers to the superposed interactions, interfaces, and binding sites.
