Bioinformatics. 2004 Aug 4;20 Suppl 1(Suppl 1):i371-8.
doi: 10.1093/bioinformatics/bth920.

A two-stage classifier for identification of protein-protein interface residues

Changhui Yan et al. Bioinformatics. 2004.

Abstract

Motivation: The ability to identify protein-protein interaction sites and to detect specific amino acid residues that contribute to the specificity and affinity of protein interactions has important implications for problems ranging from rational drug design to analysis of metabolic and signal transduction networks.

Results: We have developed a two-stage method consisting of a support vector machine (SVM) and a Bayesian classifier for predicting surface residues of a protein that participate in protein-protein interactions. This approach exploits the fact that interface residues tend to form clusters in the primary amino acid sequence. Our results show that the proposed two-stage classifier outperforms previously published sequence-based methods for predicting interface residues. We also present results obtained using the two-stage classifier on an independent test set of seven CAPRI (Critical Assessment of PRedicted Interactions) targets. The success of the predictions is validated by examining the predictions in the context of the three-dimensional structures of protein complexes.
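The idea of refining first-stage predictions with sequence neighbors can be illustrated with a minimal sketch. It is not the authors' implementation: the SVM stage is assumed to have already produced a 0/1 label per residue, the second stage is a plain naive Bayes classifier over a window of those labels, and the names `NaiveBayesStageTwo`, `window_features`, `half_width` and `alpha`, along with the toy training data, are all hypothetical.

```python
from collections import defaultdict
from math import log


def window_features(labels, i, half_width):
    """Stage-one labels around position i; None marks positions past either end."""
    n = len(labels)
    return tuple(labels[j] if 0 <= j < n else None
                 for j in range(i - half_width, i + half_width + 1))


class NaiveBayesStageTwo:
    """Hypothetical second stage: naive Bayes over windowed stage-one labels."""

    def __init__(self, half_width=1, alpha=1.0):
        self.half_width = half_width   # neighbors considered on each side (assumed)
        self.alpha = alpha             # Laplace smoothing constant (assumed)
        self.class_counts = {0: 0, 1: 0}
        self.feature_counts = defaultdict(float)  # (class, offset, value) -> count

    def fit(self, stage_one_sequences, true_sequences):
        for predicted, truth in zip(stage_one_sequences, true_sequences):
            for i, y in enumerate(truth):
                self.class_counts[y] += 1
                feats = window_features(predicted, i, self.half_width)
                for offset, value in enumerate(feats):
                    self.feature_counts[(y, offset, value)] += 1
        return self

    def _log_posterior(self, feats, y):
        total = sum(self.class_counts.values())
        score = log((self.class_counts[y] + self.alpha) / (total + 2 * self.alpha))
        for offset, value in enumerate(feats):
            # Each window slot takes one of three values: 0, 1 or None.
            numerator = self.feature_counts[(y, offset, value)] + self.alpha
            denominator = self.class_counts[y] + 3 * self.alpha
            score += log(numerator / denominator)
        return score

    def predict(self, stage_one_labels):
        refined = []
        for i in range(len(stage_one_labels)):
            feats = window_features(stage_one_labels, i, self.half_width)
            is_interface = self._log_posterior(feats, 1) > self._log_posterior(feats, 0)
            refined.append(1 if is_interface else 0)
        return refined


# Toy data (invented): stage-one output and true interface labels per residue.
stage_one_train = [
    [0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0],
]
truth_train = [
    [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
]

model = NaiveBayesStageTwo(half_width=1).fit(stage_one_train, truth_train)
refined = model.predict([0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0])
print(refined)
```

On this toy input the isolated stage-one prediction at index 2 is dropped, while the run of three consecutive predictions is kept. This is the kind of behavior a second stage can learn when true interface residues cluster in sequence.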

Figures

Fig. 1.
The schematic diagram of the two-stage classifier.
Fig. 2.
The likelihood that positions neighboring an interface residue also contain interface residues. Position 0 is an interface residue. Negative positions are on the N-terminal side of this target residue; positive positions are on the C-terminal side. A positive likelihood means that the position has a higher-than-random probability of also being an interface residue.
Fig. 3.
Representative prediction results on the 77 proteins. The target protein (for which the predictions are made) in each complex is shown in green, with residues of interest shown in spacefill and color coded as follows: red, interface residues identified as such by the classifier (TPs); yellow, interface residues missed by the classifier (FNs); and blue, residues incorrectly classified as interface residues (FPs). For clarity, interface residues for the partner protein in each complex (gray wireframe) are not shown. (A1) and (B1) are the predictions of the SVM method. (A2) and (B2) are the corresponding predictions of the two-stage method on the same proteins. A1, A2: predictions on BARSTAR from PDB 1brs; B1, B2: predictions on SEB from PDB 1seb. Structure diagrams were generated using RasMol (http://www.openrasmol.org/).
Fig. 4.
Specificity+ versus sensitivity+ plot of the two-stage method.
Fig. 5.
Test results on Fab HC63 in CAPRI target 03. Fab HC63 is shown in green, with residues of interest shown in spacefill and color coded as follows: red, TPs; yellow, FNs; and blue, FPs. For clarity, interface residues for hemagglutinin (gray wireframe) are not shown. Structure diagrams were generated using RasMol (http://www.openrasmol.org/).
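The quantities behind the figure captions can be made concrete with a short sketch. This page does not define specificity+ or sensitivity+, so the formulas below are an assumption: sensitivity+ is taken as TP/(TP+FN) and specificity+ as TP/(TP+FP). The function names and the example labels are hypothetical. The color mapping follows the captions: red for interface residues the classifier identified, yellow for interface residues it missed (false negatives), and blue for residues incorrectly classified as interface.

```python
def confusion_counts(true_labels, predicted_labels):
    """Count TP, FP, FN, TN for binary interface (1) / non-interface (0) labels."""
    tp = fp = fn = tn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == 1 and predicted == 1:
            tp += 1
        elif truth == 0 and predicted == 1:
            fp += 1
        elif truth == 1 and predicted == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity_plus(tp, fn):
    """Assumed definition: fraction of true interface residues that were predicted."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity_plus(tp, fp):
    """Assumed definition: fraction of predicted interface residues that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def residue_color(truth, predicted):
    """Color used in the structure figures; None for true negatives (not highlighted)."""
    if truth == 1 and predicted == 1:
        return "red"      # true positive
    if truth == 1 and predicted == 0:
        return "yellow"   # interface residue missed by the classifier
    if truth == 0 and predicted == 1:
        return "blue"     # incorrectly classified as interface
    return None


# Invented example labels for six surface residues.
true_labels = [1, 1, 0, 0, 1, 0]
predicted_labels = [1, 0, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(true_labels, predicted_labels)
colors = [residue_color(t, p) for t, p in zip(true_labels, predicted_labels)]
print(tp, fp, fn, tn)
print(sensitivity_plus(tp, fn), specificity_plus(tp, fp))
print(colors)
```

A plot such as the one in Fig. 4 would then be traced by recomputing the two values at different decision thresholds. The thresholding step is omitted here because the page gives no detail on it.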
