BMC Bioinformatics. 2008 Dec 22:9:553.
doi: 10.1186/1471-2105-9-553.

Prediction of protein-protein binding site by using core interface residue and support vector machine

Nan Li et al. BMC Bioinformatics. 2008.

Abstract

Background: The prediction of protein-protein binding sites can provide structural annotation for the protein interaction data generated by proteomics studies. This is important for the biological application of such data, the volume of which is increasing rapidly. Moreover, methods for predicting protein interaction sites can provide crucial information for improving the speed and accuracy of protein docking methods.

Results: In this work, we describe a binding site prediction method that uses a new residue neighbour profile and selects only the core interface residues for SVM training. The residue neighbour profile includes both the sequential and the spatial neighbour residues of an interface residue, which gives a more complete description of the physical and chemical characteristics surrounding the interface residue. The concept of the core interface is applied in selecting the interface residues for training the SVM models, which is shown to result in better discrimination between the core interface and other residues. The best SVM model was tested on a test set of 50 randomly selected proteins. The sensitivity, specificity, and MCC for the prediction of the core interface residues were 60.6%, 53.4%, and 0.243, respectively. On this test set, our method performed better than three other binding site prediction methods. Furthermore, our method was tested on the 101 unbound proteins from the protein-protein interaction benchmark v2.0. The sensitivity, specificity, and MCC of this test were 57.5%, 32.5%, and 0.168, respectively.
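The abstract reports sensitivity, specificity, and MCC but does not define them. The following is a minimal sketch of how such figures could be computed from per-residue confusion counts, assuming the standard textbook definitions. The function name `binary_metrics` and its return keys are hypothetical. Some interface-prediction papers use "specificity" to mean TP / (TP + FP), so that quantity is returned separately as `precision`; which definition the authors used is not stated here.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, precision, and MCC from confusion counts.

    Assumes the standard definitions; the paper's exact definitions are not
    given in the abstract.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "mcc": mcc,
    }

if __name__ == "__main__":
    # Illustrative counts only, not data from the paper.
    print(binary_metrics(tp=6, fp=2, tn=8, fn=4))
```

MCC is useful here because interface residues are a minority class, so accuracy alone would be dominated by the non-interface residues.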

Conclusion: By improving the description of the interface residues and their surrounding environment, as well as the training strategy, we obtained better SVM models that outperform previous methods. Our tests on the unbound protein structures suggest that further improvement is possible.


Figures

Figure 1
Comparison of the amino acid compositions of the neighbour residues for the three residue classes: the core interface (black), the rim interface (red), and the non-interface residues (blue). a) Core cut-off equal to 0.2. b) Core cut-off equal to 0.8.
Figure 2
The secondary structure compositions of the three residue classes: the core interface, the rim interface, and the non-interface residues. The black, red, and blue bars represent the percentage of helix, sheet, and coil, respectively, in each residue class. a) Core cut-off equal to 0.2. b) Core cut-off equal to 0.8.
Figure 3
Comparison of the atom compositions of the neighbour residues for the three residue classes: the core interface (black), the rim interface (red), and the non-interface residues (blue). Details of the 18 atom types can be found in Additional file 2. a) Core cut-off equal to 0.2. b) Core cut-off equal to 0.8.
Figure 4
The ROC curves for different SVM models. The gray curve is generated using models that discriminate interface from non-interface residues. The red, blue, and pink curves are generated using models that discriminate core interface from other residues, with core cut-offs of 0.2, 0.5, and 0.8, respectively. The AUCs for the gray, red, blue, and pink curves are 0.7385, 0.7498, 0.8184, and 0.9169, respectively.
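For readers unfamiliar with the AUC values quoted in this caption, the sketch below shows one common way to compute the area under a ROC curve: as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. This is an illustration under that assumption, not the authors' evaluation code; the name `roc_auc` and the example scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Rank-based AUC over binary labels (1 = positive, 0 = negative)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    # Compare every positive score against every negative score.
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

if __name__ == "__main__":
    # Illustrative scores only, not data from the paper.
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise loop is quadratic in the number of residues, which is fine for a small illustration; a sort-based implementation would be preferable for large test sets.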
Figure 5
The definition of the six local environment classes. This figure shows how the side-chain environment is classified. RASA stands for the relative accessible surface area, and FP stands for the fraction of the side-chain surface area that belongs to polar atoms. If RASA ≥ 0.36, the residue is assigned to class E (exposed). If 0.09 ≤ RASA < 0.36, the residue is assigned to class P (partially buried). Within class P, the residue is class P1 if FP < 0.67 and class P2 if FP ≥ 0.67. If RASA < 0.09, the residue is assigned to class B (buried). Within class B, the residue is class B1 if FP < 0.45, class B2 if 0.45 ≤ FP < 0.58, and class B3 if FP ≥ 0.58.
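The thresholds in this caption translate directly into a small decision function. The sketch below is one way to write it, assuming RASA and FP have already been computed as fractions between 0 and 1; the function name `classify_environment` is hypothetical and not taken from the paper.

```python
def classify_environment(rasa, fp):
    """Assign one of the six local environment classes from the caption.

    rasa: relative accessible surface area of the residue (fraction).
    fp:   fraction of the side-chain surface area that is polar.
    """
    if rasa >= 0.36:
        return "E"  # exposed
    if rasa >= 0.09:
        # partially buried: split on polar fraction
        return "P1" if fp < 0.67 else "P2"
    # buried: three sub-classes by polar fraction
    if fp < 0.45:
        return "B1"
    if fp < 0.58:
        return "B2"
    return "B3"

if __name__ == "__main__":
    # Illustrative inputs only.
    for rasa, fp in [(0.50, 0.10), (0.20, 0.50), (0.05, 0.50)]:
        print(rasa, fp, classify_environment(rasa, fp))
```

Checking RASA from the highest threshold downwards keeps each branch to a single comparison and makes the boundary handling (≥ on the lower edge of each range) match the caption.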
