Sci Rep. 2021 Aug 19;11(1):16910.
doi: 10.1038/s41598-021-96265-z.

Robust and accurate prediction of protein-protein interactions by exploiting evolutionary information

Yang Li et al. Sci Rep.

Abstract

Protein-protein interactions (PPIs) carry out a wide range of biochemical functions in organisms. Recognizing PPIs is therefore essential for understanding most life activities, such as DNA replication and transcription, protein synthesis and secretion, signal transduction, and metabolism. Although high-throughput technology makes it possible to generate large-scale PPI data, it is costly in both time and labor and carries a risk of a high false-positive rate. The biology community is therefore looking for computational methods that can discover protein interactions quickly and efficiently at scale. In this paper, we propose a computational method for predicting PPIs from protein sequence information by combining orthogonal locality preserving projections (OLPP) with a rotation forest (RoF) classifier. Specifically, each protein sequence is first converted into a position-specific scoring matrix (PSSM) containing evolutionary information by using the Position-Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST). We then represent each protein as a fixed-length feature vector by applying OLPP to its PSSM. Finally, we train an RoF classifier to distinguish interacting from non-interacting protein pairs. The proposed method yielded significantly better results than existing methods, with prediction accuracies of 90.07% and 96.09% on the Yeast and Human datasets, respectively. Our experiments show that the proposed method can serve as a useful tool to accelerate the process of solving key problems in proteomics.
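
The data flow described in the abstract (a variable-length PSSM per protein, reduced to a fixed-length vector, then combined into one feature vector per protein pair) can be illustrated with a minimal sketch. This is not the authors' implementation: the column mean used below is a deliberately simple stand-in for OLPP, chosen only to show how proteins of different lengths end up with vectors of the same size. The function names `pssm_to_vector` and `pair_features`, the concatenation of the two vectors, and the toy matrices are all assumptions made for illustration.

```python
from typing import List

AMINO_ACIDS = 20  # a PSSM has one column per standard amino acid


def pssm_to_vector(pssm: List[List[float]]) -> List[float]:
    """Collapse an L x 20 PSSM into a 20-dimensional vector of column means.

    Stand-in for the OLPP step: OLPP learns a projection from the data,
    whereas this simply averages each column over the sequence positions.
    """
    if not pssm:
        raise ValueError("PSSM must have at least one row")
    if any(len(row) != AMINO_ACIDS for row in pssm):
        raise ValueError("each PSSM row must hold 20 scores")
    length = len(pssm)
    return [sum(row[j] for row in pssm) / length for j in range(AMINO_ACIDS)]


def pair_features(pssm_a: List[List[float]], pssm_b: List[List[float]]) -> List[float]:
    """Build one feature vector for a protein pair by concatenation (assumed)."""
    return pssm_to_vector(pssm_a) + pssm_to_vector(pssm_b)


# Toy PSSMs for two proteins of different lengths (values are made up).
protein_a = [[1.0] * AMINO_ACIDS, [3.0] * AMINO_ACIDS]
protein_b = [[0.0] * AMINO_ACIDS, [2.0] * AMINO_ACIDS, [4.0] * AMINO_ACIDS]

features = pair_features(protein_a, protein_b)
print(len(features))  # 40
```

In a full pipeline, vectors like `features`, labeled as interacting or non-interacting, would be the input to the classifier; the rotation forest itself is not sketched here.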

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1. The workflow of the proposed method.
Figure 2. The accuracy surface obtained from the RoF algorithm when optimizing parameters K and L.
Figure 3. ROC curves obtained using the proposed method on the Yeast dataset.
Figure 4. ROC curves obtained using the proposed method on the Human dataset.
Figure 5. ROC curves obtained using the SVM method on the Yeast dataset.
Figure 6. ROC curves obtained using the SVM method on the Human dataset.

