Critical assessment of sequence-based protein-protein interaction prediction methods that do not require homologous protein sequences
- PMID: 20003442
- PMCID: PMC2803199
- DOI: 10.1186/1471-2105-10-419
Abstract
Background: Protein-protein interactions underlie many important biological processes. Computational prediction methods can complement experimental approaches for identifying protein-protein interactions. Recently, a unique category of sequence-based prediction methods has been put forward, unique in the sense that it does not require homologous protein sequences. This makes it applicable to all protein sequences, unlike many previous sequence-based prediction methods. If effective as claimed, these new sequence-based, universally applicable prediction methods would have far-reaching utility in many areas of biological research.
Results: On close survey, I found that many of these new methods were poorly tested. In addition, newer methods were often published without any performance comparison with previous ones. Thus, it is not clear how good they are or whether there are significant performance differences among them. In this study, I implemented and thoroughly tested four different methods on large-scale, non-redundant data sets. The tests reveal several important points. First, there are significant performance differences among the methods. Second, the data sets typically used for training prediction methods appear significantly biased, which limits the general applicability of prediction methods trained on them. Third, there is still ample room for further development. In addition, my analysis illustrates the importance of complementary performance measures, coupled with right-sized data sets, for meaningful benchmark tests.
Conclusions: The current study reveals the potential and the limits of the new category of sequence-based protein-protein interaction prediction methods, which in turn provides firm ground for future endeavours in this important area of contemporary bioinformatics.