Filtering high-throughput protein-protein interaction data using a combination of genomic features
- PMID: 15833142
- PMCID: PMC1127019
- DOI: 10.1186/1471-2105-6-100
Abstract
Background: Protein-protein interaction data used in the creation or prediction of molecular networks is usually obtained from large scale or high-throughput experiments. This experimental data is liable to contain a large number of spurious interactions. Hence, there is a need to validate the interactions and filter out the incorrect data before using them in prediction studies.
Results: In this study, we use a combination of three genomic features -- structurally known interacting Pfam domains, Gene Ontology annotations and sequence homology -- as a means to assign reliability to the protein-protein interactions in Saccharomyces cerevisiae determined by high-throughput experiments. Using Bayesian network approaches, we show that protein-protein interactions from high-throughput data supported by one or more genomic features have a higher likelihood ratio and hence are more likely to be real interactions. Our method has a high sensitivity (90%) and good specificity (63%). We show that 56% of the interactions from high-throughput experiments in Saccharomyces cerevisiae have high reliability. We use the method to estimate the proportion of true interactions in the high-throughput protein-protein interaction data sets in Caenorhabditis elegans, Drosophila melanogaster and Homo sapiens to be 27%, 18% and 68% respectively. Our results are available for searching and downloading at http://helix.protein.osaka-u.ac.jp/htp/.
Conclusion: A combination of genomic features that include sequence, structure and annotation information is a good predictor of true interactions in large and noisy high-throughput data sets. The method has a very high sensitivity and good specificity and can be used to assign a likelihood ratio, corresponding to the reliability, to each interaction.
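The idea of scoring an interaction by a likelihood ratio over several genomic features can be illustrated with a minimal sketch. It assumes the features are conditionally independent, so that the per-feature likelihood ratios multiply (a naive Bayes simplification; the abstract says only that Bayesian network approaches were used, so this is not necessarily the authors' model). The function names and all counts below are hypothetical and chosen only for illustration.

```python
from math import prod


def likelihood_ratio(pos_with, pos_total, neg_with, neg_total):
    """LR = P(feature | true interaction) / P(feature | false interaction).

    Counts come from a gold-standard positive set and a negative set
    (hypothetical here). Written as a single division to limit rounding error.
    """
    return (pos_with * neg_total) / (pos_total * neg_with)


def combined_lr(feature_lrs):
    """Combine per-feature LRs by multiplication.

    Assumption: features are conditionally independent (naive Bayes).
    """
    return prod(feature_lrs)


def sensitivity_specificity(scores_pos, scores_neg, cutoff):
    """Sensitivity and specificity when calling 'true' at score >= cutoff."""
    true_pos = sum(s >= cutoff for s in scores_pos)
    true_neg = sum(s < cutoff for s in scores_neg)
    return true_pos / len(scores_pos), true_neg / len(scores_neg)


# Hypothetical counts for the three feature types named in the abstract.
lr_domain = likelihood_ratio(40, 100, 5, 100)     # interacting Pfam domains
lr_go = likelihood_ratio(60, 100, 20, 100)        # shared GO annotation
lr_homology = likelihood_ratio(30, 100, 10, 100)  # sequence homology

# An interaction supported by all three features gets the product of the LRs.
score = combined_lr([lr_domain, lr_go, lr_homology])

# Evaluate a cutoff on hypothetical scored positive and negative sets.
sens, spec = sensitivity_specificity([72.0, 8.0, 3.0, 0.5], [0.2, 0.5, 3.0], 1.0)
```

Under this simplification, an interaction supported by more features accumulates a larger combined ratio, which matches the abstract's statement that support from one or more features raises the likelihood ratio. Sweeping the cutoff trades sensitivity against specificity.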