Constructing a robust protein-protein interaction network by integrating multiple public databases
- PMID: 22165958
- PMCID: PMC3236850
- DOI: 10.1186/1471-2105-12-S10-S7
Abstract
Background: Protein-protein interactions (PPIs) are critical components of many biological processes. A PPI network can provide insight into the mechanisms of these processes, as well as the relationships among the proteins and toxicants potentially involved in them. Many PPI databases are publicly available, each with a specific focus. The challenge is to combine their contents effectively to generate a robust and biologically relevant PPI network.
Methods: We used seven public PPI databases (BioGRID, DIP, HPRD, IntAct, MINT, REACTOME, and SPIKE) to explore an approach for combining multiple PPI databases into an integrated PPI network. We developed a novel method, k-votes, which creates seven different integrated networks by varying k from 1 to 7. Functional modules were mined with SCAN, a Structural Clustering Algorithm for Networks. Overall module quality was evaluated for each integrated network using the following statistical and biological measures: (1) modularity, (2) similarity-based modularity, (3) clustering score, and (4) enrichment.
Results: Each integrated human PPI network was constructed according to the number of votes (k) that an interaction received from the committee of the seven original PPI databases. The functional modules obtained by SCAN from each integrated network were evaluated, and the optimal value of k was determined from this functional module analysis. Our results demonstrate that the k-votes method outperforms the traditional union approach in terms of both statistical significance and biological meaning. The best network is achieved at k = 2 and consists of interactions confirmed in at least two PPI databases. In contrast, the traditional union approach yields an integrated network containing all interactions from the seven PPI databases, which may include many false positives.
Conclusions: The k-votes method for constructing a robust PPI network by integrating multiple public databases outperforms previously reported approaches, and k = 2 provides the best results. The strategies developed for combining databases show promise for advancing network construction and modeling.
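The voting rule described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `k_votes`, the toy database names, and the protein labels are all hypothetical, and the sketch assumes that interactions are undirected and that each database casts at most one vote per interaction.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# An undirected interaction, stored so that (A, B) and (B, A) compare equal.
Edge = FrozenSet[str]


def k_votes(databases: Dict[str, Iterable[Tuple[str, str]]], k: int) -> Set[Edge]:
    """Return the interactions reported by at least k of the given databases.

    Assumption (not from the source): edges are undirected and duplicates
    within a single database count as one vote.
    """
    if not 1 <= k <= len(databases):
        raise ValueError("k must be between 1 and the number of databases")

    votes: Counter = Counter()
    for pairs in databases.values():
        # De-duplicate within one database before voting.
        unique_edges = {frozenset(pair) for pair in pairs}
        votes.update(unique_edges)

    return {edge for edge, count in votes.items() if count >= k}


# Hypothetical toy data: three "databases" over proteins A, B, C, D.
toy_databases = {
    "db1": [("A", "B"), ("B", "C")],
    "db2": [("B", "A"), ("C", "D")],
    "db3": [("A", "B"), ("C", "D"), ("A", "B")],
}

union_network = k_votes(toy_databases, 1)   # k = 1 is the traditional union
two_vote_network = k_votes(toy_databases, 2)
```

With this toy input, k = 1 reproduces the union of all three edge lists, while k = 2 keeps only A-B and C-D, the interactions that appear in at least two of the hypothetical databases.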
References
- Xu X, Yuruk N, Feng Z, Schweiger T. SCAN: a structural clustering algorithm for networks. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Jose, California, USA; 2007. pp. 824–833.