Constructing a robust protein-protein interaction network by integrating multiple public databases

Venkata-Swamy Martha et al. BMC Bioinformatics. 2011 Oct 18;12 Suppl 10(Suppl 10):S7.
doi: 10.1186/1471-2105-12-S10-S7.

Abstract

Background: Protein-protein interactions (PPIs) are a critical component of many underlying biological processes. A PPI network can provide insight into the mechanisms of these processes, as well as the relationships among the different proteins and toxicants potentially involved in them. Many PPI databases are publicly available, each with a specific focus. The challenge is how to combine their contents effectively to generate a robust and biologically relevant PPI network.

Methods: In this study, seven public PPI databases (BioGRID, DIP, HPRD, IntAct, MINT, REACTOME, and SPIKE) were used to explore an approach for combining multiple PPI databases into an integrated PPI network. We developed a novel method called k-votes and used it to create seven different integrated networks, one for each value of k from 1 to 7. Functional modules were mined using SCAN, a Structural Clustering Algorithm for Networks. Overall module quality was evaluated for each integrated network using the following statistical and biological measures: (1) modularity, (2) similarity-based modularity, (3) clustering score, and (4) enrichment.
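The abstract does not describe how SCAN decides which proteins belong together. As a rough illustration only: SCAN, as published by Xu et al. (2007), scores a pair of connected nodes by their structural similarity, the number of nodes in both closed neighborhoods divided by the geometric mean of the two neighborhood sizes, and keeps neighbors whose score is at least a threshold ε. The sketch below computes only that score and the resulting ε-neighborhood on a toy graph. The function names and the toy edges are hypothetical, and this is not the authors' implementation or a full clustering procedure.

```python
import math


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def structural_similarity(adj, u, v):
    """Shared closed neighborhood size over the geometric mean of both sizes."""
    gamma_u = adj[u] | {u}  # closed neighborhood: neighbors plus the node itself
    gamma_v = adj[v] | {v}
    return len(gamma_u & gamma_v) / math.sqrt(len(gamma_u) * len(gamma_v))


def epsilon_neighborhood(adj, u, eps):
    """Nodes in the closed neighborhood of u with similarity to u of at least eps."""
    return {v for v in adj[u] | {u} if structural_similarity(adj, u, v) >= eps}


# Toy graph (hypothetical): a triangle A-B-C with a pendant node D attached to C.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
toy_adj = build_adjacency(toy_edges)

sim_ab = structural_similarity(toy_adj, "A", "B")
sim_cd = structural_similarity(toy_adj, "C", "D")
core_like = epsilon_neighborhood(toy_adj, "C", 0.8)
```

In this toy graph, A and B share their entire closed neighborhood, so their similarity is 1, while the pendant node D scores lower against C and drops out of C's ε-neighborhood at ε = 0.8. Raising or lowering ε changes how strict that membership test is.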

Results: Each integrated human PPI network was constructed based on the number of votes (k) that a particular interaction received from the committee of the original seven PPI databases. The functional modules obtained by SCAN from each integrated network were evaluated, and the optimal value of k was determined from this functional module analysis. Our results demonstrate that the k-votes method outperforms the traditional union approach in terms of both statistical significance and biological meaning. The best network is achieved at k = 2, which is composed of interactions that are confirmed in at least two PPI databases. In contrast, the traditional union approach yields an integrated network that consists of all interactions from the seven PPI databases, which may contain a high proportion of false positives.
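The voting rule described here can be sketched in a few lines. The example below is a minimal illustration, assuming each database is reduced to a list of undirected protein pairs and that an interaction is kept when at least k databases report it (consistent with "at least two" for k = 2). The function names, database labels, and protein identifiers are hypothetical, and the authors' preprocessing, such as identifier mapping, is not reproduced.

```python
from collections import Counter


def normalize(edge):
    """Order the two protein identifiers so (a, b) and (b, a) count as one interaction."""
    a, b = edge
    return (a, b) if a <= b else (b, a)


def k_votes(databases, k):
    """Return the interactions reported by at least k of the given databases."""
    votes = Counter()
    for edges in databases.values():
        # Deduplicate within a database so each database casts one vote per interaction.
        votes.update({normalize(e) for e in edges})
    return {edge for edge, count in votes.items() if count >= k}


# Toy input (hypothetical): three small "databases" of protein pairs.
toy_databases = {
    "db_a": [("P1", "P2"), ("P2", "P3")],
    "db_b": [("P2", "P1"), ("P3", "P4")],
    "db_c": [("P1", "P2"), ("P3", "P4"), ("P4", "P5")],
}

union_network = k_votes(toy_databases, 1)      # k = 1 reproduces the plain union
consensus_network = k_votes(toy_databases, 2)  # interactions seen in two or more sources
```

With k = 1 the result is the union of all sources. Each increase in k can only remove interactions, so the networks are nested and the choice of k trades coverage against confidence.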

Conclusions: We determined that the k-votes method for constructing a robust PPI network by integrating multiple public databases outperforms previously reported approaches and that a value of k = 2 provides the best results. The strategies developed for combining databases show promise for advancing network construction and modeling.


Figures

Figure 1. Network modeling and evaluation flowchart. PPI data are taken from seven preprocessed public PPI databases and used to create seven integrated networks using the k-votes method (A). SCAN is used to generate functional modules for each of these integrated networks (B). Statistical and pathway analyses are performed on these functional modules to assess the networks (C).

Figure 2. Optimality measures for the seven consensus networks. Figure 2A shows the four optimality measures for Ĝ2: modularity, similarity-based modularity, clustering score, and enrichment score. Figure 2B shows the same measures for the other six consensus networks. The values of the optimality measures and the corresponding ε values are plotted on the y-axes and x-axes, respectively.


