Bioinformatics. 2009 Dec 1;25(23):3143-50.
doi: 10.1093/bioinformatics/btp551. Epub 2009 Sep 21.

How and when should interactome-derived clusters be used to predict functional modules and protein function?

Jimin Song et al. Bioinformatics.

Abstract

Motivation: Clustering of protein-protein interaction networks is one of the most common approaches for predicting functional modules, protein complexes and protein functions. But how well does clustering perform at these tasks?

Results: We develop a general framework to assess how well computationally derived clusters in physical interactomes overlap functional modules derived via the Gene Ontology (GO). Using this framework, we evaluate six diverse network clustering algorithms using Saccharomyces cerevisiae and show that (i) the performances of these algorithms can differ substantially when run on the same network and (ii) their relative performances change depending upon the topological characteristics of the network under consideration. For the specific task of function prediction in S.cerevisiae, we demonstrate that, surprisingly, a simple non-clustering guilt-by-association approach outperforms widely used clustering-based approaches that annotate a protein with the overrepresented biological process and cellular component terms in its cluster; this is true over the range of clustering algorithms considered. Further analysis parameterizes performance based on the number of annotated proteins, and suggests when clustering approaches should be used for interactome functional analyses. Overall, our results suggest a re-examination of when and how clustering approaches should be applied to physical interactomes, and establish guidelines by which novel clustering approaches for biological networks should be justified and evaluated with respect to functional analysis.
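
To make the guilt-by-association idea concrete, here is a minimal sketch of a neighbor-vote predictor. It is an illustration only: the abstract does not specify how the authors score or rank terms, so the counting rule, the function name `neighborhood_predict`, the `top_k` parameter and the toy protein and term identifiers are all assumptions.

```python
from collections import Counter


def neighborhood_predict(adjacency, annotations, protein, top_k=1):
    """Rank terms for `protein` by how many direct interaction partners carry them.

    adjacency:   dict mapping a protein to the set of its interaction partners
    annotations: dict mapping a protein to the set of terms it is annotated with
    (Hypothetical data layout, chosen for this sketch.)
    """
    counts = Counter()
    for neighbor in adjacency.get(protein, ()):
        # Unannotated neighbors contribute nothing to the vote.
        counts.update(annotations.get(neighbor, ()))
    # Most frequent term first; ties broken by term name for determinism.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_k]]


# Toy network: P1 is unannotated and interacts with P2, P3 and P4.
adjacency = {
    "P1": {"P2", "P3", "P4"},
    "P2": {"P1"},
    "P3": {"P1"},
    "P4": {"P1"},
}
# Placeholder term names, not real GO identifiers.
annotations = {
    "P2": {"TERM_A"},
    "P3": {"TERM_A", "TERM_B"},
    "P4": {"TERM_A", "TERM_B"},
}

predicted = neighborhood_predict(adjacency, annotations, "P1", top_k=2)
```

A cluster-based predictor would instead pool annotations over every protein assigned to the same cluster as the query, which is the contrast the abstract draws.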

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Performance as judged via three measures (Jaccard, PR and sDensity) of six clustering algorithms and OneCluster in how well they recapitulate (A) MIPS complexes, (B) BP modules and (C) CC modules from the HTP (top) and Y2H (bottom) S.cerevisiae networks.
Fig. 2.
Function prediction performance as protein annotations are removed. As BP (A) or CC (B) annotations are removed for 10%, 30%, 50%, 70% and 90% of the proteins in the HTP interaction network, the PR AUC of Neighborhood deteriorates more rapidly than that of any of the six clustering algorithms. The average PR AUC over 10 networks is plotted, with each error bar showing ±1SD from the average.
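
The Fig. 1 caption names a Jaccard measure for comparing clusters with reference modules. The sketch below shows only the standard Jaccard index between two protein sets, plus a best-match score of one cluster against a list of modules; how the paper aggregates such scores over a whole clustering is not stated here, so `best_match_jaccard` and the toy identifiers are assumptions made for illustration.

```python
def jaccard(cluster, module):
    """Standard Jaccard index: |intersection| / |union| of two protein sets."""
    cluster, module = set(cluster), set(module)
    union = cluster | module
    if not union:
        # Two empty sets: define the overlap as 0.0 to avoid dividing by zero.
        return 0.0
    return len(cluster & module) / len(union)


def best_match_jaccard(cluster, modules):
    """Highest Jaccard index between `cluster` and any reference module.

    Hypothetical helper; returns 0.0 when no modules are given.
    """
    return max((jaccard(cluster, module) for module in modules), default=0.0)


# Toy example with placeholder protein names.
cluster = {"P1", "P2", "P3"}
modules = [{"P2", "P3", "P4"}, {"P5", "P6"}]

score = best_match_jaccard(cluster, modules)
```

In this toy case the cluster shares two of four distinct proteins with the first module, so the best-match score is 0.5.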
