BMC Bioinformatics. 2010 Dec 21;11:605.
doi: 10.1186/1471-2105-11-605.

New insights into protein-protein interaction data lead to increased estimates of the S. cerevisiae interactome size

Laure Sambourg et al. BMC Bioinformatics.

Abstract

Background: As protein interactions mediate most cellular mechanisms, protein-protein interaction networks are essential in the study of cellular processes. Consequently, several large-scale interactome mapping projects have been undertaken, and protein-protein interactions are being distilled into databases through literature curation; yet protein-protein interaction data are still far from comprehensive, even in the model organism Saccharomyces cerevisiae. Estimating the interactome size is important for evaluating the completeness of current datasets, in order to measure the remaining efforts that are required.

Results: We examined the yeast interactome from a new perspective, by taking into account how thoroughly proteins have been studied. We discovered that the set of literature-curated protein-protein interactions is qualitatively different when restricted to proteins that have received extensive attention from the scientific community. In particular, these interactions are less often supported by yeast two-hybrid, and more often by more complex experiments such as biochemical activity assays. Our analysis showed that high-throughput and literature-curated interactome datasets are more correlated than commonly assumed, but that this bias can be corrected for by focusing on well-studied proteins. We thus propose a simple and reliable method to estimate the size of an interactome, combining literature-curated data involving well-studied proteins with high-throughput data. It yields an estimate of at least 37,600 direct physical protein-protein interactions in S. cerevisiae.

Conclusions: Our method leads to higher and more accurate estimates of the interactome size, as it accounts for interactions that are genuine yet difficult to detect with commonly-used experimental assays. This shows that we are even further from completing the yeast interactome map than previously expected.
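
The abstract does not spell out the estimation formula, so the following is only a minimal sketch of one common way such an estimate can be formed, not the authors' exact method. It assumes that the coverage of well-studied literature-curated interactions by a high-throughput (HT) dataset approximates that dataset's sensitivity, and that the HT dataset's false discovery rate (FDR) is known. The function name, the `pair` helper, and the toy data are all hypothetical.

```python
def pair(a, b):
    """Hypothetical helper: represent an undirected interaction as a frozenset."""
    return frozenset((a, b))


def estimate_interactome_size(ht_interactions, lc_well_studied, ht_fdr):
    """Illustrative estimate (an assumption, not the paper's stated formula).

    size ~= (true positives in the HT dataset) / (HT coverage of the
    well-studied literature-curated reference set)
    """
    if not lc_well_studied:
        raise ValueError("reference set of well-studied interactions is empty")
    if not 0.0 <= ht_fdr < 1.0:
        raise ValueError("FDR must be in [0, 1)")

    # Fraction of reference interactions recovered by the HT dataset.
    covered = sum(1 for p in lc_well_studied if p in ht_interactions)
    coverage = covered / len(lc_well_studied)
    if coverage == 0:
        raise ValueError("no overlap: coverage is zero, estimate undefined")

    # Discount the HT dataset by its assumed false discovery rate.
    true_positives = len(ht_interactions) * (1.0 - ht_fdr)
    return true_positives / coverage


# Toy data only: 8 HT interactions, 4 reference interactions, 2 shared.
ht = {pair("A", "B"), pair("A", "C"), pair("B", "D"), pair("C", "D"),
      pair("E", "F"), pair("E", "G"), pair("F", "H"), pair("G", "H")}
lc = {pair("B", "A"), pair("C", "D"), pair("A", "D"), pair("B", "C")}

estimate = estimate_interactome_size(ht, lc, ht_fdr=0.25)
print(estimate)
```

With these toy sets, half of the reference interactions are recovered and three quarters of the HT interactions are assumed genuine. Restricting the reference set to well-studied proteins would lower the measured coverage if those interactions are harder to detect, which in turn raises the estimate.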

Figures

Figure 1
Increased coverage by literature-curated datasets of interactions that are easier to detect by Y2H. The proportion of Ito interactions present in LowBP-LC and in LowBP-LC-pre2000 (literature-curated interactions reported before 2000) is plotted as a function of the number of IST hits. Each point represents at least 200 interactions, and the number of IST hits is the weighted mean for these interactions.
Figure 2
Relation between the level of study and the degree of proteins in various datasets. Log-log scale linear regression between the number of interactions (in the indicated dataset) involving a protein and the number of papers referencing that protein, using binned data (each point represents 5 proteins). (a) LowBP-LC interactions, R² = 0.59, P = 2 × 10⁻¹⁰³, slope = 0.48. (b) Y2H-Union interactions, R² = 0.04, P = 1.0 × 10⁻⁴, slope = 0.08. (c) Tarassov interactions, R² = 0.01, P = 0.07, slope = 0.07.
Figure 3
Coverage of LowBP-LC well-studied by each high-throughput dataset. The proportion of LowBP-LC interactions involving well-studied proteins that are covered by each HT dataset is plotted as a function of the 'well-studied cutoff', i.e. the minimum number of papers referencing a protein for it to be considered well-studied.
Figure 4
Number of well-studied proteins. The number of proteins in the well-studied subset is plotted as a function of the well-studied cutoff value. The main figure is restricted to proteins cited in at least 50 papers, while the inset shows the complete graph (starting at one paper). The well-studied cutoff value is the minimum number of papers referencing a protein, for this protein to be considered well-studied.
Figure 5
Estimated size of the yeast interactome. The predicted number of binary physical protein-protein interactions that can occur in S. cerevisiae is plotted as a function of the well-studied cutoff value, using each high-throughput dataset and a CCSB-YI1 FDR of 0.25. The well-studied cutoff value is the minimum number of papers referencing a protein, for this protein to be considered well-studied.
Figure 6
Influence of the CCSB-YI1 FDR on the estimated interactome size. The predicted size of the S. cerevisiae interactome is plotted using each high-throughput dataset, when the CCSB-YI1 FDR ranges from 0.15 to 0.35. The well-studied cutoff (number of papers for a protein to be considered well-studied) is set at 125 papers.
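
The binned log-log regression described in the Figure 2 caption can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: proteins are sorted by paper count, grouped into bins of 5, each bin is summarised by its mean (the caption does not say which summary was used), and bins with a zero mean are skipped because their logarithm is undefined. The function name and the synthetic data are hypothetical.

```python
import math


def binned_loglog_fit(papers, degrees, bin_size=5):
    """Ordinary least squares on log10(bin mean papers) vs log10(bin mean degree).

    Returns (slope, intercept, r_squared). Binning by mean is an assumption.
    """
    pairs = sorted(zip(papers, degrees))
    xs, ys = [], []
    # Step through complete bins only; a trailing partial bin is dropped.
    for i in range(0, len(pairs) - bin_size + 1, bin_size):
        chunk = pairs[i:i + bin_size]
        mean_papers = sum(p for p, _ in chunk) / bin_size
        mean_degree = sum(d for _, d in chunk) / bin_size
        if mean_papers > 0 and mean_degree > 0:
            xs.append(math.log10(mean_papers))
            ys.append(math.log10(mean_degree))

    n = len(xs)
    if n < 2:
        raise ValueError("need at least two usable bins")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all bins have the same paper count")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


# Synthetic data only: degree grows as the square root of the paper count.
toy_papers = [1] * 5 + [4] * 5 + [16] * 5 + [64] * 5
toy_degrees = [1] * 5 + [2] * 5 + [4] * 5 + [8] * 5
slope, intercept, r2 = binned_loglog_fit(toy_papers, toy_degrees)
print(slope, intercept, r2)
```

On a log-log scale the slope is the exponent of a power-law relation between papers and degree, so a slope near zero, as reported for the Y2H-Union and Tarassov datasets, indicates that degree barely depends on how much a protein has been studied.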
