PLoS Comput Biol. 2007 Nov;3(11):e214.
doi: 10.1371/journal.pcbi.0030214. Epub 2007 Sep 21.

Where have all the interactions gone? Estimating the coverage of two-hybrid protein interaction maps

Hailiang Huang et al. PLoS Comput Biol. 2007 Nov.

Abstract

Yeast two-hybrid screens are an important method for mapping pairwise physical interactions between proteins. The fraction of interactions detected in independent screens can be very small, and an outstanding challenge is to determine the reason for the low overlap. Low overlap can arise from either a high false-discovery rate (interaction sets have low overlap because each set is contaminated by a large number of stochastic false-positive interactions) or a high false-negative rate (interaction sets have low overlap because each misses many true interactions). We extend capture-recapture theory to provide the first unified model for false-positive and false-negative rates for two-hybrid screens. Analysis of yeast, worm, and fly data indicates that 25% to 45% of the reported interactions are likely false positives. Membrane proteins have higher false-discovery rates on average, and signal transduction proteins have lower rates. The overall false-negative rate ranges from 75% for worm to 90% for fly, which arises from a roughly 50% false-negative rate due to statistical undersampling and a 55% to 85% false-negative rate due to proteins that appear to be systematically lost from the assays. Finally, statistical model selection conclusively rejects the Erdős-Rényi network model in favor of the power law model for yeast and the truncated power law for worm and fly degree distributions. Much as genome sequencing coverage estimates were essential for planning the human genome sequencing project, the coverage estimates developed here will be valuable for guiding future proteomic screens. All software and datasets are available from our Web site, http://www.baderzone.org.
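To make the overlap argument concrete, the sketch below shows the classical two-sample capture-recapture (Lincoln-Petersen) estimator and a simple way to combine the two false-negative sources quoted in the abstract. This is not the authors' extended model: the classical estimator has no notion of false positives, the screen sizes are hypothetical, and the assumption that undersampling and systematic loss act independently is ours, made only for illustration.

```python
def lincoln_petersen(n1: int, n2: int, overlap: int) -> float:
    """Classical capture-recapture estimate of the total number of interactions.

    n1, n2  : interactions reported by two independent screens
    overlap : interactions reported by both screens
    """
    if overlap <= 0:
        raise ValueError("overlap must be positive for the estimate to be defined")
    return n1 * n2 / overlap


def combined_false_negative(fn_sampling: float, fn_systematic: float) -> float:
    """Overall false-negative rate, ASSUMING the two loss mechanisms are independent."""
    return 1.0 - (1.0 - fn_sampling) * (1.0 - fn_systematic)


# Hypothetical screen sizes, not taken from the paper.
estimated_total = lincoln_petersen(n1=1000, n2=800, overlap=200)
coverage_screen_1 = 1000 / estimated_total  # fraction of the estimated total seen by screen 1

# Rates quoted in the abstract: roughly 50% undersampling, 55% to 85% systematic loss.
low = combined_false_negative(0.50, 0.55)
high = combined_false_negative(0.50, 0.85)
print(estimated_total, coverage_screen_1, round(low, 3), round(high, 3))
```

Under the independence assumption, the combined rates come out near 78% and 93%, in the same range as the 75% to 90% overall false-negative rates reported above. A small overlap between screens drives the classical estimate of the total upward, which is why false positives, which also reduce overlap, need to be modeled jointly.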

Conflict of interest statement

Competing interests. The authors have declared that no competing interests exist.

Figures

Figure 1. Flowchart for Yeast Two-Hybrid Screens Indicates Systematic and Stochastic Sources of False Negatives and Stochastic Sources of False Positives
Figure 2. Simplified Schematic Shows the Two-Hybrid Sampling Process
In this picture, true-positive interactions (black edges) are sampled uniformly with total probability 1 − α, and false-positive interactions (red edges) are sampled stochastically with total probability α. Sampling is with replacement, and multiple edges between a pair of vertices represent multiple observations of the same interaction. The example shows n = 12 edges sampled in the entire network, with w = 11 unique edges and s = 10 edges that are singletons observed once. The total number of true-positive edges, k, and the number of false-positive edges within the sample, f, are hidden. The actual experimental data are more complicated, with individual values reported for n, w, and s for each protein used as a bait. The statistical method presented here provides estimates for k and f together with parameter estimates for α and the distribution Pr(k).
Figure 3. Number of Unique Interactions (w) and Singleton Interactions (s) Calculated as a Function of the Number of Preys Examined for the Experimental Data (Points)
Extrapolations based on half the data are provided for yeast, worm, and fly based on the TPL-MIXTURE model obtained for each.
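
The sampling process described for Figure 2 can be sketched as a small simulation. This is an illustrative toy, not the authors' software: the function names, the size of the false-positive pool, and the choice to draw false positives uniformly from that pool are assumptions made here.

```python
import random
from collections import Counter


def summarize(observations):
    """Return (n, w, s): total observations, unique edges, and singleton edges."""
    counts = Counter(observations)
    n = len(observations)
    w = len(counts)
    s = sum(1 for c in counts.values() if c == 1)
    return n, w, s


def simulate_screen(n, k, alpha, false_pool, seed=0):
    """Draw n observations with replacement.

    With probability 1 - alpha an observation is one of k true edges (uniform);
    with probability alpha it is a false-positive edge drawn from a pool of
    size false_pool (hypothetical parameter, uniform draw assumed).
    """
    rng = random.Random(seed)
    observations = []
    for _ in range(n):
        if rng.random() < alpha:
            observations.append(("false", rng.randrange(false_pool)))
        else:
            observations.append(("true", rng.randrange(k)))
    # Number of false-positive observations; hidden in real data.
    false_obs = sum(1 for kind, _ in observations if kind == "false")
    return summarize(observations) + (false_obs,)


# The Figure 2 example: ten edges seen once and one edge seen twice.
example = [("true", i) for i in range(10)] + [("true", 10), ("true", 10)]
print(summarize(example))

# A toy screen with hypothetical parameters.
print(simulate_screen(n=200, k=150, alpha=0.3, false_pool=10_000))
```

Only n, w, and s are observable in a real screen; the simulation also returns the false-positive count so that an estimator can be checked against the hidden value.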
