Proc Natl Acad Sci U S A. 2008 May 13;105(19):6959-64.
doi: 10.1073/pnas.0708078105. Epub 2008 May 12.

Estimating the size of the human interactome

Michael P H Stumpf et al. Proc Natl Acad Sci U S A. 2008.

Abstract

After the completion of the human and other genome projects, it emerged that the number of genes in organisms as diverse as fruit flies, nematodes, and humans does not reflect our perception of their relative complexity. Here, we provide reliable evidence that the size of protein interaction networks in different organisms appears to correlate much better with their apparent biological complexity. We develop a stable and powerful, yet simple, statistical procedure to estimate the size of the whole network from subnet data. This approach is then applied to a range of eukaryotic organisms for which extensive protein interaction data have been collected, and we estimate the number of interactions in humans to be approximately 650,000. We find that the human interaction network is one order of magnitude bigger than the Drosophila melanogaster interactome and approximately 3 times bigger than that of Caenorhabditis elegans.
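
The idea of estimating whole-network size from subnet data can be illustrated with a minimal sketch. It assumes that the observed subnet is the subgraph induced by a uniform random subset of nodes, so that each true edge is retained with probability n(n-1)/(N(N-1)); rescaling the observed edge count by the inverse of that probability gives an unbiased estimate under this assumption. The function names, the toy random graph, and the sampling fractions used below are illustrative choices, and the sketch is not claimed to reproduce the paper's estimator or its bootstrap procedure.

```python
import random
from itertools import combinations


def induced_edge_count(edges, sampled_nodes):
    """Count edges whose two endpoints both lie in the sampled node set."""
    sampled = set(sampled_nodes)
    return sum(1 for u, v in edges if u in sampled and v in sampled)


def estimate_total_edges(sub_edges, n_sampled, n_total):
    """Rescale a subnet edge count to the whole network.

    Assumption: the n_sampled nodes are a uniform random subset of the
    n_total nodes, so each edge is kept with probability
    n_sampled * (n_sampled - 1) / (n_total * (n_total - 1)).
    """
    if n_sampled < 2 or n_sampled > n_total:
        raise ValueError("need 2 <= n_sampled <= n_total")
    return sub_edges * n_total * (n_total - 1) / (n_sampled * (n_sampled - 1))


def sampling_experiment(edges, nodes, fraction, trials, seed=0):
    """Repeatedly sample a fraction of nodes and estimate the total edge count."""
    rng = random.Random(seed)
    n_sampled = max(2, round(fraction * len(nodes)))
    estimates = []
    for _ in range(trials):
        sample = rng.sample(nodes, n_sampled)  # uniform, without replacement
        sub_edges = induced_edge_count(edges, sample)
        estimates.append(estimate_total_edges(sub_edges, n_sampled, len(nodes)))
    return estimates


# Toy "true" network (hypothetical data): 200 nodes, each pair linked with
# probability 0.05. This stands in for a gold-standard interaction dataset.
toy_rng = random.Random(42)
toy_nodes = list(range(200))
toy_edges = [pair for pair in combinations(toy_nodes, 2) if toy_rng.random() < 0.05]

for fraction in (0.2, 0.4, 0.6, 0.8):
    estimates = sampling_experiment(toy_edges, toy_nodes, fraction, trials=200)
    mean_estimate = sum(estimates) / len(estimates)
    print(f"fraction={fraction:.1f}  true={len(toy_edges)}  mean estimate={mean_estimate:.1f}")
```

In this sketch the spread of the estimates shrinks as the sampled fraction grows, which is the qualitative behaviour one would expect when a larger share of the network is observed. Real interaction screens also suffer from false positives and false negatives, which this toy example ignores.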

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Performance of the estimator, Eq. 8, for the yeast network. Here, the DIP dataset was taken as a gold-standard “true” interaction network. (A) True network size (red bars) and histograms of predicted sizes for subnets that were created by sampling 20%, 40%, 60%, and 80% of nodes with equal probability. (B) Fraction of estimates obtained from 1,000 independent subnets (covering 20%, 40%, 60%, and 80% of the nodes in the true network) where the empirical 95% bootstrap confidence interval (based on 1,000 replicates) contains the true value (green).
Fig. 2.
Estimated interactome sizes for humans and three other eukaryotic species for which high-throughput interaction data are available. The letters denote the approximate position of the point estimate, 𝒩, and the horizontal bars indicate the range of the approximate 95% CIs (obtained from 10,000 bootstrap replicates; see SI Text for details). (The yeast and human datasets are largely independent but there is large overlap between the datasets for D. melanogaster and especially C. elegans.)

Comment in

  • A truer measure of our ignorance.
    Amaral LA. Proc Natl Acad Sci U S A. 2008 May 13;105(19):6795-6. doi: 10.1073/pnas.0802459105. Epub 2008 May 12. PMID: 18474865. Free PMC article. No abstract available.

Publication types