Review
Biochim Biophys Acta. 2014 Oct;1842(10):1971-1980.
doi: 10.1016/j.bbadis.2014.05.028. Epub 2014 Jun 2.

Protein-protein interactions and genetic diseases: The interactome

Kasper Lage. Biochim Biophys Acta. 2014 Oct.

Abstract

Protein-protein interactions mediate essentially all biological processes. Despite the quality of these data being widely questioned a decade ago, the reproducibility of large-scale protein interaction data is now much improved and there is little question that the latest screens are of high quality. Moreover, common data standards and coordinated curation practices between the databases that collect the interactions have made these valuable data available to a wide group of researchers. Here, I will review how protein-protein interactions are measured, collected and quality controlled. I discuss how the architecture of molecular protein networks has informed disease biology, and how these data are now being computationally integrated with the newest genomic technologies, in particular genome-wide association studies and exome-sequencing projects, to improve our understanding of molecular processes perturbed by genetics in human diseases. This article is part of a Special Issue entitled: From Genome to Function.

Keywords: Complex human disease; Genetics and proteomics; Protein–protein interaction.

Figures

Figure 1. Properties of protein-protein interaction networks
The properties of random networks (A), scale-free networks (B), and hierarchical networks (C). Protein networks are also called scale free because it is not possible to define a meaningful average node in these networks. In protein interaction networks, the probability P(k) of observing a node with degree k follows a power law (Bb). In these networks the clustering coefficient C(k) does not change as a function of the node's degree (Bc), meaning that nodes with few interactions and nodes with many interactions alike tend to participate in highly connected topological modules in the network. These properties differ from those of random networks (Aa, Ab, Ac), where edges are randomly distributed across nodes, and of hierarchical networks (Ca, Cb, Cc), where clusters are united in an iterative manner. Figure is reproduced from with permission.
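The two quantities named in this caption, the degree distribution P(k) and the degree-dependent clustering coefficient C(k), can be computed directly from an edge list. The following is a minimal sketch using only the Python standard library. The toy network, the node labels, and all function names are assumptions for illustration and are not taken from the article.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def degree_distribution(adj):
    """Return P(k): the fraction of nodes that have degree k."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    n = len(adj)
    return {k: c / n for k, c in sorted(counts.items())}


def local_clustering(adj, node):
    """Fraction of neighbor pairs of `node` that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def clustering_by_degree(adj):
    """Return C(k): the mean local clustering coefficient of nodes with degree k."""
    by_k = defaultdict(list)
    for node, nbrs in adj.items():
        by_k[len(nbrs)].append(local_clustering(adj, node))
    return {k: sum(vals) / len(vals) for k, vals in sorted(by_k.items())}


# Hypothetical toy network: a triangle A-B-C with a tail C-D-E.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
adj = build_adjacency(toy_edges)
p_k = degree_distribution(adj)
c_k = clustering_by_degree(adj)
```

On a real interaction network, plotting `p_k` on log-log axes would show whether the distribution is roughly linear, as expected for a power law, and plotting `c_k` against k would show whether clustering depends on degree.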
Figure 2. The modular organization of protein-protein interaction networks
Protein interaction networks have topological modules in which proteins are more connected to each other than to the rest of the network (a). These represent genes in the same pathways, molecular machines, or rigid architectural structures, i.e., functional modules (b). This has implications for human disease biology, as genes involved in the same disease tend to fall into the same clusters or functional modules. Modules enriched for genes from a particular disease are termed disease modules (c). Figure is reproduced from with permission.
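The defining property of a topological module, more connections inside the group than to the rest of the network, can be checked by counting edges. The sketch below is one simple way to do that. The function name, the protein labels, and the toy edge list are hypothetical, and real module-detection methods are considerably more involved.

```python
def module_edge_counts(edges, module):
    """Count edges inside a candidate module and edges crossing its boundary."""
    members = set(module)
    internal = 0
    boundary = 0
    for a, b in edges:
        inside = (a in members) + (b in members)
        if inside == 2:
            internal += 1  # both endpoints are in the module
        elif inside == 1:
            boundary += 1  # edge links the module to the rest of the network
    return internal, boundary


# Hypothetical toy network: two triangles joined by a single edge (P3-P4).
toy_edges = [
    ("P1", "P2"), ("P1", "P3"), ("P2", "P3"),
    ("P3", "P4"),
    ("P4", "P5"), ("P4", "P6"), ("P5", "P6"),
]
internal, boundary = module_edge_counts(toy_edges, {"P1", "P2", "P3"})
```

Here the candidate module has three internal edges and one boundary edge, so its members are more connected to each other than to the rest of the toy network.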
Figure 3. Using protein complexes to prioritize genes in linkage intervals
First, a virtual pull-down of each candidate gene is executed by querying a protein interaction network for interactors of the candidate. The resulting complex is termed the candidate complex. Second, proteins whose corresponding genes are known to be involved in a disease are identified in the candidate complex, and the diseases represented in the complex are compared with the disease related to the linkage interval using a computational phenotype similarity score. In this case, proteins that are involved in different disorders comparable to Leber congenital amaurosis are colored according to their clinical overlap with this disease. The last step involves scoring and ranking the candidates with the Bayesian predictor. Each candidate is scored based on the phenotypes associated with the proteins in the candidate complex, and all candidates in the interval are ranked based on this score. Figure is reproduced from with permission.
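The three steps in this caption (virtual pull-down, phenotype comparison, ranking) can be sketched as below. This is an illustration under stated assumptions, not the published method: the caption's Bayesian predictor is replaced by a plain sum of phenotype similarity scores as a stand-in, and the gene names, similarity values, and function names are all hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def virtual_pulldown(adj, candidate):
    """Step 1: the candidate complex is the candidate plus its direct interactors."""
    return {candidate} | adj.get(candidate, set())


def score_candidate(adj, candidate, phenotype_similarity):
    """Steps 2-3 (stand-in): sum the similarity scores of disease proteins in the complex.

    `phenotype_similarity` maps a protein to the similarity between its known
    disease and the disease of interest. This simple sum is NOT the Bayesian
    predictor mentioned in the caption; it only illustrates the data flow.
    """
    members = virtual_pulldown(adj, candidate) - {candidate}
    return sum(phenotype_similarity.get(p, 0.0) for p in members)


def rank_candidates(adj, candidates, phenotype_similarity):
    """Rank all candidates in the linkage interval by descending score."""
    scored = [(c, score_candidate(adj, c, phenotype_similarity)) for c in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical interactions and similarity values, for illustration only.
interactions = [("CAND1", "X1"), ("CAND1", "X2"), ("CAND2", "X3"), ("CAND3", "X1")]
similarity = {"X1": 0.9, "X2": 0.6, "X3": 0.1}
adj = build_adjacency(interactions)
ranking = rank_candidates(adj, ["CAND1", "CAND2", "CAND3"], similarity)
```

In this toy example the candidate whose interactors are linked to the most phenotypically similar diseases ranks first, which is the intuition behind prioritizing genes by their candidate complexes.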
Figure 4. Augmenting and interpreting GWAS data using protein-protein interaction networks
The systematic integration of GWAS loci, individual type 2 diabetes-related genes, and protein interaction networks revealed a CREBBP network (a) and an adipocytokine network (b). Figure is reproduced from with permission.
Figure 5. De novo exome mutations reveal a significant chromatin remodeling network in autism spectrum disorders
Genes that harbor de novo mutations in patients with sporadic autism spectrum disorders interact significantly at the level of proteins, revealing a chromatin remodeling network (the subnetwork including SMARCC2). Proteins are colored based on the significance of their interactions with other proteins in which de novo mutations were found, as determined by the DAPPLE algorithm using protein interactions from InWeb. Figure is reproduced from with permission.
