Proc Natl Acad Sci U S A. 2008 Jan 22;105(3):934-9.
doi: 10.1073/pnas.0709671105. Epub 2008 Jan 16.

High-confidence prediction of global interactomes based on genome-wide coevolutionary networks

David Juan et al. Proc Natl Acad Sci U S A.

Abstract

Interacting or functionally related protein families tend to have similar phylogenetic trees. Based on this observation, techniques have been developed to predict interaction partners. The observed degree of similarity between the phylogenetic trees of two proteins is the result of many different factors besides the actual interaction or functional relationship between them. Such factors influence the performance of interaction predictions. One aspect that can influence this similarity is related to the fact that a given protein interacts with many others, and hence it must adapt to all of them. Accordingly, the interaction or coadaptation signal within its tree is a composite of the influence of all of the interactors. Here, we introduce a new estimator of coevolution to overcome this and other problems. Instead of relying on the individual value of tree similarity between two proteins, we use the whole network of similarities between all of the pairs of proteins within a genome to reassess the similarity of that pair, thereby taking into account its coevolutionary context. We show that this approach offers a substantial improvement in interaction prediction performance, providing a degree of accuracy/coverage comparable with, or in some cases better than, that of experimental techniques. Moreover, important information on the structure, function, and evolution of macromolecular complexes can be inferred with this methodology.
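The raw tree similarity that the abstract starts from can be illustrated with a minimal sketch. It assumes, as in mirrortree-style approaches, that each protein family is summarized by a matrix of pairwise evolutionary distances over the same ordered set of species, and that similarity is the Pearson correlation between the two matrices. The function name `mirrortree_similarity` and the toy distances are hypothetical and are not taken from the paper.

```python
import numpy as np

def mirrortree_similarity(dist_a, dist_b):
    """Pearson correlation between the upper triangles of two distance matrices.

    Assumption: both matrices are square, symmetric, and indexed by the
    same species in the same order.
    """
    a = np.asarray(dist_a, dtype=float)
    b = np.asarray(dist_b, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1] or a.shape != b.shape:
        raise ValueError("distance matrices must be square and of equal shape")
    # Use each species pair once and skip the zero diagonal.
    upper = np.triu_indices(a.shape[0], k=1)
    return float(np.corrcoef(a[upper], b[upper])[0, 1])

# Toy distances for four species (illustrative values only).
family_a = np.array([
    [0.0, 0.1, 0.4, 0.5],
    [0.1, 0.0, 0.3, 0.6],
    [0.4, 0.3, 0.0, 0.2],
    [0.5, 0.6, 0.2, 0.0],
])
family_b = 2.0 * family_a  # same tree shape, uniformly faster evolution

print(mirrortree_similarity(family_a, family_b))
```

Because the correlation is insensitive to a uniform rescaling of the distances, two families whose trees differ only in overall evolutionary rate still score as highly similar. This is one reason a raw score can reflect factors other than an actual interaction.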

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Confirmation of the predictions for different steps of the method and for different interaction datasets. (A) Mirrortree. (B) Profile–profile correlation. (C) Partial correlation, 1st level. (D) Partial correlation, 10th level. The x axes represent the number of top predictions (pairs with highest scores), and the y axes represent the confirmation according to the different datasets of protein interactions and relationships.
Fig. 2.
Examples of predicted clusters of related proteins. (A) Proteins related to the NADH oxidoreductase complex. (B) Flagellar assembly proteins. The link colors represent levels of coevolutionary specificity: 1st level (red), 5th level (blue), and 10th level (black). The colors of the nodes represent those belonging to the same complex/pathway. (Gray is used to represent unknown/hypothetical proteins, and black is used to represent a false positive.) For the NADH oxidoreductase example, the colors of the surrounding circles represent different structural and functional modules of the complex. More examples are shown in SI Fig. 8.
Fig. 3.
Coevolutionary specificity. Partial correlation values at different levels of specificity for two protein pairs: an unrelated pair (NudE–PepA, red line) and a positive case involving two proteins of the NADH oxidoreductase complex (NuoF–NuoH, green line).
Fig. 4.
Schema of the ContextMirror method. An initial coevolutionary network containing raw tree similarities for all protein pairs is calculated (step 1). The similarity between coevolutionary patterns (vectors containing all of the tree similarities) is calculated for all pairs of proteins (step 2). The specificity of the coevolution between two proteins is evaluated by calculating their partial correlation given all of the others (step 3). The list of partial correlations for each pair of proteins is sorted (step 4). Levels of partial correlation specificity for all of the protein pairs are obtained and ranked (step 5). In all of the steps, only pair relationships with a P value of <10⁻⁵ were considered.
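Steps 2 to 4 of the schema can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: profiles are taken to be the rows of a symmetric matrix of raw tree similarities (diagonal included), profile similarity is the Pearson correlation, and the partial correlation is the standard first-order formula given one third protein at a time. The P value filter is omitted, and how the sorted list maps onto the specificity levels of step 5 is not specified here. All function names are hypothetical.

```python
import numpy as np

def profile_correlation(similarity):
    """Step 2: correlate the coevolutionary profiles (rows) of all protein pairs."""
    s = np.asarray(similarity, dtype=float)
    return np.corrcoef(s)

def partial_correlation(r_ij, r_ik, r_jk):
    """Step 3: first-order partial correlation of i and j given a third protein k."""
    denom = np.sqrt((1.0 - r_ik ** 2) * (1.0 - r_jk ** 2))
    if denom == 0.0:
        return float("nan")  # undefined when k is perfectly correlated with i or j
    return float((r_ij - r_ik * r_jk) / denom)

def sorted_partials(profile_corr, i, j):
    """Step 4: partial correlations of (i, j) given every other protein, ascending."""
    n = profile_corr.shape[0]
    values = [
        partial_correlation(profile_corr[i, j], profile_corr[i, k], profile_corr[j, k])
        for k in range(n)
        if k not in (i, j)
    ]
    return sorted(values)

# Toy symmetric similarity matrix for six proteins (random, illustrative only).
rng = np.random.default_rng(0)
m = rng.random((6, 6))
raw_similarity = (m + m.T) / 2.0

profiles = profile_correlation(raw_similarity)
print(sorted_partials(profiles, 0, 1))
```

A pair whose profile correlation stays high after conditioning on each third protein shows coevolution that no single other protein explains, which is the sense in which the schema speaks of specificity.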
