Mol Biol Evol. 2010 May;27(5):1152-61.
doi: 10.1093/molbev/msp324. Epub 2009 Dec 31.

A novel method to detect proteins evolving at correlated rates: identifying new functional relationships between coevolving proteins

Nathaniel L Clark et al. Mol Biol Evol. 2010 May.

Abstract

Interacting proteins evolve at correlated rates, possibly as the result of evolutionary pressures shared by functional groups and/or coevolution between interacting proteins. This evolutionary signature can be exploited to learn more about protein networks and to infer functional relationships between proteins on a genome-wide scale. Multiple methods have been introduced that detect correlated evolution using amino acid distances. One assumption made by these methods is that the neutral rate of nucleotide substitution is uniform over time; however, this is unlikely and such rate heterogeneity would adversely affect amino acid distance methods. We explored alternative methods that detect correlated rates using protein-coding nucleotide sequences in order to better estimate the rate of nonsynonymous substitution on each branch (dN) normalized by the underlying synonymous substitution rate (dS). Our novel likelihood method, which was robust to realistic simulation parameters, was tested on Drosophila nuclear pore proteins, which form a complex with well-documented physical interactions. The method revealed significantly correlated evolution between nuclear pore proteins, where members of a stable subcomplex showed stronger correlations compared with those proteins that interact transiently. Furthermore, our likelihood approach was better able to detect correlated evolution among closely related species than previous methods. Hence, these sequence-based methods are a complementary approach for detecting correlated evolution and could be applied genome-wide to provide candidate protein-protein interactions and functional group assignments using just coding sequences.

Figures

FIG. 1.
Phylogenetic trees under which the “simple” data set (A) and “realistic” data set (B) were simulated. The simple tree was constructed with the same node-to-node distance for all branches; however, the total tree length matches the realistic tree. Branch lengths on the realistic tree were based on estimates from real alignments. Note the difference in scale between the two trees.
FIG. 2.
Detecting correlated evolutionary rates using dN/dS ratios. (A) Example species tree over which evolutionary rates are estimated independently for each branch (labeled a–e). (B) dN/dS estimates for one protein are plotted versus another protein for each branch of the species phylogeny. A line of regression shows the relationship between them. (C) Similar logic is employed in the joint likelihood models except that the line of correlation (“corr”) is optimized within the evolutionary model rather than using dN/dS estimates.
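The point-estimate comparison in panel (B) can be illustrated with a minimal sketch. It assumes each protein already has one dN/dS estimate per branch of a shared species tree; the function names and the sample values are hypothetical and are not taken from the paper.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of per-branch dN/dS values."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two branches")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one protein has constant rates")
    return sxy / math.sqrt(sxx * syy)


def regression_line(x, y):
    """Least-squares slope and intercept for y regressed on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical dN/dS estimates on branches a-e for two proteins.
protein_1 = [0.05, 0.12, 0.08, 0.20, 0.15]
protein_2 = [0.07, 0.15, 0.09, 0.26, 0.18]

r = pearson_r(protein_1, protein_2)
slope, intercept = regression_line(protein_1, protein_2)
print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")
```

The joint likelihood models in panel (C) optimize the correlation inside the evolutionary model instead, which this sketch does not attempt.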
FIG. 3.
Performance on simulated data sets. Each plot compares the correlation coefficient under which the data were simulated (simulated corr. coef.) to the output of a test for correlated evolution. (A) dN/dS point estimate method on “simple” simulated data set. (B) dN/dS point estimate on “realistic” simulated data set. (C) Joint likelihood method as “proportional improvement” on simple data set. (D) Joint likelihood method on realistic data set. Although both methods performed well on the simple data set, the joint likelihood method was less affected by the more realistic simulation parameters.
FIG. 4.
Power analysis of dN/dS-based methods. Plots of the true-positive versus the false-positive rates (ROC curves) show the sensitivity and specificity of three different tests for correlated evolution on the “simple” (A) and “realistic” (B) simulated data sets. In (A), the dN/dS point estimates method and the proportional improvement statistic performed in a similar way so that their curves overlap.
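As a rough illustration of how ROC curves like these are built, the sketch below sweeps a threshold over test statistics from simulated pairs and records the false-positive and true-positive rates at each step. The scores and labels are hypothetical, and the function names are illustrative only.

```python
def roc_points(scores, labels):
    """Return (false-positive rate, true-positive rate) points.

    scores: test statistic per simulated pair (higher = more evidence of correlation)
    labels: 1 if the pair was simulated with correlated rates, 0 otherwise
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative pair")
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Pairs with tied scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical statistics: the first three pairs were simulated as correlated.
scores = [0.92, 0.75, 0.40, 0.55, 0.10, 0.05]
labels = [1, 1, 1, 0, 0, 0]
curve = roc_points(scores, labels)
print(curve, area_under_curve(curve))
```

Two tests whose curves overlap, as in panel (A), would produce nearly identical point lists under this procedure.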
FIG. 5.
The dependence of test statistics on gene size. (A) The LRT statistic is more dependent on the total length of the two genes being compared (x axis) than the “proportional improvement” statistic (B). In (B), the gene size is plotted against the absolute value of proportional improvement. These comparisons are between Drosophila nuclear pore proteins and control proteins that are not expected to interact.
FIG. 6.
Boxplots show the distribution of test statistics for Nup–Nup comparisons (Nups) versus Nup–control comparisons (control). For (A), the test statistic is the correlation coefficient “r” on dN/dS point estimates. For (B), the statistic is “proportional improvement” inferred from the joint likelihood models. The bold line represents the median value and box limits are the upper and lower quartiles. The whiskers extend to the most extreme data point that lies no more than 1.5 times the interquartile range from the box. Any data points more extreme are plotted as circles.
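The whisker rule described in this caption can be sketched as follows. The quartile convention is an assumption (Python's `statistics.quantiles` with `method="inclusive"`); plotting software may use a slightly different definition, and the sample values are hypothetical.

```python
import statistics


def boxplot_stats(values):
    """Median, quartiles, whisker ends, and outliers under the 1.5 x IQR rule."""
    data = sorted(values)
    # Assumed quartile convention; other tools may interpolate differently.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],    # most extreme point within the lower fence
        "whisker_high": inside[-1],  # most extreme point within the upper fence
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical test statistics with one extreme value.
stats = boxplot_stats([1, 2, 3, 4, 5, 100])
print(stats)
```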
FIG. 7.
Correlations between specific nuclear pore protein pairs. Each node is a Nup protein identified by its Nup number. Each edge represents the rate correlation between those two proteins with its width reflecting the empirical P value (either 95% or 90%). Nups 75, 96, 107, and 133 participate in the stable Nup 107 subcomplex, whereas Nups 98 and 153 are transient interactors.
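This page does not state how the empirical P values were obtained. One common approach, shown here only as an assumption, ranks an observed statistic against a null distribution such as statistics from control comparisons. The function name and the numbers are hypothetical.

```python
def empirical_p(observed, null_values):
    """Fraction of null statistics at least as large as the observed one.

    Assumption: larger statistics indicate stronger rate correlation, and
    null_values come from comparisons not expected to be correlated.
    """
    if not null_values:
        raise ValueError("null distribution is empty")
    as_extreme = sum(1 for v in null_values if v >= observed)
    return as_extreme / len(null_values)


# Hypothetical statistic for one Nup pair and a small control distribution.
null = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.31, 0.40, 0.45, 0.80]
p = empirical_p(0.50, null)
print(p)  # one of the ten control values is at least 0.50
```

Under this reading, an edge drawn at the 95% level would correspond to an observed statistic exceeding 95% of the null values.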
