Comparative Study
Proc Natl Acad Sci U S A. 2001 Dec 4;98(25):14512-7.
doi: 10.1073/pnas.251526398.

A likelihood ratio test for evolutionary rate shifts and functional divergence among proteins

B Knudsen et al. Proc Natl Acad Sci U S A. 2001.

Abstract

Changes in protein function can lead to changes in the selection acting on specific residues. This can often be detected as evolutionary rate changes at the sites in question. A maximum-likelihood method for detecting evolutionary rate shifts at specific protein positions is presented. The method determines significance values of the rate differences to give a sound statistical foundation for the conclusions drawn from the analyses. A statistical test for detecting slowly evolving sites is also described. The methods are applied to a set of Myc proteins for the identification of both conserved sites and those with changing evolutionary rates. Those positions with conserved and changing rates are related to the structures and functions of their proteins. The results are compared with an earlier Bayesian method, thereby highlighting the advantages of the new likelihood ratio tests.

Figures

Figure 1
Assume that a gene duplication has resulted in two protein subfamilies. The first consists of sequences Seq1a and Seq1b, whereas the second includes sequences Seq2a, Seq2b, and Seq2c. (Left) H0, where the rates for a site may differ from one protein subfamily to the other. This rate divergence occurs at the root of the tree, where the duplication event occurred. (Center) The situation under H1. The evolutionary rate for a site remains the same throughout the entire tree. If H1 is rejected, rate shift behavior is present at the position under inspection. If H1 is retained, then one can test whether the rate for this site is equal to the average for both proteins. (Right) The testing of this hypothesis (H2). If H2 is rejected, the evolutionary rate for the site is significantly different from the average for all positions.
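The two-stage decision described in this caption can be sketched in a few lines. This is only an illustration, not the authors' implementation: the function name `classify_site`, its parameters, and the returned labels are hypothetical, and the p-values are assumed to come from likelihood ratio tests of H1 against H0 and of H2 against H1, computed elsewhere.

```python
def classify_site(p_rate_shift: float, p_vs_average: float, alpha: float = 0.05) -> str:
    """Classify one alignment site from two p-values (hypothetical helper).

    p_rate_shift: p-value for testing H1 (one rate over the whole tree)
                  against H0 (rates may differ between the two subfamilies).
    p_vs_average: p-value for testing H2 (site rate equals the average rate)
                  against H1.
    alpha:        significance level; 0.05 matches the 5% level in Figure 3.
    """
    # Stage 1: rejecting H1 indicates a rate shift between the subfamilies.
    if p_rate_shift < alpha:
        return "rate shift"
    # Stage 2: only reached when H1 is retained. Rejecting H2 means the
    # site evolves significantly slower or faster than the average site.
    if p_vs_average < alpha:
        return "differs from average"
    return "average rate"


print(classify_site(0.01, 0.50))
print(classify_site(0.20, 0.01))
print(classify_site(0.20, 0.30))
```

The second test is consulted only when the first one fails to reject, mirroring the order given in the caption. Deciding whether a site in the second category is slower or faster than average would additionally require the estimated rate, which this sketch does not model.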
Figure 2
The χ² distribution with one degree of freedom (smooth curve) compared with a simulation study of U (ragged curve). The simulations consisted of 1,000 samples generated under H1, with rates drawn from a gamma distribution. The calculations are based on the phylogeny and ML conditions used in the Myc protein example. The distribution of U approximately follows that of the χ²(1) statistic, especially in the upper part. Other follow-up simulations indicate that this distribution generally conforms more closely to the χ²(1) curve as the two subfamilies grow, in both their branch lengths and their numbers of sequences (Figs. 4–7, which are published as supporting information on the PNAS web site).
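If U is taken to be approximately χ²(1)-distributed, as this caption suggests, a significance value follows directly from the upper tail of that distribution. The sketch below is an assumption-laden illustration rather than the paper's code: it assumes U has the usual likelihood ratio form, twice the difference in maximized log-likelihood between the more general and the more restricted hypothesis, and the function names are hypothetical. It uses the standard identity P(χ²(1) > u) = erfc(√(u/2)).

```python
import math


def lrt_statistic(loglik_general: float, loglik_restricted: float) -> float:
    """Twice the log-likelihood difference (assumed form of U).

    loglik_general:    maximized log-likelihood under the more general
                       hypothesis (e.g. separate rates per subfamily).
    loglik_restricted: maximized log-likelihood under the nested,
                       restricted hypothesis (e.g. one shared rate).
    """
    # Numerical optimization can yield tiny negative differences; clamp to 0.
    return max(0.0, 2.0 * (loglik_general - loglik_restricted))


def chi2_1df_pvalue(u: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    if u < 0:
        raise ValueError("test statistic must be non-negative")
    return math.erfc(math.sqrt(u / 2.0))


# Illustrative, made-up log-likelihoods for a single site.
u = lrt_statistic(-100.0, -103.0)
print(u, chi2_1df_pvalue(u))
```

Because the caption notes that the fit to χ²(1) is best in the upper tail and improves with larger subfamilies, p-values obtained this way are approximate, particularly for small or short-branched subfamilies.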
Figure 3
Summary of results for the 38 Myc proteins, as represented by the c-Myc and N-Myc sequences for human (Homo sapiens), chicken (Gallus gallus), and frog (Xenopus laevis). The full alignment for all 38 Myc sequences is provided in Fig. 9, which is published as supporting information on the PNAS web site. Sites with both blue and red highlighting correspond to those with significant rate differences between the two subfamilies. In these cases, the blue and red distinguish the subfamily with the slower rate from the one with the faster rate, respectively. In turn, sites that are entirely blue or red highlight those with the same rate in the two subfamilies, but with significantly slower or faster rates than the average for all positions, respectively. In all cases, significance refers to the 5% level. Key structural and functional regions of the Myc proteins are labeled above and below the multiple sequence alignment (23, 28, 29). NLS, nuclear localization signal.

References

    1. Bork P, Koonin E V. Nat Genet. 1998;18:313–318.
    2. Gu X. Mol Biol Evol. 2001;18:453–464.
    3. Thornton J M. Science. 2001;292:2095–2097.
    4. Yang Z, Bielawski J P. Trends Ecol Evol. 2000;15:496–503.
    5. Suzuki Y, Gojobori T, Nei M. Bioinformatics. 2001;17:660–661.
