BMC Evol Biol. 2007 Nov 1;7:206.
doi: 10.1186/1471-2148-7-206.

Evaluation of the models handling heterotachy in phylogenetic inference

Yan Zhou et al. BMC Evol Biol. 2007.

Abstract

Background: The evolutionary rate at a given homologous position varies across time. When sufficiently pronounced, this phenomenon, called heterotachy, may produce artefactual phylogenetic reconstructions under the commonly used models of sequence evolution. These observations have motivated the development of models that explicitly account for heterotachy, with research proceeding along two main axes: 1) the covarion approach, where sites switch between variable and invariable states; and 2) the mixture of branch lengths (MBL) approach, where alignment patterns are assumed to arise from one of several sets of branch lengths, under a given phylogeny.

Results: Here, we report the first statistical comparisons contrasting the performance of covarion and MBL modeling strategies. Using simulations under heterotachous conditions, we explore the properties of three model comparison methods: the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and cross validation. Although more time-consuming, cross validation appears more reliable than AIC and BIC because it directly measures the predictive power of a model on 'future' data. We also analyze three large datasets (nuclear proteins of animals, mitochondrial proteins of mammals, and plastid proteins of plants), and find the optimal number of components of the MBL model to be two for all datasets, indicating that this model is preferred over the standard homogeneous model. However, the covarion model is always favored over the optimal MBL model.
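As a reminder of how the two information criteria trade fit against complexity, the following is a minimal sketch using their standard definitions. The function names are illustrative, the log-likelihoods and parameter counts in the usage lines are made-up placeholders rather than values from the study, and taking the number of alignment sites as the sample size in BIC is an assumption. Cross validation is not sketched here because it requires refitting each model on training subsets of the alignment.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: k ln(n) - 2 ln L (lower is better).

    Assumption: the sample size n is the number of alignment sites.
    """
    return n_params * math.log(n_sites) - 2 * log_likelihood

# Hypothetical fits of two models to the same alignment (placeholder numbers).
n_sites = 1000
fits = {
    "simpler model": {"lnL": -5000.0, "k": 10},
    "richer model": {"lnL": -4990.0, "k": 40},
}

for name, fit in fits.items():
    print(name,
          "AIC =", round(aic(fit["lnL"], fit["k"]), 2),
          "BIC =", round(bic(fit["lnL"], fit["k"], n_sites), 2))
```

Because the BIC penalty grows with ln(n), it penalizes each extra parameter more heavily than AIC once the alignment has more than about seven sites.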

Conclusion: We demonstrated, using three large datasets, that the covarion model is more efficient at handling heterotachy than the MBL model. This is probably because the MBL model requires a substantial increase in the number of parameters, compared with only two additional parameters for the covarion approach. Further improvements of both the mixture and the covarion approaches might be obtained by modeling heterogeneous behavior both along time and across sites.
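The difference in parameter cost can be made concrete with a rough count. This sketch assumes a fully resolved unrooted tree (2T - 3 branches for T taxa) and that each additional MBL component contributes one full set of branch lengths plus one free mixture weight; the abstract does not spell out this accounting, so treat it as an illustration. The function names are hypothetical. The figure of two extra covarion parameters is taken from the text above.

```python
def n_branches_unrooted(n_taxa):
    """Number of branches in a fully resolved unrooted tree."""
    return 2 * n_taxa - 3

def extra_params_mbl(n_taxa, n_components):
    """Extra free parameters of a K-component MBL model relative to the
    homogeneous (single branch-length set) model.

    Assumption: each added component brings a full set of branch lengths
    and one free mixture weight (the weights sum to 1).
    """
    added = n_components - 1
    return added * n_branches_unrooted(n_taxa) + added

EXTRA_PARAMS_COVARION = 2  # as stated in the abstract

# 17 species, as in the mitochondrial mammal alignment of Figure 2.
print(extra_params_mbl(17, 2), "extra parameters for two-component MBL")
print(EXTRA_PARAMS_COVARION, "extra parameters for covarion")
```

Under these assumptions, a two-component MBL model on 17 species adds 32 parameters, against 2 for the covarion model.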

Figures

Figure 1
Topology used for computer simulations. The tree in Newick format is: ((((A:0.375, B:0.3):0.25, C:1):0.08, D:0.32):0.8,((E:0.42, F:0.31):0.24,(G:0.27,(H:0.2,(I:0.5, J:0.5):0.25):0.12):0.25):0.26). Scale bar indicates the expected number of changes per site.
Figure 2
Branch lengths for the two partitions in the case of the mitochondrial alignment of mammals (3591 sites, 17 species). The shape parameter of the Γ distribution was estimated to be 0.4. The weights are 0.40 for component I (B) and 0.60 for component II (A).
Figure 3
Box-and-whisker plot of the average posterior probabilities of component I for the two-component MBL model on the mitochondrial mammal dataset. A Kruskal-Wallis non-parametric test shows that the means of the posterior probabilities differ significantly among genes (p < 0.0001).
Figure 4
Comparison of branch lengths from the two partitions for the nuclear (A), plastid (B) and mitochondrial (C) alignments. R = 0.63, 0.63 and 0.57, respectively.
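The Newick string in the Figure 1 caption can be read programmatically. The following is a minimal pure-Python sketch of a recursive-descent parser, not the software used in the study; the function names and the dictionary-based node layout are illustrative choices. It handles only the subset of Newick needed here (names, branch lengths, nested parentheses) and tolerates the missing terminating semicolon.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested dicts with the keys
    'name', 'length' and 'children'."""
    s = "".join(text.split()).rstrip(";")  # drop whitespace and trailing ';'
    pos = 0

    def node():
        nonlocal pos
        children = []
        if pos < len(s) and s[pos] == "(":
            pos += 1  # consume "("
            children.append(node())
            while pos < len(s) and s[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            if pos >= len(s) or s[pos] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1  # consume ")"
        start = pos
        while pos < len(s) and s[pos] not in ",():":
            pos += 1
        name = s[start:pos]
        length = 0.0  # nodes without an explicit length (the root) get 0
        if pos < len(s) and s[pos] == ":":
            pos += 1
            start = pos
            while pos < len(s) and s[pos] not in ",()":
                pos += 1
            length = float(s[start:pos])
        return {"name": name, "length": length, "children": children}

    root = node()
    if pos != len(s):
        raise ValueError("unexpected trailing characters")
    return root

def leaf_names(tree):
    """Leaf names in left-to-right order."""
    if not tree["children"]:
        return [tree["name"]]
    return [n for child in tree["children"] for n in leaf_names(child)]

def total_length(tree):
    """Sum of all branch lengths in the tree."""
    return tree["length"] + sum(total_length(c) for c in tree["children"])

FIGURE_1_TREE = ("((((A:0.375, B:0.3):0.25, C:1):0.08, D:0.32):0.8,"
                 "((E:0.42, F:0.31):0.24,(G:0.27,(H:0.2,(I:0.5, J:0.5)"
                 ":0.25):0.12):0.25):0.26)")

tree = parse_newick(FIGURE_1_TREE)
print(leaf_names(tree))
print(round(total_length(tree), 3))
```

Applied to the Figure 1 string, this yields the ten taxa A to J and a total tree length of about 6.445 expected changes per site.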

