BMC Evol Biol. 2015 May 14:15:86.
doi: 10.1186/s12862-015-0364-7.

Utility of characters evolving at diverse rates of evolution to resolve quartet trees with unequal branch lengths: analytical predictions of long-branch effects

Zhuo Su et al. BMC Evol Biol. 2015.

Abstract

Background: The detection and avoidance of "long-branch effects" in phylogenetic inference represent a longstanding challenge for molecular phylogenetic investigations. A consequence of parallelism and convergence, long-branch effects arise in phylogenetic inference when there is unequal molecular divergence among lineages. They can positively mislead inference based on parsimony in particular, but also inference based on maximum likelihood and Bayesian approaches. Long-branch effects have been exhaustively examined by simulation studies that have compared the performance of different inference methods in specific model trees and branch length spaces.

Results: In this paper, by generalizing the phylogenetic signal and noise analysis to quartets with uneven subtending branches, we quantify the utility of molecular characters for resolution of quartet phylogenies via parsimony. Our quantification incorporates contributions toward the correct tree from either signal or homoplasy (i.e. "the right result for either the right reason or the wrong reason"). We also characterize a highly conservative lower bound of utility that incorporates contributions to the correct tree only when they correspond to true, unobscured parsimony-informative sites (i.e. "the right result for the right reason"). We apply the generalized signal and noise analysis to classic quartet phylogenies in which long-branch effects can arise due to unequal rates of evolution or an asymmetrical topology. Application of the analysis leads to identification of branch length conditions in which inference will be inconsistent and reveals insights regarding how to improve sampling of molecular loci and taxa in order to correctly resolve phylogenies in which long-branch effects are hypothesized to exist.

Conclusions: The generalized signal and noise analysis provides an analytical prediction of the utility of characters evolving at diverse rates to resolve quartet phylogenies with unequal branch lengths. The analysis can be applied to identify characters evolving at appropriate rates to resolve phylogenies in which long-branch effects are hypothesized to occur.

Figures

Figure 1
An unrooted four-taxon tree in an ultrametric form, with an internode of length (in time) t_0 and four subtending branches of lengths (in time) T_1, T_2, T_3, and T_4. The ancestral states of a molecular character at the two ends of the internode are denoted as M and N. The character states at the terminal tips of the four subtending branches are denoted as C_1, C_2, C_3, and C_4. The average substitution rates of the character over the internode and the four subtending branches are denoted as λ_0, λ_1, λ_2, λ_3, and λ_4, respectively. The expected numbers of character state changes in the internode and the four subtending branches are thus given by λ_0 t_0, λ_1 T_1, λ_2 T_2, λ_3 T_3, and λ_4 T_4, respectively.
Figure 2
Two classic quartet branch length conditions in which long-branch effects can arise. A) Four-taxon tree modeled by Huelsenbeck and Hillis [22]. The internode and the two subtending branches labeled a are constrained to have the same length (the “three-branch length”), as are the two subtending branches labeled b (the “two-branch length”); p_a and p_b represent the three-branch length and the two-branch length (evaluated via Equation 9), respectively. B) Alternative four-taxon tree modeled by Siddall [26]. The internode and the two subtending branches labeled a’ are constrained to be equal in length (the “alternative three-branch length”), as are the two subtending branches labeled b’ (the “alternative two-branch length”), with p_a and p_b representing the alternative three-branch length and the alternative two-branch length, respectively. C) Branch length space of the model tree investigated by Huelsenbeck and Hillis [22], with the three-branch length p_a on the horizontal axis and the two-branch length p_b on the vertical axis. These axes apply to Figures 3A, C, and E. The upper-left portion of this branch length space corresponds to the Felsenstein zone. D) Branch length space of the alternative model tree investigated by Siddall [26], with the alternative three-branch length p_a on the horizontal axis and the alternative two-branch length p_b on the vertical axis. These axes apply to Figures 3B, D, and F. The upper-left portion of this branch length space corresponds to the Farris zone, as termed by Siddall [26].
Figure 3
Contour map of y ∕ Max(x_1, x_2) for a nucleotide character under the JC model over the branch length space of A) the Huelsenbeck and Hillis [22] model tree and B) the Siddall [26] model tree, with contour lines of y ∕ Max(x_1, x_2) = 1/10, 1/6, 1/4, 1/2, 1 (thick dashed), 2, 4, 6, and 10 shown if present within the respective branch length space. Contour map of Πy for a nucleotide character under the JC model over the branch length space of C) the Huelsenbeck and Hillis [22] model tree and D) the Siddall [26] model tree, with contour lines of Πy = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 1.0 (thick dashed) shown if present. Contour map of Π ∕ Max(x_1, x_2) for a nucleotide character under the JC model over the branch length space of E) the Huelsenbeck and Hillis [22] model tree and F) the Siddall [26] model tree, with contour lines of Π ∕ Max(x_1, x_2) = 1/10, 1/6, 1/4, 1/2, 1 (thick dashed), 2, 4, 6, and 10 shown if present.
Figure 4
The predicted utility Π − Max(x_1, x_2) versus substitution rate λ, based on the JC model, is plotted for l = 1.5 (solid line), l = 2 (dotted line), l = 2.5 (dashed line), and l = 3 (dot-dashed line), for the four-taxon tree depicted in Figure 1 in which λ_0 = λ_1 = λ_2 = λ_3 = λ_4 = λ, t_0 = 0.1, T_1 = T_3 = 0.4, and T_2 = T_4 = 0.4l.
Figure 5
The predicted utility Π − Max(x_1, x_2) versus substitution rate λ, based on the JC model, is plotted for l = 1.5 (solid line), l = 2 (dotted line), l = 2.5 (dashed line), and l = 3 (dot-dashed line), for the four-taxon tree depicted in Figure 1 in which t_0 = 0.1, T_1 = T_2 = T_3 = T_4 = l t_0, λ_1 = λ_3 = 1, and λ_0 = λ_2 = λ_4 = λ.
Figure 6
The predicted utility Π − Max(x_1, x_2) versus substitution rate λ is plotted based on the JC [63] model (solid line), the K2P model (dotted line), the HKY model (dashed line), and the GTR model (dot-dashed line), for the four-taxon tree depicted in Figure 1 in which t_0 = 0.1, T_1 = T_2 = T_3 = T_4 = 0.25, λ_1 = λ_3 = 1, and λ_0 = λ_2 = λ_4 = λ. The model parameter values are presented in Table 1.
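
As a rough illustration of the notation in Figure 1 and the branch-length settings in Figure 4, the following Python sketch computes, under the JC model, the exact probability that a site is parsimony-informative for each of the three possible quartet topologies. It should not be read as the paper's utility Π or its y, x_1, and x_2 quantities, whose equations are not reproduced on this page. The function names, the assumption that taxa 1 and 2 join at M while taxa 3 and 4 join at N, the uniform distribution of the state at M, and the sample rates are all illustrative assumptions.

```python
import math
from itertools import product

STATES = range(4)  # four nucleotide states, treated symmetrically under JC


def jc_prob(i, j, d):
    """JC probability of observing state j after starting in state i,
    given d expected character state changes along the branch."""
    e = math.exp(-4.0 * d / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e


def pattern_support(d0, d1, d2, d3, d4):
    """Probability that a site shows a parsimony-informative pattern
    supporting each quartet topology.

    d0 is the expected number of changes on the internode (lambda_0 * t_0);
    d1..d4 are those on the subtending branches (lambda_i * T_i).
    ASSUMPTION: the true tree is ((1,2),(3,4)), i.e. taxa 1 and 2 attach
    at M and taxa 3 and 4 attach at N.
    """
    support = {"12|34": 0.0, "13|24": 0.0, "14|23": 0.0}
    # Sum over the unobserved states M, N and the four tip states.
    for m, n, c1, c2, c3, c4 in product(STATES, repeat=6):
        p = (0.25 * jc_prob(m, n, d0)
             * jc_prob(m, c1, d1) * jc_prob(m, c2, d2)
             * jc_prob(n, c3, d3) * jc_prob(n, c4, d4))
        if c1 == c2 and c3 == c4 and c1 != c3:
            support["12|34"] += p   # pattern xxyy: supports the true tree
        elif c1 == c3 and c2 == c4 and c1 != c2:
            support["13|24"] += p   # pattern xyxy: groups taxa 2 and 4
        elif c1 == c4 and c2 == c3 and c1 != c2:
            support["14|23"] += p   # pattern xyyx
    return support


def figure4_branch_lengths(rate, l, t0=0.1, short=0.4):
    """Expected changes per branch for the Figure 4 setting: one shared
    rate, t_0 = 0.1, T_1 = T_3 = 0.4, and T_2 = T_4 = 0.4 * l."""
    return (rate * t0, rate * short, rate * short * l,
            rate * short, rate * short * l)


# Hypothetical sample rates, chosen only for illustration.
for rate in (0.1, 0.5, 1.0, 2.0):
    s = pattern_support(*figure4_branch_lengths(rate, l=3.0))
    print(rate, {k: round(v, 4) for k, v in s.items()})
```

Under these assumptions, a rate at which the probability for 13|24 exceeds that for 12|34 is one at which parsimony is expected to favor grouping the two long branches as more sites are added, which is the classic long-branch attraction condition.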

References

    1. Felsenstein J. Cases in which parsimony or compatibility methods will be positively misleading. Syst Zool. 1978;27:401–10. doi: 10.2307/2412923.
    2. Hendy MD, Penny D. A framework for the quantitative study of evolutionary trees. Syst Zool. 1989;38:297–309. doi: 10.2307/2992396.
    3. Kim JH. General inconsistency conditions for maximum parsimony: effects of branch lengths and increasing numbers of taxa. Syst Biol. 1996;45:363–74. doi: 10.1093/sysbio/45.3.363.
    4. Sanderson MJ, Wojciechowski MF, Hu JM, Khan TS, Brady SG. Error, bias, and long-branch attraction in data for two chloroplast photosystem genes in seed plants. Mol Biol Evol. 2000;17:782–97. doi: 10.1093/oxfordjournals.molbev.a026357.
    5. Andersson FE, Swofford DL. Should we be worried about long-branch attraction in real data sets? Investigations using metazoan 18S rDNA. Mol Phyl Evol. 2004;33:440–51. doi: 10.1016/j.ympev.2004.06.015.
