Syst Biol. 2010 Jan;59(1):27-41.
doi: 10.1093/sysbio/syp076. Epub 2009 Nov 9.

Unifying vertical and nonvertical evolution: a stochastic ARG-based framework

Erik W Bloomquist et al. Syst Biol. 2010 Jan.

Abstract

Evolutionary biologists have introduced numerous statistical approaches to explore nonvertical evolution, such as horizontal gene transfer, recombination, and genomic reassortment, through collections of Markov-dependent gene trees. These tree collections allow for inference of nonvertical evolution, but only indirectly, making findings difficult to interpret and models difficult to generalize. An alternative approach to explore nonvertical evolution relies on phylogenetic networks. These networks provide a framework to model nonvertical evolution but leave unanswered questions such as the statistical significance of specific nonvertical events. In this paper, we begin to correct the shortcomings of both approaches by introducing the "stochastic model for reassortment and transfer events" (SMARTIE) drawing upon ancestral recombination graphs (ARGs). ARGs are directed graphs that allow for formal probabilistic inference on vertical speciation events and nonvertical evolutionary events. We apply SMARTIE to phylogenetic data. Because of this, we can typically infer a single most probable ARG, avoiding coarse population dynamic summary statistics. In addition, a focus on phylogenetic data suggests novel probability distributions on ARGs. To make inference with our model, we develop a reversible jump Markov chain Monte Carlo sampler to approximate the posterior distribution of SMARTIE. Using the BEAST phylogenetic software as a foundation, the sampler employs a parallel computing approach that allows for inference on large-scale data sets. To demonstrate SMARTIE, we explore 2 separate phylogenetic applications, one involving pathogenic Leptospirochete and the other Saccharomyces.

Figures

FIGURE 1.
Nonvertical evolution confirmation and event dating. The figure shows the most probable ARG representing the evolutionary history of 9 members of the len family in Leptospira interrogans. For each taxon, the first letter represents the gene; for example, CN represents the lenC gene and DN represents the lenD gene. The second letter describes whether the taxon represents the C-terminal or the N-terminal of the particular gene; for example, CN derives from the N-terminal of the lenC gene. The lenA gene only has an N-terminal. We abbreviate the L. interrogans lineages as Hardjo (har), Grippotyphosa (gri), Canicola (can), Bratislava (bra), Pomona (pom), Copenhageni (cop), and Lai (lai). The white circles on the ARG represent bifurcation nodes, and the black circles represent nonvertical nodes. The dashed lines represent edges involved in a nonvertical event; the remaining solid lines represent edges not involved in a nonvertical event. The figure displays 95% credible intervals for the height of each node in parentheses in expected number of substitutions. We display confidence intervals for the heights of nodes (1,2,3) near the root of the ARG. SMARTIE provides a 90% posterior probability for this history. If we had used alternative gene-tree incongruence procedures, significance statements like this would not be possible.
FIGURE 2.
Hybridization in Saccharomyces. Figures (a) and (b) represent the 2 most common gene trees in the Saccharomyces data set taken from Rokas et al. (2003). The taxa in all 3 figures, Saccharomyces cerevisiae (Scer), Saccharomyces paradoxus (Spar), Saccharomyces mikatae (Smik), Saccharomyces bayanus (Sbay), Saccharomyces kudriavzevii (Skud), and Saccharomyces castellii (Scas), represent distinct species of Saccharomyces. Under SMARTIE, 31 genes on average support the gene tree in (a) and 75 support the gene tree in (b). Because neither bifurcating species history garners overwhelming support, we believe that speciation leading to Sbay and Skud exhibits a strong signal toward hybridization as depicted by the ARG in (c).
FIGURE A1.
ARG swap transition kernel. This figure displays the operations that the ARG swap transition kernel makes. (a) The bifurcation swap. In this example, the kernel selects the bifurcation node marked with an arrow and the right lineage of this bifurcation node. Next, the kernel breaks the 4 lineages marked with the numbers 1, 2, 3, and 4. Finally, the kernel swaps lineages 1 and 2 and reattaches the graph as indicated by the dashed lines. The nonvertical swap displayed in (b) acts similarly.
FIGURE A2.
Reversible jump transition kernel. This figure demonstrates the add step of the reversible jump kernel. First, the kernel draws 2 heights t1 and t2 on G. Next, the kernel uniformly chooses 1 of 4 lineages at t1 (marked with arrows for illustrative purposes) and 1 of 5 lineages at t2. After selecting the lineages, the kernel adds a bifurcation node onto the lineage at t1 and a nonvertical node onto the lineage at t2. Next, the kernel decides whether to link up the right side or left side of the new nonvertical node to the new bifurcation node; in the figure, the kernel links up the right side of the nonvertical node to the new bifurcation through the dashed line. As a last step, the kernel links up the remaining pieces of the graph.
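
The selection part of the add step described in Figure A2 can be sketched in Python as follows. This is an illustrative sketch only, not the SMARTIE or BEAST implementation. The `Edge` class, the function names, and the toy tree are hypothetical. The sketch also assumes that the two heights are drawn uniformly below the root, that the larger height plays the role of t1, and that the left or right side is chosen with equal probability; the caption does not state any of these. It stops after the random choices and does not rewire the graph.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A lineage of the graph G, spanning heights [lower, upper). Hypothetical type."""
    name: str
    lower: float  # height of the child end
    upper: float  # height of the parent end


def lineages_at(edges, t):
    """Return the lineages that exist at height t."""
    return [e for e in edges if e.lower <= t < e.upper]


def discrete_choice_prob(n1, n2):
    """Probability of one (lineage at t1, lineage at t2, side) combination,
    assuming all three choices are uniform (the side choice is an assumption)."""
    return 1.0 / n1 * 1.0 / n2 * 0.5


def propose_add(edges, root_height, rng):
    """Draw the random choices of an add step; the graph itself is not modified."""
    # Assumption: heights are uniform on [0, root_height); rng.random() excludes 1.0,
    # so at least one lineage always exists at each drawn height.
    a = rng.random() * root_height
    b = rng.random() * root_height
    t1, t2 = max(a, b), min(a, b)  # assumption: the bifurcation sits at the larger height

    at_t1 = lineages_at(edges, t1)
    at_t2 = lineages_at(edges, t2)
    return {
        "t1": t1,
        "t2": t2,
        "bifurcation_edge": rng.choice(at_t1),   # receives the new bifurcation node
        "nonvertical_edge": rng.choice(at_t2),   # receives the new nonvertical node
        "side": rng.choice(["left", "right"]),   # side linked to the new bifurcation
        "discrete_prob": discrete_choice_prob(len(at_t1), len(at_t2)),
    }


# Toy tree: tips A, B, C at height 0; A and B join at node X (height 1);
# X and C join at the root (height 2).
EXAMPLE_EDGES = [
    Edge("A", 0.0, 1.0),
    Edge("B", 0.0, 1.0),
    Edge("C", 0.0, 2.0),
    Edge("X", 1.0, 2.0),
]

if __name__ == "__main__":
    proposal = propose_add(EXAMPLE_EDGES, root_height=2.0, rng=random.Random(1))
    print(proposal)
```

Under these assumptions, the configuration in the figure (4 lineages at t1, 5 at t2) gives each combination of lineages and side a probability of 1/4 × 1/5 × 1/2 = 1/40. A full reversible jump move would also need the densities of the drawn heights and the matching delete step, which this sketch leaves out.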
