2013 Jan 22;110(4):1160-6.
doi: 10.1073/pnas.1220450110. Epub 2012 Dec 28.

Evolutionary inference via the Poisson Indel Process

Alexandre Bouchard-Côté et al. Proc Natl Acad Sci U S A.

Abstract

We address the problem of the joint statistical inference of phylogenetic trees and multiple sequence alignments from unaligned molecular sequences. This problem is generally formulated in terms of string-valued evolutionary processes along the branches of a phylogenetic tree. The classic evolutionary process, the TKF91 model [Thorne JL, Kishino H, Felsenstein J (1991) J Mol Evol 33(2):114-124] is a continuous-time Markov chain model composed of insertion, deletion, and substitution events. Unfortunately, this model gives rise to an intractable computational problem: The computation of the marginal likelihood under the TKF91 model is exponential in the number of taxa. In this work, we present a stochastic process, the Poisson Indel Process (PIP), in which the complexity of this computation is reduced to linear. The Poisson Indel Process is closely related to the TKF91 model, differing only in its treatment of insertions, but it has a global characterization as a Poisson process on the phylogeny. Standard results for Poisson processes allow key computations to be decoupled, which yields the favorable computational profile of inference under the PIP model. We present illustrative experiments in which Bayesian inference under the PIP model is compared with separate inference of phylogenies and alignments.
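
The "Poisson process on the phylogeny" characterization can be illustrated with a minimal sketch of drawing insertion points on the branches of a tree. This is not the authors' implementation. The abstract does not give the rate measure, so the sketch assumes a uniform rate per unit branch length and omits any special treatment of the root. The tree, its branch lengths, and the names `BRANCHES`, `sample_poisson`, and `sample_insertion_points` are hypothetical.

```python
import random

# Hypothetical toy tree: each branch is named after its child node and mapped
# to an illustrative branch length (not taken from the paper).
BRANCHES = {"H_sapiens": 0.3, "macaca": 0.2, "M_fuscata": 0.1, "M_sylvanus": 0.1}


def sample_poisson(mean, rng):
    """Draw a Poisson variate by counting unit-rate exponential arrivals before `mean`."""
    count = 0
    t = rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def sample_insertion_points(branches, rate, rng):
    """Sample a homogeneous Poisson process on the branches of a tree.

    Assumption: uniform intensity `rate` per unit branch length. Each branch
    independently receives Poisson(rate * length) points, and each point is
    placed uniformly along that branch.
    """
    points = []
    for name, length in branches.items():
        n = sample_poisson(rate * length, rng)
        points.extend((name, rng.uniform(0.0, length)) for _ in range(n))
    return points


if __name__ == "__main__":
    rng = random.Random(1)
    for branch, position in sample_insertion_points(BRANCHES, 2.0, rng):
        print(branch, round(position, 3))
```

Because the number of points on disjoint branches is independent under a Poisson process, each branch can be handled separately. This independence is the kind of decoupling the abstract credits for the favorable computational profile.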

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Depiction of the evolution of a set of strings of nucleotides along the branches of a tree with leaves and root Ω, where each string is subject to insertion, deletion, and substitution processes. Stars denote nucleotide insertion events, crosses denote deletion events, and circles denote substitution events.
Fig. 2.
Notation used for describing the PIP. Given a phylogenetic tree τ and a point x ∈ τ on that tree, τ_x is defined as the subtree rooted at x. H. sapiens, Homo sapiens; M. fuscata, Macaca fuscata; M. sylvanus, Macaca sylvanus.
Fig. 3.
Example of a PIP sample. Here, Σ has two symbols, represented by red and green squares, and the absorbing deletion symbol ɛ is represented in black. (A) Sample from a Poisson process on τ. (B) Each sampled point corresponds to a rooted tree on which a CTMC path is sampled. (C) Alignments and sequences are obtained as a deterministic function of the first two steps. H. sapiens, Homo sapiens; M. fuscata, Macaca fuscata; M. sylvanus, Macaca sylvanus.
Fig. 4.
Relative improvements for enabling each component of the sampler. Arrows on the left are relative alignment improvements, and arrows on the right are relative tree improvements.
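
Step (B) of Fig. 3, a CTMC path with an absorbing deletion symbol, can be sketched for a single character on a single branch. This is a simplified illustration, not the paper's model specification. It assumes substitution to a uniformly chosen different symbol at rate `sub_rate` and deletion at rate `del_rate`. The function name `simulate_site` and all parameter values are hypothetical.

```python
import random

DELETED = "ɛ"  # absorbing deletion symbol, as in the Fig. 3 caption


def simulate_site(start, alphabet, sub_rate, del_rate, length, rng):
    """Simulate one character along a branch of the given length.

    Assumptions (illustrative only): from any live symbol, a substitution to a
    uniformly chosen different symbol occurs at rate `sub_rate`, and a deletion
    occurs at rate `del_rate`. Once deleted, the character stays deleted.
    Requires at least two symbols in `alphabet` when `sub_rate` > 0.
    """
    state = start
    t = 0.0
    total = sub_rate + del_rate
    if total <= 0.0:
        return state  # no events can occur
    while state != DELETED:
        t += rng.expovariate(total)  # waiting time to the next event
        if t >= length:
            break  # the branch ends before the next event
        if rng.random() < del_rate / total:
            state = DELETED
        else:
            state = rng.choice([s for s in alphabet if s != state])
    return state


if __name__ == "__main__":
    rng = random.Random(7)
    print(simulate_site("red", ["red", "green"], 1.0, 0.5, 0.4, rng))
```

Running this once per sampled insertion point, down every branch below that point, gives the leaf symbols from which step (C) reads off the alignment columns.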

References

    1. Thorne JL, Kishino H, Felsenstein J. An evolutionary model for maximum likelihood alignment of DNA sequences. J Mol Evol. 1991;33(2):114–124.
    2. Holmes I, Bruno WJ. Evolutionary HMMs: A Bayesian approach to multiple alignment. Bioinformatics. 2001;17(9):803–820.
    3. Hein J. An algorithm for statistical alignment of sequences related by a binary tree. Pac Symp Biocomput. 2001;6:179–190.
    4. Steel M, Hein J. Applying the Thorne-Kishino-Felsenstein model to sequence evolution on a star-shaped tree. Appl Math Lett. 2001;14(6):679–684.
    5. Metzler D, Fleissner R, Wakolbinger A, von Haeseler A. Assessing variability by joint sampling of alignments and mutation rates. J Mol Evol. 2001;53(6):660–669.
