Review. Mol Ecol. 2016 May;25(9):1911-24.
doi: 10.1111/mec.13586. Epub 2016 Apr 20.

Inferences from tip-calibrated phylogenies: a review and a practical guide

Adrien Rieux et al. Mol Ecol. 2016 May.

Abstract

Molecular dating of phylogenetic trees is a growing discipline using sequence data to co-estimate the timing of evolutionary events and rates of molecular evolution. All molecular-dating methods require converting genetic divergence between sequences into absolute time. Historically, this could only be achieved by associating externally derived dates obtained from fossil or biogeographical evidence to internal nodes of the tree. In some cases, notably for fast-evolving genomes such as viruses and some bacteria, the time span over which samples were collected may cover a significant proportion of the time since they last shared a common ancestor. This situation allows phylogenetic trees to be calibrated by associating sampling dates directly to the sequences representing the tips (terminal nodes) of the tree. The increasing availability of genomic data from ancient DNA extends the applicability of such tip-based calibration to a variety of taxa including humans, extinct megafauna and various microorganisms which typically have a scarce fossil record. The development of statistical models accounting for heterogeneity in different aspects of the evolutionary process while accommodating very large data sets (e.g. whole genomes) has allowed using tip-dating methods to reach inferences on divergence times, substitution rates, past demography or the age of specific mutations on a variety of spatiotemporal scales. In this review, we summarize the current state of the art of tip dating, discuss some recent applications, highlight common pitfalls and provide a 'how to' guide to thoroughly perform such analyses.

Keywords: Bayesian phylogenetics; calibration, divergence time and substitution rate inferences; measurably evolving populations; population dynamics; tip-dating.

Figures

Figure 1
Tip‐dating principle. (a) In this simplified theoretical situation adapted from Rambaut (2000), sequences A and B were isolated at different points in time ($T_A$ and $T_B$, respectively) and C is an outgroup sequence. If we assume the rate of evolution to be the same in lineages A and B, then the amount of molecular evolution expected to have occurred between $T_A$ and $T_B$ is equal to $d_{AC} - d_{BC}$ ($d_{AC}$ and $d_{BC}$ being the genetic distance between A&C and B&C, respectively). If the time X between $T_A$ and $T_B$ represents a significant proportion of the time Y since A and B last shared a common ancestor, then one can use tip dates to conjointly estimate the rate of evolution $\mu = (d_{AC} - d_{BC})/(T_A - T_B)$ and extrapolate the age of $T_{MRCA(AB)}$. (b) Top: Tree with modern samples only for which no divergence time estimate is possible without calibrations on internal nodes or a strong prior on the rate of molecular clock. Middle: Tree where tip dates may not be widely spread enough for accurate inferences. Bottom: Tree where tip date width should be sufficiently broad to allow divergence time and rate of evolution estimates with a good degree of certainty, since the sample dates cover a relatively large fraction of the total age of the tree.
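The arithmetic in panel (a) can be illustrated with a minimal Python sketch. It assumes a strict clock and made-up distances and sampling years; the function names are hypothetical. It also uses the pairwise distance between A and B (here called d_ab), which the caption does not mention, to extrapolate the age of the common ancestor, so that step should be read as one possible illustration rather than the procedure of the review.

```python
def tip_dated_rate(d_ac, d_bc, t_a, t_b):
    """Rate of evolution from two dated tips and an outgroup C.

    Implements mu = (d_AC - d_BC) / (T_A - T_B) from the caption.
    Distances are in substitutions per site, dates in calendar years.
    """
    if t_a == t_b:
        raise ValueError("tips must be sampled at different times")
    return (d_ac - d_bc) / (t_a - t_b)


def extrapolate_tmrca(d_ab, t_a, t_b, rate):
    """Date of the A/B common ancestor under a strict clock (assumption).

    Under a strict clock d_AB = rate * ((T_A - T_MRCA) + (T_B - T_MRCA)),
    which rearranges to the expression below.
    """
    if rate <= 0:
        raise ValueError("a positive rate is needed to extrapolate")
    return (t_a + t_b - d_ab / rate) / 2.0


# Hypothetical values: A sampled in 2010, B in 1990.
rate = tip_dated_rate(d_ac=0.052, d_bc=0.032, t_a=2010, t_b=1990)
tmrca = extrapolate_tmrca(d_ab=0.06, t_a=2010, t_b=1990, rate=rate)
print(f"rate ~ {rate:.4g} subst/site/year, TMRCA(AB) ~ {tmrca:.1f}")
```

With these illustrative inputs the 20 years separating the samples are half of the 40 years since A diverged from the common ancestor, which is the kind of situation the bottom tree in panel (b) describes.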
Figure 2
Testing for temporal signal. Flow chart for testing measurable evolutionary change in a data set prior to any tip‐dating analysis. The most robust method currently available is the ‘date‐randomization test’, which involves generating multiple randomized data sets by permutation of sampling times and comparing parameter estimates obtained with the initial data set vs. the randomized ones (see Section ‘To date or not to date: when is tip dating appropriate?’ in the text for more details on how to perform this test and interpret the results). Visual evidence for a temporal signal can also be obtained by fitting a linear regression between the age of the samples and their root‐to‐tip distances, which have to be computed from a tree built without constraining tip heights to their sampling times. Tools for generating date‐randomized data sets and computing root‐to‐tip distances are listed in Table 1.
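The root‐to‐tip regression described above can be sketched in a few lines of Python. The data are invented, the function names are hypothetical, and the permutation step is only a simplified analogue of the date‐randomization test: it reshuffles sampling dates and compares regression slopes, whereas the test described in the caption compares parameter estimates from full analyses of the randomized data sets.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def root_to_tip_signal(dates, distances, n_perm=1000, seed=0):
    """Regress root-to-tip distance on sampling date and permute the dates.

    Returns the slope (a crude rate estimate), the x-intercept (a crude
    root date, None if the slope is not positive) and a one-sided
    permutation p-value for the slope.
    """
    slope, intercept = fit_line(dates, distances)
    rng = random.Random(seed)
    shuffled = list(dates)
    n_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between dates and sequences
        if fit_line(shuffled, distances)[0] >= slope:
            n_as_large += 1
    p_value = (1 + n_as_large) / (n_perm + 1)
    root_date = -intercept / slope if slope > 0 else None
    return slope, root_date, p_value


# Hypothetical sampling years and root-to-tip distances (subst/site),
# taken from a tree built without constraining tip heights.
dates = [1980, 1985, 1990, 1995, 2000, 2005, 2010]
distances = [0.0205, 0.0248, 0.0302, 0.0346, 0.0403, 0.0449, 0.0501]
slope, root_date, p_value = root_to_tip_signal(dates, distances)
print(f"slope={slope:.3g}, root date={root_date:.0f}, p={p_value:.3g}")
```

A positive slope that is rarely matched by the permuted data sets suggests a temporal signal, but this shortcut is a visual aid in the spirit of the regression check and should not be taken as a replacement for the full test.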
Figure 3
Transmission graph vs. phylogenetic tree. This figure, adapted from Jombart et al. (2011), illustrates the difference between a transmission chain and a phylogenetic reconstruction. Panel (a) represents the transmission chain of a pathogen as arrows connecting hosts represented as circles, with grey circles representing sampled hosts. In panel (b), the transmission graph (or network) is correctly reconstructed from the sampled hosts. In panel (c), a time‐structured phylogeny is reconstructed using the same samples, with black dots representing hypothetical ancestral isolates.
Figure 4
Different statistical distributions to model uncertainty in tip‐calibration inferences. Different distributions can be used to model the error associated with sampling dates. Choosing the best‐suited one depends on the type of sample and the information associated with the dating method (Ho & Phillips 2009). Point values (a) can be used if the age of a sample is exactly known (e.g. sampling date). Modelling radiocarbon dating errors with a normal distribution (b) is common practice in ancient DNA studies, even though recent improvements allow the use of an empirical description of the probability density function directly measured on the calibrated sample (c) (see Molak et al. 2015 for more details on this topic). Uniform distributions with hard minimum and maximum bounds (d) are suited to samples obtained from a well‐defined stratum [e.g. ancient DNA retrieved from ice cores (Willerslev et al. 2007) or from samples associated with archaeological horizons (Edwards et al. 2007)] or to model uncertainty in sampling time accuracy (e.g. if the sampling month is known for some samples but not for others). Finally, a uniform distribution with a hard minimum and a soft maximum bound (e) can be suited to ancient DNA samples beyond the 45–50 ka resolution limit of radiocarbon dating (thus yielding a minimum age) for which additional information (e.g. from fossil data) exists and justifies the use of a soft maximum bound. This figure is adapted from Ho & Duchêne (2014).
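The five options (a) to (e) can be made concrete with a small Python sketch that draws a tip age from each kind of distribution. The dictionary layout, the function name and all numbers are hypothetical. The soft maximum in particular is modelled here as a uniform body with an exponential tail beyond the upper bound, which is only one possible parameterization and an assumption of this sketch; dating software would normally treat these as priors within the analysis rather than draw from them directly.

```python
import random


def sample_tip_age(spec, rng):
    """Draw one tip age (years before present) from a hypothetical spec."""
    kind = spec["kind"]
    if kind == "point":  # (a) age known exactly
        return spec["age"]
    if kind == "normal":  # (b) e.g. radiocarbon date with normal error
        return rng.gauss(spec["mean"], spec["sd"])
    if kind == "empirical":  # (c) discretized calibrated density
        return rng.choices(spec["ages"], weights=spec["weights"], k=1)[0]
    if kind == "uniform":  # (d) hard minimum and maximum bounds
        return rng.uniform(spec["min"], spec["max"])
    if kind == "soft_max":  # (e) hard minimum, soft maximum (assumed form)
        if rng.random() < spec["mass_inside"]:
            return rng.uniform(spec["min"], spec["max"])
        # Remaining probability mass decays exponentially past the maximum.
        return spec["max"] + rng.expovariate(1.0 / spec["tail_mean"])
    raise ValueError(f"unknown distribution kind: {kind}")


rng = random.Random(42)
specs = [
    {"kind": "point", "age": 25},
    {"kind": "normal", "mean": 12000, "sd": 150},
    {"kind": "empirical", "ages": [11800, 12000, 12200],
     "weights": [0.2, 0.5, 0.3]},
    {"kind": "uniform", "min": 30000, "max": 40000},
    {"kind": "soft_max", "min": 50000, "max": 120000,
     "mass_inside": 0.975, "tail_mean": 10000},
]
draws = [sample_tip_age(spec, rng) for spec in specs]
for spec, age in zip(specs, draws):
    print(spec["kind"], round(age))
```

Keeping each sample's uncertainty as a separate specification mirrors the point of the caption: the choice of distribution depends on the sample and on how it was dated.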
Figure 5
Major steps to conduct accurate tip dating. This figure summarizes the five main steps that should be followed when performing tip‐dating analyses. For each of these steps, additional advice, such as the important choices that must be made or the software to be used, is given in the form of a practical guide available in Appendix S1 (Supporting information).
