Syst Biol. 2023 Aug 7;72(4):820-836.
doi: 10.1093/sysbio/syad015.

Estimation of species divergence times in presence of cross-species gene flow

George P Tiley et al. Syst Biol. 2023.

Abstract

Cross-species introgression can have significant impacts on phylogenomic reconstruction of species divergence events. Here, we used simulations to show how the presence of even a small amount of introgression can bias divergence time estimates when gene flow is ignored in the analysis. Using advances in analytical methods under the multispecies coalescent (MSC) model, we demonstrate that this problem can be avoided by accounting for incomplete lineage sorting and introgression using large phylogenomic datasets. The multispecies-coalescent-with-introgression (MSci) model is capable of accurately estimating both divergence times and ancestral effective population sizes, even when only a single diploid individual per species is sampled. We characterize some general expectations for biases in divergence time estimation under three different scenarios: 1) introgression between sister species, 2) introgression between non-sister species, and 3) introgression from an unsampled (i.e., ghost) outgroup lineage. We also conducted simulations under the isolation-with-migration (IM) model and found that the MSci model assuming episodic gene flow was able to accurately estimate species divergence times despite high levels of continuous gene flow. We estimated divergence times under the MSC and MSci models from two published empirical datasets with previous evidence of introgression: one of 372 target-enrichment loci from baobabs (Adansonia) and another of 1000 transcriptome loci from 14 species of the tomato relative, Jaltomata. The empirical analyses not only confirm our findings from simulations, demonstrating that the MSci model can reliably estimate divergence times, but also show that divergence time estimation under the MSC can be robust to the presence of small amounts of introgression in empirical datasets with extensive taxon sampling. [divergence time; gene flow; hybridization; introgression; MSci model; multispecies coalescent].
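As a back-of-envelope illustration of why ignoring introgression can pull a divergence time estimate downward, the sketch below considers the sister-species scenario with one sequence sampled from each of species A and B. It is not the authors' Bayesian analysis, which uses the full distribution of gene trees across loci. It only compares expected pairwise coalescence times under assumptions chosen for the illustration: a single pulse of introgression from B into A at time `tau_h` with probability `phi`, species divergence at `tau_t`, and one constant `theta` on every branch. The function names and the parameter values are hypothetical.

```python
import random


def expected_pairwise_coalescence(tau_t, tau_h, phi, theta):
    """Expected coalescence time for one A and one B sequence.

    Times are in expected mutations per site. With probability phi the
    A lineage moves into B at tau_h (looking backward in time), after which
    the pair can coalesce at rate 2/theta. Otherwise coalescence is only
    possible after the species divergence at tau_t. Because theta is assumed
    constant on all branches, the waiting time has mean theta/2 in both cases.
    """
    return phi * tau_h + (1.0 - phi) * tau_t + theta / 2.0


def naive_divergence_time(tau_t, tau_h, phi, theta):
    """Divergence time implied by the same mean if gene flow is ignored.

    Assumes (for illustration only) that theta is known, so a model without
    gene flow attributes the whole mean minus theta/2 to the divergence time.
    """
    return expected_pairwise_coalescence(tau_t, tau_h, phi, theta) - theta / 2.0


def simulate_mean_coalescence(tau_t, tau_h, phi, theta, n_loci, seed=0):
    """Monte Carlo check of the expectation over independent loci."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_loci):
        # Each locus is introgressed with probability phi.
        start = tau_h if rng.random() < phi else tau_t
        # Exponential waiting time with rate 2/theta (mean theta/2).
        total += start + rng.expovariate(2.0 / theta)
    return total / n_loci


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    tau_t, tau_h, phi, theta = 0.01, 0.005, 0.1, 0.01
    print("true divergence time: ", tau_t)
    print("naive divergence time:", naive_divergence_time(tau_t, tau_h, phi, theta))
    print("simulated mean:       ", simulate_mean_coalescence(tau_t, tau_h, phi, theta, 100_000))
```

In this simplified setting the naive value equals `phi * tau_h + (1 - phi) * tau_t`, which is always below `tau_t` when `phi > 0` and `tau_h < tau_t`. The size of the bias in a full-likelihood analysis will generally differ from this mean-based approximation.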

Figures

Figure 1.
Species networks used for simulations. a) Gene flow between sister lineages: species A and B diverged at time τt, and introgression occurred from B into A at time τh. b) Gene flow between non-sister lineages (from species D into C at time τh). c) Gene flow from an unsampled ghost lineage O (shown in gray) into species A. Divergence time (τ) is given in units of population size (θ). Population size is constant among all branches. Node names are shown with lower-case letters. The direction of introgression is from node g to h, indicated by the arrow. Simulations under the IM model use the same species trees, but with migration occurring after species divergence at the constant rate of M = Nm migrants per generation.
Figure 2.
Divergence time estimates for speciation nodes simulated under the MSci model. True values are shown with a dashed horizontal line. Points are posterior means and error bars are 95% highest posterior density (HPD) credible intervals (CIs), both averaged over 10 replicates. The y-axis scale is ×10^3 and ×10^2 when θ = 0.001 and θ = 0.01, respectively.
Figure 3.
Population size estimates for speciation nodes simulated under the MSci model. True values are shown with a dashed horizontal line. Points are posterior means and error bars are 95% HPD CIs, averaged over 10 replicates. θs for the non-sister scenario is shown on the log10 scale. The y-axis scale is ×10^3 and ×10^2 when θ = 0.001 and θ = 0.01, respectively.
Figure 4.
Divergence time estimates for speciation nodes simulated under the IM model. True values are shown with a dashed horizontal line. Points are posterior means and error bars are 95% HPD CIs, averaged over 10 replicates. The y-axis scale is ×10^3 and ×10^2 when θ = 0.001 and θ = 0.01, respectively.
Figure 5.
Population size estimates for speciation nodes simulated under the IM model. True values are shown with a dashed horizontal line. Points are posterior means and error bars are 95% HPD CIs, averaged over 10 replicates. θs for the non-sister scenario and θr for the ghost lineage scenario are shown on the log10 scale. The y-axis scale is ×10^3 and ×10^2 when θ = 0.001 and θ = 0.01, respectively.
Figure 6.
Divergence time estimates for Adansonia. Node heights are posterior means under the MSci model with node-calibrated divergence times, indicated by the black dot. Error bars are 95% HPD CIs. The vertical line with an arrow shows the time and direction of introgression, with the posterior mean and 95% HPD CI of the introgression probability displayed. The posterior mean and 95% HPD CI for the Longitubae introgression event are shown on the A. rubrostipa branch. Vertical bars along the right show section names based on the morphological classification of Malagasy baobabs.
Figure 7.
Divergence time estimates for Jaltomata. Node heights are posterior means under the MSci model with node-calibrated divergence times. Reticulate edges are shown by black arrows along with their corresponding introgression probabilities from the MSci model. Posterior means and 95% HPD CIs for the ghost introgression donor and recipient nodes are shown above their vertices to improve visibility. Vertical bars along the right show fruit colors among lineages.
