Mol Ecol Resour. 2017 Nov;17(6):e212-e224.
doi: 10.1111/1755-0998.12686. Epub 2017 May 30.

multi-dice: R package for comparative population genomic inference under hierarchical co-demographic models of independent single-population size changes

Alexander T Xue et al. Mol Ecol Resour. 2017 Nov.

Abstract

Population genetic data from multiple taxa can address comparative phylogeographic questions about community-scale response to environmental shifts, and a useful strategy to this end is to employ hierarchical co-demographic models that directly test multi-taxa hypotheses within a single, unified analysis. This approach has been applied to classical phylogeographic data sets such as mitochondrial barcodes as well as reduced-genome polymorphism data sets that can yield tens of thousands of SNPs, produced by emergent technologies such as RAD-seq and GBS. A strategy for the latter was accomplished by adapting the site frequency spectrum to a novel summarization of population genomic data across multiple taxa called the aggregate site frequency spectrum (aSFS), which can potentially be deployed under various inferential frameworks including approximate Bayesian computation, random forest and composite likelihood optimization. Here, we introduce the R package multi-dice, a wrapper program that exploits existing simulation software for flexible execution of hierarchical model-based inference using the aSFS, which is derived from reduced genome data, as well as mitochondrial data. We validate several novel software features such as applying alternative inferential frameworks, enforcing a minimal threshold of time surrounding co-demographic pulses and specifying flexible hyperprior distributions. In sum, multi-dice provides comparative analysis within the familiar R environment while allowing a high degree of user customization, and will thus serve as a tool for comparative phylogeography and population genomics.

Keywords: aggregate site frequency spectrum; approximate Bayesian computation; comparative phylogeography; population genetics software; random forest.
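
As a rough illustration of the aSFS idea mentioned in the abstract, the Python sketch below combines per-taxon site frequency spectra into one order-independent summary. The abstract does not spell out the construction, so the steps used here (convert each taxon's SFS to proportions, then sort the values within each frequency class in descending order across taxa) are an assumption, and the function names `normalize_sfs` and `aggregate_sfs` are hypothetical, not part of multi-dice. The sketch also assumes every taxon has the same number of frequency classes.

```python
from typing import List


def normalize_sfs(counts: List[int]) -> List[float]:
    """Convert raw SNP counts per frequency class into proportions."""
    total = sum(counts)
    if total == 0:
        raise ValueError("SFS contains no SNPs")
    return [c / total for c in counts]


def aggregate_sfs(per_taxon_sfs: List[List[int]]) -> List[List[float]]:
    """Build an aggregate SFS (assumed construction, see lead-in).

    Input: one SFS (list of counts) per taxon, all of equal length.
    Output: a matrix whose row r holds, for each frequency class,
    the (r+1)-th largest proportion observed across taxa.
    """
    if not per_taxon_sfs:
        raise ValueError("need at least one taxon")
    n_bins = len(per_taxon_sfs[0])
    if any(len(sfs) != n_bins for sfs in per_taxon_sfs):
        raise ValueError("all taxa must share the same frequency classes")

    proportions = [normalize_sfs(sfs) for sfs in per_taxon_sfs]
    # Sort each frequency class (column) independently across taxa,
    # which discards taxon identity and makes taxa exchangeable.
    sorted_columns = [
        sorted((row[j] for row in proportions), reverse=True)
        for j in range(n_bins)
    ]
    # Transpose back so rows are ranks and columns are frequency classes.
    return [list(rank_row) for rank_row in zip(*sorted_columns)]


if __name__ == "__main__":
    taxon_a = [5, 3, 2]
    taxon_b = [4, 4, 2]
    for row in aggregate_sfs([taxon_a, taxon_b]):
        print(row)
```

Because each column is sorted separately, the result does not depend on the order in which taxa are supplied, which is the property that lets a single summary stand in for a whole community of taxa.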

Figures

Figure 1
Hierarchical co‐demographic models. (a) Example instantaneous co‐expansion model. (b) Example instantaneous co‐contraction model. In both models, eight of the ten taxa are assigned to three synchronous co‐demographic pulses (ψ = 3; ζT = 0.8), with the first pulse containing three taxa (ζ1 = 0.3), the second pulse containing another two taxa (ζ2 = 0.2) and the third pulse containing yet another three taxa (ζ3 = 0.3). Pulse 1 occurs at the most recent time (τs,1), pulse 2 occurs at the intermediate time (τs,2), and pulse 3 occurs at the most ancient time (τs,3). The remaining two taxa respond idiosyncratically in time, independently of all other taxa (τi,1 and τi,2). Each taxon is allowed nuisance demographic parameter draws independent of the other taxa ({ε1, …, ε10} and {N1, …, N10}).
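
The assignment scheme in the Figure 1 caption can be sketched in Python as follows. This is a minimal illustration, not the multi-dice implementation: the function name `draw_codemographic_times`, the uniform prior on times and the time bounds are all assumptions made for the example, and the minimal time threshold around pulses mentioned in the abstract is not enforced here.

```python
import random
from typing import Dict, List, Optional, Tuple


def draw_codemographic_times(
    n_taxa: int,
    zeta: List[float],
    time_bounds: Tuple[float, float],
    rng: random.Random,
) -> Dict[int, Dict[str, Optional[float]]]:
    """Assign taxa to synchronous pulses and draw event times.

    zeta[k] is the proportion of taxa in pulse k+1; taxa left over
    are idiosyncratic and each get their own independent time.
    Times are drawn from a uniform prior (an assumption).
    """
    pulse_sizes = [round(z * n_taxa) for z in zeta]
    if sum(pulse_sizes) > n_taxa:
        raise ValueError("pulse proportions exceed the number of taxa")

    low, high = time_bounds
    # Sort so that pulse 1 is the most recent and the last pulse the oldest.
    pulse_times = sorted(rng.uniform(low, high) for _ in pulse_sizes)

    taxa = list(range(n_taxa))
    rng.shuffle(taxa)  # random membership of taxa in pulses

    model: Dict[int, Dict[str, Optional[float]]] = {}
    cursor = 0
    for pulse_index, size in enumerate(pulse_sizes):
        for taxon in taxa[cursor:cursor + size]:
            model[taxon] = {"pulse": pulse_index + 1,
                            "time": pulse_times[pulse_index]}
        cursor += size
    # Remaining taxa respond idiosyncratically: one independent time each.
    for taxon in taxa[cursor:]:
        model[taxon] = {"pulse": None, "time": rng.uniform(low, high)}
    return model


if __name__ == "__main__":
    # Values from the caption: ten taxa, zeta = (0.3, 0.2, 0.3).
    draw = draw_codemographic_times(
        10, [0.3, 0.2, 0.3], (1_000.0, 100_000.0), random.Random(1)
    )
    for taxon in sorted(draw):
        print(taxon, draw[taxon])
```

With the caption's values, three, two and three taxa share the times of pulses 1, 2 and 3 respectively, and the remaining two taxa each receive a separate time.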
Figure 2
Flowchart of multi‐dice usage. multi‐dice accomplishes multi‐taxa co‐demographic inference under a hierarchical model through three major steps: model specification, single‐population simulation across multiple taxa and conversion of simulated data to multi‐taxa summary statistics. Hierarchical co‐demographic model specification is conducted across multiple functions in sequence, with preceding functions contained within successive functions. This sequential embedding of functions extends to dice.sims(), allowing the entire model specification process to be performed concurrently with data simulation. Simulated data can then be converted to multi‐taxa summary statistics by either dice.aSFS() or dice.sumstats(), depending on the data type. These functions can also be applied to empirical data. In the simplest usage, only two multi‐dice functions (command lines), dice.sims() and dice.aSFS() or dice.sumstats(), are needed to construct a reference table of multi‐taxa summary statistics under a hierarchical co‐demographic model. This reference table can then be used in a downstream software program for hierarchical random forest (hRF) or hierarchical approximate Bayesian computation (hABC), where appropriate statistical practices should be used to examine robustness and fit. Importantly, exploratory analyses should be performed on the empirical data before deploying multi‐dice to better guide its usage, for example, to determine sensible prior distributions and evaluate differences among taxa.
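
To show how a reference table of the kind described in the Figure 2 caption might be used downstream, the Python sketch below applies plain rejection ABC: it keeps the parameter values whose simulated summary statistics lie closest to the observed ones. This is a generic illustration under stated assumptions, not the procedure used by multi-dice or by any particular downstream program; the function name `abc_rejection`, the Euclidean distance and the toy table are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# One row of a reference table: (parameter value, simulated summary statistics).
Row = Tuple[float, Sequence[float]]


def abc_rejection(
    reference_table: List[Row],
    observed: Sequence[float],
    tolerance: float = 0.1,
) -> List[float]:
    """Return parameter values from the closest `tolerance` fraction of rows.

    Closeness is the Euclidean distance between simulated and observed
    summary statistics (an assumed choice of distance).
    """
    if not 0 < tolerance <= 1:
        raise ValueError("tolerance must be in (0, 1]")
    if not reference_table:
        raise ValueError("reference table is empty")

    ranked = sorted(reference_table, key=lambda row: math.dist(row[1], observed))
    n_keep = max(1, int(len(ranked) * tolerance))
    # The accepted parameter values approximate the posterior sample.
    return [param for param, _ in ranked[:n_keep]]


if __name__ == "__main__":
    # Toy table: the parameter could stand for a proportion such as zeta.
    table = [
        (0.1, [0.0, 0.0]),
        (0.5, [1.0, 1.0]),
        (0.9, [5.0, 5.0]),
        (0.7, [1.1, 0.9]),
    ]
    print(abc_rejection(table, observed=[1.0, 1.0], tolerance=0.5))
```

In practice the summary statistics would be the flattened aSFS or mitochondrial statistics from the reference table, and the accepted sample would be checked for robustness and fit as the caption advises.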
