BMC Bioinformatics. 2010 Jul 28;11:401.
doi: 10.1186/1471-2105-11-401.

Inference on population history and model checking using DNA sequence and microsatellite data with the software DIYABC (v1.0)

Jean-Marie Cornuet et al. BMC Bioinformatics. 2010.

Abstract

Background: Approximate Bayesian computation (ABC) is a recent and flexible class of Monte-Carlo algorithms increasingly used to make model-based inference on complex evolutionary scenarios that have acted on natural populations. The software DIYABC offers a user-friendly interface allowing non-expert users to consider population histories involving any combination of population divergences, admixtures and population size changes. We here describe and illustrate new developments of this software that mainly include (i) inference from DNA sequence data in addition to, or separately from, microsatellite data, (ii) the possibility to analyze five categories of loci considering balanced or unbalanced sex ratios: autosomal diploid, autosomal haploid, X-linked, Y-linked and mitochondrial, and (iii) the possibility to perform model checking computation to assess the "goodness-of-fit" of a model, a feature of ABC analysis that has so far been neglected.
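
As an illustration of the general idea behind ABC, the sketch below implements the simplest rejection form of the method on a toy problem. It is not DIYABC's implementation: the Normal simulator, the Uniform(-5, 5) prior, the sample-mean summary statistic and the tolerance are all assumptions chosen for brevity, standing in for a population-genetic simulator and its summary statistics.

```python
import random
import statistics


def simulate(mu, n, rng):
    """Toy simulator (assumption): n draws from Normal(mu, 1)."""
    return [rng.gauss(mu, 1.0) for _ in range(n)]


def summary(data):
    """Summary statistic (assumption): the sample mean."""
    return statistics.fmean(data)


def abc_rejection(observed, n_sims, tolerance, rng):
    """Keep prior draws whose simulated summary lies within
    `tolerance` of the observed summary."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_sims):
        mu = rng.uniform(-5.0, 5.0)  # draw a parameter value from the prior
        s_sim = summary(simulate(mu, len(observed), rng))
        if abs(s_sim - s_obs) <= tolerance:
            accepted.append(mu)  # accepted draws approximate the posterior
    return accepted


rng = random.Random(1)
observed = simulate(2.0, 50, rng)  # pseudo-observed data with known mu = 2
posterior_sample = abc_rejection(observed, n_sims=5000, tolerance=0.1, rng=rng)
```

The accepted values in `posterior_sample` form an approximate sample from the posterior distribution of `mu`; a smaller tolerance gives a better approximation at the cost of fewer accepted draws.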

Results: We used controlled simulated data sets generated under evolutionary scenarios involving various divergence and admixture events to evaluate the effect of mixing autosomal microsatellite, mtDNA and/or nuclear autosomal DNA sequence data on inferences. This evaluation included the comparison of competing scenarios and the quantification of their relative support, and the estimation of parameter posterior distributions under a given scenario. We also considered a set of scenarios often compared when making ABC inferences on the routes of introduction of invasive species to illustrate the value of the new model checking option of DIYABC for assessing model misfit.

Conclusions: Our new developments of the integrated software DIYABC should be particularly useful to make inference on complex evolutionary scenarios involving both recent and ancient historical events and using various types of molecular markers in diploid or haploid organisms. They offer a handy way for non-expert users to achieve model checking computation within an ABC framework, hence filling a gap in ABC analysis. The software DIYABC V1.0 is freely available at http://www1.montpellier.inra.fr/CBGP/diyabc.
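
A minimal sketch of what a model check can look like, under stated assumptions: data are simulated from posterior parameter draws, and a test statistic computed on the observed data is ranked among the simulated values. The toy Normal model, the hypothetical posterior draws and the choice of the sample standard deviation as test statistic are illustrative only and are not taken from DIYABC.

```python
import random
import statistics


def simulate(mu, sigma, n, rng):
    """Toy simulator (assumption): n draws from Normal(mu, sigma)."""
    return [rng.gauss(mu, sigma) for _ in range(n)]


def tail_probability(observed_stat, simulated_stats):
    """Proportion of simulated statistics that fall below the observed one."""
    return sum(s < observed_stat for s in simulated_stats) / len(simulated_stats)


rng = random.Random(7)

# Pseudo-observed data with a spread (sigma = 3) larger than the fitted model allows.
observed = simulate(0.0, 3.0, 100, rng)

# Hypothetical posterior draws of mu under a model that fixes sigma = 1.
posterior_mu = [rng.gauss(statistics.fmean(observed), 0.1) for _ in range(500)]

# Test statistic not used when fitting mu: the sample standard deviation.
predictive_sd = [
    statistics.stdev(simulate(mu, 1.0, 100, rng)) for mu in posterior_mu
]

p = tail_probability(statistics.stdev(observed), predictive_sd)
```

A value of `p` close to 0 or 1 means the observed statistic lies in the tail of what the fitted model can produce, which signals misfit. Here the model with `sigma = 1` cannot reproduce the observed spread, so `p` is extreme.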


Figures

Figure 1
Evolutionary scenarios to evaluate the value of mixing microsatellite with mtDNA and/or nuclear DNA sequence data. The three presented scenarios involve different numbers of divergence and admixture events that occurred at recent to ancient times. Scenario 1 includes six populations: the four that have been sampled (30 diploid individuals per population) and two unsampled parental populations in the admixture events. The two admixed populations are those represented by samples 2 and 3. Scenarios 2 and 3 include five and four populations, respectively. Scenario 2 includes a single admixed population represented by sample 2. Scenario 3 does not include any admixed population. For all scenarios, samples 3 and 4 have been collected 2 and 4 generations earlier than the first two samples, hence their slightly upward locations on the graphs. Time is not to scale. See text of Methods (section "Mixing microsatellite, mtDNA, and/or nuclear DNA sequence data") for details regarding prior distributions of microsatellite and sequence markers.
Figure 2
Evolutionary scenarios to illustrate model checking. The three presented scenarios are often compared when making ABC inferences on the routes of introduction of invasive species. S is the source population in the native area, and U is the unsampled population in the introduced area that is the source of populations 1 and 2 in scenario 3. The stars indicate the bottleneck events occurring in the first few generations following introductions. We here considered that the dates of first observation were well known so that divergence times could be fixed at 5, 10, 15 and 20 generations for t1, t2, t3 and t4, respectively. The data sets consisted of simulated genotypes at 20 (independent) microsatellite loci obtained from a sample of diploid individuals collected from the invasive and source populations (30 individuals per population). The pseudo-observed test data set that we analyzed to illustrate model checking was simulated under scenario 3 with an effective population size (NS) of 10,000 diploid individuals in all populations except during the bottleneck events, which correspond to an effective population size (NFi) of 10 diploid individuals for 5 generations. Prior distributions for ABC analyses (discrimination of scenarios and estimation of posterior distribution of parameters) were as follows: Uniform[1000; 20000] and logUniform[2; 100] for the demographic parameters NS and NFi, respectively, and the same distributions as those given in the text of Methods (section "Mixing microsatellite, mtDNA, and/or nuclear DNA sequence data") for microsatellite markers.
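
The two demographic priors in this legend can be sampled as in the sketch below. The function names are hypothetical, and the sketch assumes that logUniform[a; b] means a value whose logarithm is uniform on [log a, log b]; it says nothing about how DIYABC draws or rounds these values internally.

```python
import math
import random

# Divergence times fixed in the legend (in generations).
FIXED_TIMES = {"t1": 5, "t2": 10, "t3": 15, "t4": 20}


def draw_uniform(low, high, rng):
    """Uniform[low; high]."""
    return rng.uniform(low, high)


def draw_log_uniform(low, high, rng):
    """logUniform[low; high]: uniform on the log scale (assumed definition)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def draw_prior(rng):
    """One joint draw of the demographic parameters NS and NFi."""
    return {
        "NS": draw_uniform(1000, 20000, rng),
        "NFi": draw_log_uniform(2, 100, rng),
        **FIXED_TIMES,
    }


rng = random.Random(42)
draws = [draw_prior(rng) for _ in range(1000)]
```

A log-uniform prior puts as much mass between 2 and about 14 as between about 14 and 100, so small bottleneck sizes are sampled far more often than under a plain uniform prior on the same interval.
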
Figure 3
Confidence in discriminating evolutionary scenarios using microsatellite, mtDNA and/or nuclear DNA sequence data. The three compared scenarios are detailed in Figure 1. Type I error: excluding scenario x when it is the true scenario. Type II error: choosing scenario x when it is not the true scenario. Results are based on 500 simulated data sets per scenario with parameter values drawn from the same distributions as the prior distributions given in the legend of Figure 1.
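
The two error rates defined in this legend can be computed from paired lists of true and chosen scenarios, as in the sketch below. The denominators are an assumption: type I error is taken over data sets simulated under scenario x, and type II error over data sets simulated under any other scenario. The twelve data points are made up for illustration and are not results from the paper.

```python
def error_rates(true_scenarios, chosen_scenarios, x):
    """Return (type I, type II) error rates for scenario x.

    Type I: proportion of data sets simulated under x for which x is not chosen.
    Type II: proportion of data sets simulated under another scenario
    for which x is chosen.
    """
    pairs = list(zip(true_scenarios, chosen_scenarios))
    chosen_under_x = [chosen for true, chosen in pairs if true == x]
    chosen_under_other = [chosen for true, chosen in pairs if true != x]
    type1 = sum(c != x for c in chosen_under_x) / len(chosen_under_x)
    type2 = sum(c == x for c in chosen_under_other) / len(chosen_under_other)
    return type1, type2


# Hypothetical outcome of twelve scenario choices (four data sets per scenario).
true_scenarios = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
chosen_scenarios = [1, 1, 2, 1, 2, 2, 1, 2, 3, 3, 3, 3]

type1, type2 = error_rates(true_scenarios, chosen_scenarios, x=1)
print(type1, type2)  # 0.25 0.125
```
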
Figure 4
Precision in parameter estimation using microsatellite, mtDNA and/or nuclear DNA sequence data under scenario 1. Results are based on 500 pseudo-observed test data sets simulated and estimated under scenario 1 presented in Figure 1, with parameter values drawn from the same distributions as the prior distributions given in the legend of Figure 1. The demographic parameters N, t1, t2, t3, t4, t5, r1 and r2 are detailed in Figure 1. RMAE: relative median absolute errors. The blue columns correspond to the "base-level" RMAE values obtained using only the prior information on parameters (no genetic data).
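
A short sketch of the RMAE computation, assuming it is the median, over test data sets, of the absolute difference between the point estimate and the true value divided by the true value. The function name and the numbers are illustrative, and true values are assumed to be strictly positive.

```python
import statistics


def rmae(estimates, true_values):
    """Relative median absolute error over a collection of test data sets."""
    relative_errors = [
        abs(estimate - true) / true
        for estimate, true in zip(estimates, true_values)
    ]
    return statistics.median(relative_errors)


# Hypothetical point estimates and true values for three test data sets.
true_values = [100, 200, 400]
estimates = [110, 180, 400]

value = rmae(estimates, true_values)  # relative errors are 0.1, 0.1 and 0.0
```

Using the median rather than the mean keeps a few badly estimated data sets from dominating the summary.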
