Review
Gigascience. 2022 Dec 28:12:giad059.
doi: 10.1093/gigascience/giad059. Epub 2023 Aug 23.

GADMA2: more efficient and flexible demographic inference from genetic data

Ekaterina Noskova et al. Gigascience.

Abstract

Background: Inference of complex demographic histories is a source of information about events that happened in the past of studied populations. Existing methods for demographic inference typically require input from the researcher in the form of a parameterized model. With an increased variety of methods and tools, each with its own interface, model specification becomes tedious and error-prone. Moreover, the optimization algorithms used to find model parameters sometimes turn out to be inefficient, for instance, because they are not properly tuned or depend heavily on a user-provided initialization. The open-source software GADMA addresses these problems by providing automatic demographic inference. It offers a common interface for several likelihood engines and provides global parameter optimization based on a genetic algorithm.
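
To make the idea of genetic-algorithm-based parameter optimization concrete, the following is a minimal, generic sketch. It is not GADMA's implementation: the function names, the operators (uniform crossover, single-parameter Gaussian mutation, elitism), the hyperparameter values, and the toy objective standing in for a likelihood engine are all assumptions chosen for illustration.

```python
import random


def genetic_algorithm(objective, bounds, pop_size=20, generations=50,
                      mutation_scale=0.1, n_elite=2, seed=0):
    """Maximize `objective` over box-bounded parameters (illustrative only)."""
    rng = random.Random(seed)

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def mutate(ind):
        # Perturb one randomly chosen parameter and clip it to its bounds.
        child = list(ind)
        i = rng.randrange(len(child))
        lo, hi = bounds[i]
        step = rng.gauss(0, mutation_scale * (hi - lo))
        child[i] = min(hi, max(lo, child[i] + step))
        return child

    def crossover(a, b):
        # Uniform crossover: each parameter comes from one of the two parents.
        return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=objective, reverse=True)
        parents = population[: pop_size // 2]
        next_gen = population[:n_elite]  # elitism: best models survive unchanged
        while len(next_gen) < pop_size:
            a, b = rng.sample(parents, 2)
            next_gen.append(mutate(crossover(a, b)))
        population = next_gen
    return max(population, key=objective)


def toy_log_likelihood(params):
    # Hypothetical stand-in for a likelihood engine, peaked at (2.0, 0.5).
    size, time = params
    return -((size - 2.0) ** 2) - ((time - 0.5) ** 2)


bounds = [(0.01, 10.0), (0.0, 5.0)]  # assumed ranges for (size, time)
best = genetic_algorithm(toy_log_likelihood, bounds)
print(best, toy_log_likelihood(best))
```

Because the population is initialized at random within the bounds, a search of this kind does not need a user-provided starting point, which is the property the abstract contrasts with initialization-dependent optimizers.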

Results: Here, we introduce the new GADMA2 software and provide a detailed description of the added and expanded features. It has a renovated core code base, new likelihood engines, an updated optimization algorithm, and a flexible setup for automatic model construction. We provide a full overview of GADMA2 enhancements, compare the performance of supported likelihood engines on simulated data, and demonstrate an example of GADMA2 usage on 2 empirical datasets.

Conclusions: We demonstrate the better performance of a genetic algorithm in GADMA2 by comparing it to the initial version and other existing optimization approaches. Our experiments on simulated data indicate that GADMA2's likelihood engines are able to provide accurate estimations of demographic parameters even for misspecified models. We improve model parameters for 2 empirical datasets of inbred species.

Keywords: demographic inference; genetic algorithm; hyperparameter optimization; population genetics.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1:
Scheme of GADMA2. New features and enhancements are marked with a gradient gray color. GADMA2 takes input genetic data presented in either AFS or VCF formats, engine name, and model specifications and provides inferred model parameters, visualization, and descriptions of the final demographic history.
Figure 2:
Demographic histories used in data simulations powered by stdpopsim [25] for performance comparison of GADMA2 likelihood engines. (a) History of African (AFR) and European (EUR) populations of Drosophila melanogaster from Li and Stephan [27]. (b) History of Pongo pygmaeus (Bornean) and Pongo abelii (Sumatran) orangutan species from Locke et al. [30].
Figure 3:
Performance comparison of the initial GADMA and GADMA2 with new hyperparameters. Bar size illustrates the average speedup of GADMA2, defined as the fraction of log-likelihood evaluations saved by the new version for each dataset. The blue dashed line shows the average fraction of saved evaluations across all datasets. The bars’ hatching patterns indicate the improvement in the resulting log-likelihood based on median and quartiles. GADMA2 with new hyperparameters attains an average speedup of 10% and provides better results on average compared to the default configuration.
Figure 4:
Example convergence plots for the default genetic algorithm configuration from the initial version of GADMA (red) and the configuration obtained during attempt 2 of hyperparameter optimization with SMAC (green) on 2 datasets: (a) training dataset 2_DivMig_5_Sim and (b) test dataset 3_DivMig_8_Sim. For each configuration, 128 independent optimization runs were performed. Solid lines correspond to median convergence over 128 runs, and shaded areas are ranges between the first (0.25) and third (0.75) quartiles. The vertical dashed black line refers to the number of evaluations used to stop a genetic algorithm in SMAC.
Figure 5:
Demographic histories for Texas and Florida populations of American puma inferred with GADMA2. Figures are generated with the demes package [51]. Time is presented on a log scale.
Figure 6:
Demographic histories for a single population of domesticated cabbage inferred with GADMA2. Figures are generated with the demes package. In both models, the time of the most recent epoch is estimated to be small.
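
The two summary quantities used in the captions of Figures 3 and 4 can be sketched in a few lines. This is an illustration under assumptions, not the authors' evaluation code: the evaluation counts and log-likelihood trajectories below are made-up numbers, the function names are hypothetical, and the exact way evaluation counts were measured in the paper is not stated here.

```python
from statistics import mean, quantiles


def fraction_saved(evals_old, evals_new):
    """Speedup as the fraction of log-likelihood evaluations saved."""
    return 1.0 - evals_new / evals_old


def convergence_band(runs):
    """Per-step (first quartile, median, third quartile) across runs.

    `runs` is a list of equal-length trajectories, each holding the best
    log-likelihood found after every evaluation step of one run.
    """
    band = []
    for values in zip(*runs):  # all runs at the same evaluation step
        q1, med, q3 = quantiles(values, n=4, method="inclusive")
        band.append((q1, med, q3))
    return band


# Made-up evaluation counts for two hypothetical datasets.
old_counts = {"dataset_a": 1000, "dataset_b": 2000}
new_counts = {"dataset_a": 900, "dataset_b": 1800}
per_dataset = {name: fraction_saved(old_counts[name], new_counts[name])
               for name in old_counts}
average_speedup = mean(per_dataset.values())

# Made-up trajectories for five runs over three evaluation steps.
runs = [
    [-10, -5, -2],
    [-12, -6, -3],
    [-11, -4, -1],
    [-9, -7, -2],
    [-13, -5, -2],
]
band = convergence_band(runs)
print(average_speedup, band[0])
```

With these numbers each dataset saves one tenth of its evaluations, so the average speedup is 0.1, and the first step of the band is the quartile triple of the five starting values.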

References

    1. Der Sarkissian C, Ermini L, Schubert M, et al. Evolutionary genomics and conservation of the endangered Przewalski’s horse. Curr Biol. 2015;25(19):2577–83.
    2. Abascal F, Corvelo A, Cruz F, et al. Extreme genomic erosion after recurrent demographic bottlenecks in the highly endangered Iberian lynx. Genome Biol. 2016;17(1):1–19.
    3. Payet SD, Pratchett MS, Saenz-Agudelo P, et al. Demographic histories shape population genomics of the common coral grouper (Plectropomus leopardus). Evol Appl. 2022;15(8):1221–35.
    4. Gutenkunst RN, Hernandez RD, Williamson SH, et al. Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data. PLoS Genet. 2009;5(10):e1000695.
    5. Gronau I, Hubisz MJ, Gulko B, et al. Bayesian inference of ancient human demography from individual genome sequences. Nat Genet. 2011;43(10):1031–4.