Genetics. 2015 Jun;200(2):469-81.
doi: 10.1534/genetics.115.176842. Epub 2015 Apr 7.

Reconstructing Past Admixture Processes from Local Genomic Ancestry Using Wavelet Transformation

Jean Sanderson et al. Genetics. 2015 Jun.

Abstract

Admixture between long-separated populations is a defining feature of the genomes of many species. The mosaic block structure of admixed genomes can provide information about past contact events, including the time and extent of admixture. Here, we describe an improved wavelet-based technique that better characterizes ancestry block structure from observed genomic patterns. Principal components analysis is first applied to genomic data to identify the primary population structure, followed by wavelet decomposition to develop a new characterization of local ancestry information along the chromosomes. For testing purposes, this method is applied to human genome-wide genotype data from Indonesia, as well as virtual genetic data generated using genome-scale sequential coalescent simulations under a wide range of admixture scenarios. Time of admixture is inferred using an approximate Bayesian computation framework, providing robust estimates of both admixture times and their associated levels of uncertainty. Crucially, we demonstrate that this revised wavelet approach, which we have released as the R package adwave, provides improved statistical power over existing wavelet-based techniques and can be used to address a broad range of admixture questions.
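
To illustrate why wavelet variance carries information about ancestry block size, the following is a minimal sketch in Python. It is not the adwave implementation: it uses a plain decimated Haar transform, skips the PCA step and the noise thresholding, and works on a hypothetical toy signal of alternating 0/1 ancestry blocks. The function names `haar_wavelet_variance` and `block_signal` are invented for this example.

```python
import numpy as np

def haar_wavelet_variance(signal, levels):
    """Mean squared Haar detail coefficient per level, from fine to coarse."""
    approx = np.asarray(signal, dtype=float)
    variances = []
    for _ in range(levels):
        if approx.size < 2:
            break
        approx = approx[: approx.size // 2 * 2]  # drop a trailing odd sample
        even, odd = approx[0::2], approx[1::2]
        detail = (even - odd) / np.sqrt(2.0)     # Haar detail coefficients
        approx = (even + odd) / np.sqrt(2.0)     # Haar smooth coefficients
        variances.append(float(np.mean(detail ** 2)))
    return variances

def block_signal(n_snps, block_len):
    """Toy local-ancestry signal (assumption): alternating 0/1 blocks of fixed length."""
    return (np.arange(n_snps) // block_len) % 2

# Short blocks stand in for older admixture, long blocks for recent admixture.
short_blocks = haar_wavelet_variance(block_signal(1024, 8), levels=8)
long_blocks = haar_wavelet_variance(block_signal(1024, 32), levels=8)

# Level (1 = finest) at which the wavelet variance peaks for each signal.
peak_short = int(np.argmax(short_blocks)) + 1
peak_long = int(np.argmax(long_blocks)) + 1
print(peak_short, peak_long)  # prints 4 6
```

In this toy setting the variance peak moves to coarser levels as blocks get longer, which is the qualitative relationship between block structure and wavelet scale that the abstract relies on. Real ancestry signals are noisy and have variable block lengths, so the peak is broader in practice.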

Keywords: admixture; dating; local ancestry; principal component analysis (PCA); wavelets.
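
The approximate Bayesian computation step mentioned in the abstract can be sketched with a simple rejection sampler. This is an assumption-laden toy, not the authors' procedure: the simulator draws block lengths from an exponential distribution with rate equal to the admixture time (a common simplification, not taken from this paper), the summary statistic is the mean block length rather than the paper's wavelet-based metric, and plain rejection replaces the weighted posterior the paper reports. The prior range of 10 to 320 generations mirrors the range of simulated admixture times in the figures. All function names are hypothetical.

```python
import numpy as np

def simulate_mean_block(t_gen, n_blocks, rng):
    """Toy simulator (assumption): block lengths ~ Exponential(rate=t_gen)."""
    return rng.exponential(scale=1.0 / t_gen, size=n_blocks).mean()

def abc_rejection(observed, n_draws, tolerance, rng,
                  t_min=10.0, t_max=320.0, n_blocks=200):
    """Keep prior draws whose simulated summary lies within tolerance of the data."""
    prior_draws = rng.uniform(t_min, t_max, size=n_draws)
    accepted = [
        t for t in prior_draws
        if abs(simulate_mean_block(t, n_blocks, rng) - observed) <= tolerance
    ]
    return np.array(accepted)

rng = np.random.default_rng(0)

# Pretend "observed" data generated at a known admixture time of 150 generations.
observed = simulate_mean_block(150.0, 200, rng)

posterior = abc_rejection(observed, n_draws=20000, tolerance=2e-4, rng=rng)
median_t = float(np.median(posterior))
ci_low, ci_high = (float(x) for x in np.percentile(posterior, [2.5, 97.5]))
print(f"median {median_t:.0f} generations, 95% interval {ci_low:.0f}-{ci_high:.0f}")
```

The accepted draws approximate the posterior distribution of the admixture time, so a median and a 95% credible region can be read off directly, in the same form as the estimate reported for Figure 7. A tighter tolerance gives a sharper approximation at the cost of fewer accepted draws.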

Figures

Figure 1
Simulated example with 13,000 SNPs, 15 diploid individuals in ancestral populations (PA, PB), and 20 diploid individuals in the admixed population (PC). Populations are shown in green (PA), blue (PB), and red (PC). (A) PCA is used to describe the primary population structure; (B) raw wavelet variance for each population illustrates high frequency noise; (C) informative variation in the admixed population after standard correction for noise estimated from the ancestral populations. Note that this example uses the default threshold μ = 1.
Figure 2
Informative wavelet variance for each time of admixture (10–320 generations using default thresholding μ=1). Shaded bars represent the average over 50 simulations at each admixture time; black bars represent the range across individual simulations. The average block size metric for each scenario is indicated by a dotted blue line.
Figure 3
Relationship between proportion of admixture and informative wavelet variance. For this example only, a nondefault threshold of μ = 1.1 was used to account for increased noise in the admixture signals due to low proportions of admixture, as described in the text. The magnitude of the wavelet variance decreases with the admixture proportion, shown as colored bars from black (P = 0.50) to yellow (P = 0.025).
Figure 4
Sensitivity to a range of realistic data limitations. Comparison to reference data (condition 1) simulated with T = 13,000 SNPs, population sizes nA, nB = 15, nC = 15, and ancestral population divergence at TAncestral = 2000 generations. The gray area shows the range of ABS metrics observed under the standard reference condition. (A) Potential sources of error (conditions 2–5); (B) varying SNP densities (conditions 6–8). Note that the decline in absolute values of the ABS metrics in B is expected; it is easily accounted for in an inference setting because the SNP density is always a known variable. Condition descriptions and numeric values are presented in Table 2.
Figure 5
Comparison of StepPCO and adwave, showing the relationship between wavelet transform summaries and time of admixture. (A) adwave using μ = 1; (B) StepPCO using K = 1024, λ = 5, threshold = 0.1, and maxlevel = 6. Numbers indicate the relative standard deviation (RSD, %) for each admixture time. Note the difference in discrimination power between the two methods for older admixture events (95% confidence intervals as dashed blue and green horizontal lines).
Figure 6
PCA of autosomal SNP data from Indonesian populations, with Southern Han Chinese (blue circles) and Papua New Guinea Highlanders (green circles) employed as proxy ancestral populations. Numbers give calculated admixture proportions.
Figure 7
Dating time of admixture for Bena (Flores, eastern Indonesia) using approximate Bayesian computation. (A) Relationship between admixture time and average block size metric for all simulations; (B) weighted posterior distribution of admixture time. Median estimated time of admixture, indicated by the blue line, is 147 generations (95% credible region: 122–178 generations).
Figure 8
Dual admixture events at 160 and 10–80 generations. Gray bars represent the average over 50 simulations for each scenario; black bars represent the range for individual simulations. Blue bars show the average informative wavelet variance for a single admixture event at 160 generations, providing a reference point for comparison.

