Am J Hum Genet. 2023 Feb 2;110(2):326-335.
doi: 10.1016/j.ajhg.2022.12.010. Epub 2023 Jan 6.

Fast, accurate local ancestry inference with FLARE

Sharon R Browning et al. Am J Hum Genet.

Abstract

Local ancestry is the source ancestry at each point in the genome of an admixed individual. Inferred local ancestry is used for admixture mapping and population genetic analyses. We present FLARE (fast local ancestry estimation), a method for local ancestry inference. FLARE achieves high accuracy through the use of an extended Li and Stephens model, and it achieves exceptional computational performance through incorporation of computational techniques developed for genotype imputation. Memory requirements are reduced through on-the-fly compression of reference haplotypes and stored checkpoints. Computation time is reduced through the use of composite reference haplotypes. These techniques allow FLARE to scale to datasets with hundreds of thousands of sequenced individuals and to provide superior accuracy on large-scale data. FLARE is open source and available at https://github.com/browning-lab/flare.

Keywords: admixture; ancestry; local ancestry inference.

Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

Figure 1
Computation time for simulated sequence data with three-way admixture. Wall-clock computation time in hours is shown on the y axis. Reference panel size for each of the three ancestries is shown on the x axis. Each analysis includes 100 admixed individuals, and results are averaged over four replicate simulations. Error bars (±2 standard errors) are shown as gray lines. Analyses of a simulated chromosome modeled on human chromosome 20 were run with 20 compute threads. MOSAIC could not analyze the data with 1,000 or more individuals per reference panel within the available 384 GB of computer memory. RFMix could not analyze the data with 10,000 individuals per reference panel within 48 h.
Figure 2
Accuracy when increasing reference panel size for simulated sequence data with three-way admixture. The y axis shows the squared correlation between true and inferred local ancestry dose, averaged over ancestries and across four replicate simulations (details in subjects and methods). Error bars (±2 standard errors) are shown as gray lines. Each of the three ancestries is represented by a reference panel of the size shown on the x axis, and each analysis includes 100 admixed individuals. MOSAIC could not analyze the data with 1,000 or more individuals per reference panel within the available 384 GB of computer memory. RFMix could not analyze the data with 10,000 individuals per reference panel within the allotted two days.
Figure 3
Accuracy by ancestry for simulated sequence data with four-way admixture. The y axis is the squared correlation between the true and inferred ancestry dose for a single ancestry, averaged across four replicate simulations. Error bars (±2 standard errors) are shown as gray lines. The ancestry is shown on the x axis (AFR is simulated West African, EUR is simulated European, CHB is simulated Han Chinese, JPT is simulated Japanese). The simulated sequence data have 100 admixed individuals and 400 individuals in each of the four reference panels. MOSAIC could not analyze these data within the available 384 GB of computer memory.
Figure 4
Accuracy for simulated sequence data with two-way admixture. The y axis shows the squared correlation between true and inferred local ancestry dose, averaged over ancestries and across four replicate simulations (details in subjects and methods). Error bars (±2 standard errors) are shown as gray lines. The x axis shows the split time for the two ancestral populations. Each of the two ancestries is represented by a reference panel of size 200, and each analysis includes 100 admixed individuals.
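
The accuracy measure used in Figures 2 to 4 (squared correlation between true and inferred local ancestry dose, averaged over ancestries, with error bars of ±2 standard errors across replicates) can be illustrated with a minimal Python sketch. This is not code from FLARE. The function names, the dictionary layout (ancestry label mapped to a flat list of doses, one per individual-marker pair), and the use of the sample standard deviation for the standard error are assumptions made for illustration; the paper's subjects and methods section defines the exact procedure.

```python
from math import sqrt


def squared_correlation(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy * sxy / (sxx * syy)


def mean_r2_over_ancestries(true_dose, inferred_dose):
    """Average the per-ancestry squared correlation over all ancestries.

    Assumed layout (hypothetical): each argument maps an ancestry label to a
    flat list of doses, where a dose is the number of an individual's two
    haplotypes (0, 1, or 2) carrying that ancestry at a marker.
    """
    r2 = [squared_correlation(true_dose[a], inferred_dose[a]) for a in true_dose]
    return sum(r2) / len(r2)


def mean_and_error_bar(replicate_values):
    """Return the mean and the half-width of a ±2 standard error bar."""
    n = len(replicate_values)
    mean = sum(replicate_values) / n
    # Sample variance (n - 1 denominator) is an assumption of this sketch.
    variance = sum((v - mean) ** 2 for v in replicate_values) / (n - 1)
    return mean, 2 * sqrt(variance / n)


if __name__ == "__main__":
    # Toy doses for two ancestries at four individual-marker pairs.
    true_dose = {"A": [0, 1, 2, 1], "B": [2, 1, 0, 1]}
    inferred_dose = {"A": [0, 1, 2, 2], "B": [2, 1, 0, 0]}
    print(mean_r2_over_ancestries(true_dose, inferred_dose))
    # Toy per-replicate accuracies for four replicate simulations.
    print(mean_and_error_bar([0.91, 0.93, 0.92, 0.90]))
```

Calling `mean_r2_over_ancestries` once per replicate and passing the four results to `mean_and_error_bar` gives one plotted point and its gray error bar under these assumptions.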
