Bioinformatics. 2013 Jul 1;29(13):i180-8.
doi: 10.1093/bioinformatics/btt239.

Inference of historical migration rates via haplotype sharing

Pier Francesco Palamara et al. Bioinformatics.

Abstract

Summary: Pairs of individuals from a study cohort will often share long-range haplotypes identical-by-descent. Such haplotypes are transmitted from common ancestors that lived tens to hundreds of generations in the past, and they can now be efficiently detected in high-resolution genomic datasets, providing a novel source of information in several domains of genetic analysis. Recently, haplotype sharing distributions were studied in the context of demographic inference, and they were used to reconstruct recent demographic events in several populations. We here extend the framework to handle demographic models that contain multiple demes interacting through migration. We extensively test our formulation in several demographic scenarios, compare our approach with methods based on ancestry deconvolution and use this method to analyze Masai samples from the HapMap 3 dataset.

Availability: DoRIS, a Java implementation of the proposed method, and its source code are freely available at http://www.cs.columbia.edu/~pier/doris.
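The link between segment length and the age of the common ancestor can be illustrated with a short calculation. The sketch below is not taken from DoRIS; it assumes a standard approximation in which a pair whose common ancestor lived g generations ago is separated by 2g meioses, crossovers occur without interference at a rate of one per Morgan per meiosis, and chromosome ends are ignored. Under those assumptions the segment spanning a fixed site has an Erlang-2 length distribution with rate 2g per Morgan. The function names are hypothetical.

```python
import math


def mean_ibd_length_cm(generations: int) -> float:
    """Mean length (cM) of the IBD segment spanning a fixed site.

    Assumption: 2 * generations meioses separate the two individuals, and the
    distance to the nearest crossover on each side of the site is exponential
    with that rate (per Morgan). The segment is the sum of the two sides.
    """
    meioses = 2 * generations
    mean_morgans = 2.0 / meioses  # mean of an Erlang-2 with rate `meioses`
    return 100.0 * mean_morgans


def prob_ibd_longer_than(generations: int, threshold_cm: float) -> float:
    """P(segment length > threshold) under the same Erlang-2 assumption."""
    rate = 2 * generations
    u = threshold_cm / 100.0  # convert cM to Morgans
    return (1.0 + rate * u) * math.exp(-rate * u)


if __name__ == "__main__":
    for g in (4, 10, 50, 100):
        print(g, mean_ibd_length_cm(g), prob_ibd_longer_than(g, 1.0))
```

Under this approximation the mean length is 100/g cM, so ancestors tens to hundreds of generations back correspond to segments of a few cM down to about 1 cM.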

Figures

Fig. 1.
An IBD segment (blue) is co-inherited by two present-day individuals from a common ancestor that lived four generations in the past. Recombination shortens the IBD segment as meiotic events occur along the lineage between the two individuals.
Fig. 2.
Two demographic models that involve two populations and migration between them. In model (a), the populations have the same constant size Ne and exchange individuals at the same rate m. In model (b), a population of constant ancestral size Natot splits G generations in the past, resulting in two populations whose sizes independently fluctuate from [formula] and [formula] individuals to [formula] and [formula] individuals during G generations. During this period, the populations interact with asymmetric migration rates m12 and m21.
Fig. 3.
True versus inferred parameters for the model in Figure 2a. Estimates were obtained using Equation (10).
Fig. 4.
Inference of recent effective population size using Equation (3), which neglects migration. The ratio between inferred and true population size (y-axis) increases with the migration rate (x-axis), and the inferred size approaches the sum of the two population sizes (twice the true size).
Fig. 5.
Results of the evaluation of our method on synthetic populations with demographic history depicted in the model of Figure 2b. Higher variance in the method’s accuracy is observed because of limited sample sizes and increased population sizes. Higher migration rates further decrease the rate of coalescent events in the recent generations (Fig. 5b), resulting in additional uncertainty. However, no significant bias is observed in the inference.
Fig. 6.
We simulated a chromosome of 150 cM for 600 individuals using the model in Figure 2a, setting population sizes to 4000 and 12 000 diploid individuals, with a migration rate of 0.04. IBD sharing was extracted directly from the simulated genealogy (diamonds), or inferred through GERMLINE using perfectly phased (circles) or computationally phased (triangles) chromosomes.
Fig. 7.
The model used to simulate admixed populations.
Fig. 8.
We created several simulated genotype datasets using the model in Figure 7, varying Gs while keeping [formula], and using constant populations of size 5000 or 10 000 diploid individuals. We inferred the value of m using PCAdmix + Tracts, or GERMLINE + DoRIS, here reported as a function of Gs.
Fig. 9.
We created several datasets using the model in Figure 7, varying Gs from 200 to 6000, and using [formula] with population sizes of 10 000 diploid individuals. We inferred the value of m using PCAdmix + Tracts from phased genotype data.
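The effect described in the Figure 4 caption, where ignoring migration inflates the inferred size toward the sum of the two populations, can be reproduced with a toy calculation for the symmetric model of Figure 2a. The sketch below is an illustrative assumption, not the paper's Equation (3) or Equation (10): it tracks two lineages sampled from the same deme, lets each lineage move to the other deme with probability m per generation, and lets lineages in the same deme coalesce with probability 1/(2·Ne). It then reports the single-population size that would give the same probability of coalescing within a fixed number of recent generations. The function name and the 50-generation horizon are hypothetical choices.

```python
def implied_single_population_size(ne: float, m: float, generations: int = 50) -> float:
    """Size of one isolated population matching the two-deme coalescence probability.

    Assumptions (toy model): two demes of diploid size `ne`, symmetric
    per-lineage migration probability `m`, both lineages sampled in the
    same deme, discrete generations.
    """
    # The pair changes between "same deme" and "different demes" when
    # exactly one of the two lineages migrates.
    switch = 2.0 * m * (1.0 - m)
    p_same, p_diff = 1.0, 0.0  # probabilities over not-yet-coalesced states
    for _ in range(generations):
        # Migration step, going backward in time.
        moved_same = p_same * (1.0 - switch) + p_diff * switch
        moved_diff = p_same * switch + p_diff * (1.0 - switch)
        # Lineages in the same deme escape coalescence with prob 1 - 1/(2 ne).
        p_same = moved_same * (1.0 - 1.0 / (2.0 * ne))
        p_diff = moved_diff
    surviving = p_same + p_diff
    # Per-generation coalescence probability of an isolated population
    # with the same overall survival probability.
    per_generation = 1.0 - surviving ** (1.0 / generations)
    return 1.0 / (2.0 * per_generation)


if __name__ == "__main__":
    for m in (0.0, 0.001, 0.01, 0.1, 0.5):
        print(m, implied_single_population_size(1000, m) / 1000)
```

In this toy model the ratio is 1 when m = 0 and 2 when m = 0.5, up to floating-point rounding, which matches the qualitative trend the caption describes.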
