Am J Hum Genet. 2004 May;74(5):965-78.
doi: 10.1086/420855. Epub 2004 Apr 14.

Design and analysis of admixture mapping studies

C J Hoggart et al. Am J Hum Genet. 2004 May.

Abstract

Admixture between populations originating on different continents can be exploited to detect disease susceptibility loci at which risk alleles are distributed differentially between these populations. We first examine the statistical power and mapping resolution of this approach in the limiting situation in which gamete admixture and locus ancestry are measured without uncertainty. We show that, for a rare disease, the most efficient design is to study affected individuals only. In a typical African American population (two-way admixture proportions 0.8/0.2, ancestry crossover rate 2 per 100 cM), a study of 800 affected individuals has 90% power to detect at P values <10⁻⁵ a locus that generates a risk ratio of 2 between populations, with an expected mapping resolution (size of 95% confidence region for the position of the locus) of 4 cM. In practice, inferring locus ancestry from marker data requires computationally intensive Bayesian methods, as implemented in the program ADMIXMAP. Affected-only study designs require strong prior information on the frequencies of each allele given locus ancestry. We show how data from unadmixed and admixed populations can be combined to estimate these ancestry-specific allele frequencies within the admixed population under study, allowing for variation between allele frequencies in unadmixed and admixed populations. Using simulated data based on the genetic structure of the African American population, we show that 60% of information can be extracted in a test for linkage using markers with an ancestry information content of 36% at 3-cM spacing. As in classic linkage studies, the most efficient strategy is to use markers at a moderate density for an initial genome search and then to saturate regions of putative linkage with additional markers, to extract nearly all information about locus ancestry.
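
The power figure quoted in the abstract can be approximated with a short calculation. The sketch below is not taken from the paper: it assumes a standard normal (score-test) approximation, a multiplicative risk model in which each gamete of high-risk ancestry multiplies risk by the square root of the population risk ratio, locus ancestry observed without error, identical admixture proportions in all individuals, and a one-sided significance threshold. The function name `affected_only_power` and its parameters are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def affected_only_power(n_affected, admixture, risk_ratio, alpha):
    """Approximate power of a one-sided affected-only ancestry test.

    Assumptions (illustrative, not the authors' derivation):
    - risk is multiplicative across the two gametes, so the log risk
      ratio per gamete is half the log risk ratio between populations;
    - locus ancestry is observed without error;
    - every individual has the same admixture proportion.
    """
    std_normal = NormalDist()
    n_gametes = 2 * n_affected
    log_rr_per_gamete = 0.5 * log(risk_ratio)
    # Binomial information about the log risk ratio, summed over gametes.
    information = n_gametes * admixture * (1.0 - admixture)
    noncentrality = log_rr_per_gamete * sqrt(information)
    z_alpha = std_normal.inv_cdf(1.0 - alpha)
    return std_normal.cdf(noncentrality - z_alpha)


# Values from the abstract: 800 affected individuals, admixture 0.8,
# risk ratio 2 between populations, threshold P < 1e-5.
power = affected_only_power(800, 0.8, 2.0, 1e-5)
print(round(power, 2))
```

Under these assumptions the result is close to 0.90, in line with the power reported in the abstract. The approximation ignores uncertainty in inferred ancestry, which is why the abstract reports that only part of the information is recovered at realistic marker densities.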

Figures

Figure 1. Median and upper 95th percentile of the size of the 95% confidence region for the position of a disease locus plotted against the risk ratio for a fixed sample size of 800 individuals with admixture proportion 0.8 from the high-risk population and τ=6.
Figure 2. Plots of [formula] P values for simulated data from a chromosome of length 100 cM with disease locus responsible for a risk ratio of 2 at 50 cM. Solid line, markers spaced every 1 cM; dotted line, markers spaced every 3 cM; dashed line, markers randomly spaced with an average spacing of 3 cM.
Figure 3. Information content map for simulated data from a chromosome of length 100 cM without a disease locus. Solid line, markers spaced every 1 cM; dotted line, markers spaced every 3 cM; dashed line, markers randomly spaced with an average spacing of 3 cM.
Figure 4. Exclusion map for simulated data from a chromosome of length 100 cM with disease locus responsible for a risk ratio of 2 at 50 cM. Solid line, markers spaced every 1 cM; dotted line, markers spaced every 3 cM; dashed line, markers randomly spaced with an average spacing of 3 cM.
Figure 5. Exclusion map for simulated data from a chromosome of length 100 cM without a disease locus. Solid line, markers spaced every 1 cM; dotted line, markers spaced every 3 cM; dashed line, markers randomly spaced with an average spacing of 3 cM.
Figure 6. Plot of P values obtained in a test for misspecified African allele frequencies for the model specified with frequency estimates from unadmixed West African populations (horizontal axis) and the model specified with frequency estimates obtained by combining data from unadmixed and admixed populations (Washington, DC and Philadelphia) in a dispersion model (vertical axis). Loci for which the misspecification test is significant at P values <.01 for allele frequencies specified by the first model are shown as blackened squares.

References

Electronic-Database Information

    1. Genetic Epidemiology Group, London School of Hygiene & Tropical Medicine, http://www.lshtm.ac.uk/eu/genetics/index.html (for the ADMIXMAP program)

