Review
Nat Rev Genet. 2010 May;11(5):356-66.
doi: 10.1038/nrg2760.

Genome-wide association studies in diverse populations

Noah A Rosenberg et al. Nat Rev Genet. 2010 May.

Abstract

Genome-wide association (GWA) studies have identified a large number of SNPs associated with disease phenotypes. As most GWA studies have been performed in populations of European descent, this Review examines the issues involved in extending the consideration of GWA studies to diverse worldwide populations. Although challenges exist with issues such as imputation, admixture and replication, investigation of a greater diversity of populations could make substantial contributions to the goal of mapping the genetic determinants of complex diseases for the human population as a whole.

Figures

Figure 1. Differences in “mappability” of a risk variant between two populations with different LD patterns
A disease mutation (orange) occurs on an ancestral chromosome that contains several marker alleles (green, purple, blue, yellow). Over time, recombination events (diamonds) break down the correlations between the disease mutation and the marker alleles. However, the recombination history differs for populations 1 and 2, separated by a barrier to gene flow (brown line). Consequently, if the purple or blue allele were examined in population 1, then a disease association might be found, but it might not be found in population 2. A similar situation applies for the yellow allele, with the roles of the populations reversed. The figure and caption are modified from Rosenberg and VanLiere.
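The breakdown of marker-mutation correlations described in this caption can be illustrated with the standard two-locus decay relation. This is only a sketch under assumptions that the caption does not state: random mating, a constant recombination fraction c between the marker and the disease site, and no selection, drift or gene flow. Here D_0 is the linkage disequilibrium coefficient when the mutation arises and D_t is its value t generations later; all three symbols are introduced for illustration.

$$D_t = (1 - c)^t \, D_0$$

Under this simplified model, a marker that has experienced fewer recombination events with the disease site in one population keeps a larger D_t there, which is one way to read why the purple or blue allele might show an association in population 1 but not in population 2.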
Figure 2. Effect of frequency in Europe on the occurrence of an allele in other regions
The figure illustrates that alleles that are more common in one group, in this case Europeans, are more likely to be present in other groups. It also shows that populations geographically closer to Europe, such as those of the Middle East, tend to share more alleles with Europeans than more distant populations, such as those of Oceania. The figure is based on the SNP data underlying Figure S21 of Jakobsson et al., which uses 512,762 autosomal SNPs in indigenous populations from the Human Genome Diversity Panel and standardizes sample sizes across groups by evaluating allele frequencies in samples of size 40.
Figure 3. Excess SNP variability in Europeans resulting from ascertainment bias
The y-axis depicts mean heterozygosity across loci in 443 individuals from 29 populations, on the basis of 512,762 autosomal SNPs from an Illumina genotyping panel. The x-axis depicts mean heterozygosity in the same individuals, on the basis of 783 autosomal microsatellite markers. Because individual microsatellites, unlike SNPs, are highly variable, microsatellite ascertainment is less dependent on the initial ascertainment sample than is SNP ascertainment. Thus, the imperfect correlation of SNP heterozygosity with microsatellite heterozygosity might reflect ascertainment bias in the SNP set. This figure is similar to Figure 3 of Conrad et al.
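As a rough illustration of the quantity on both axes, the sketch below computes expected heterozygosity at a locus as 1 minus the sum of squared allele frequencies and then averages it across loci. The function names and the toy allele frequencies are hypothetical, and the caption does not say which estimator the authors used (for example, whether a small-sample correction was applied), so this should be read as one common definition rather than the paper's method.

```python
from typing import Sequence


def expected_heterozygosity(allele_freqs: Sequence[float]) -> float:
    """Expected heterozygosity at one locus: 1 - sum of squared allele frequencies."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def mean_heterozygosity(loci: Sequence[Sequence[float]]) -> float:
    """Average the per-locus expected heterozygosity across loci."""
    if not loci:
        raise ValueError("at least one locus is required")
    return sum(expected_heterozygosity(freqs) for freqs in loci) / len(loci)


# Toy data (made up): two biallelic SNPs and one four-allele microsatellite.
snp_loci = [[0.5, 0.5], [0.9, 0.1]]
microsat_loci = [[0.25, 0.25, 0.25, 0.25]]

snp_mean = mean_heterozygosity(snp_loci)
microsat_mean = mean_heterozygosity(microsat_loci)
```

A biallelic SNP can reach at most 0.5 under this definition, whereas a multi-allelic microsatellite can go higher, which is consistent with the caption's remark that individual microsatellites are highly variable.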
Figure 4. Genotype imputation accuracy in 29 populations, with and without external reference panels
Imputation accuracy is plotted as a function of LD, measured by mean r² at a distance of 10 kb in a genome-wide dataset. Genotypes in a genome-wide study are hidden and then imputed, with two different designs. In the shaded region, genotypes in each population are imputed without an external reference panel, so that the information for imputing “missing” genotypes comes from other individuals in the population. In the unshaded region, genotypes in the population are imputed using an external reference panel, chosen optimally among 36 mixtures of the HapMap CEU (European American), CHB+JPT (Chinese and Japanese), and YRI (Yoruba) panels. Color coding for populations follows that of Fig. 3. The regression lines exclude the African populations, and they have coefficients of determination 0.003 (external reference) and 0.953 (internal reference). The figure shows that imputation accuracy based on an internal reference is highly correlated with LD. However, imputation accuracy based on an external reference is not correlated with LD (and instead depends on the composition of the particular reference panels available). The figure is based on the data in scenarios 1, 3, and 6 in Table 1 of Huang et al.
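Two statistics carry this caption: the pairwise LD measure r² and the coefficient of determination of a regression line. The sketch below shows one textbook way to compute each, assuming phased haplotypes coded as 0/1 at two biallelic sites and a simple one-predictor linear regression. The function names and toy inputs are hypothetical, and the caption does not describe how the authors computed either quantity.

```python
from typing import Sequence, Tuple


def ld_r2(haplotypes: Sequence[Tuple[int, int]]) -> float:
    """LD r^2 between two biallelic sites from phased 0/1 haplotypes."""
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(a for a, _ in haplotypes) / n  # frequency of allele 1 at site A
    p_b = sum(b for _, b in haplotypes) / n  # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b  # LD coefficient D
    return d * d / denom


def coefficient_of_determination(x: Sequence[float], y: Sequence[float]) -> float:
    """R^2 of a simple linear regression of y on x (squared Pearson correlation)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    return sxy * sxy / (sxx * syy)


# Toy inputs (made up): complete LD versus no LD between two sites.
complete_ld = [(1, 1), (1, 1), (0, 0), (0, 0)]
no_ld = [(1, 1), (1, 0), (0, 1), (0, 0)]
```

In the figure's terms, a value near 0.953 would mean that LD explains almost all of the variation in accuracy across populations, while a value near 0.003 would mean it explains almost none.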
Figure 5. Imputation in admixed populations
Admixture segments are estimated in each individual sampled from a GWA study. Consider reference haplotypes from two separate panels (red and blue boxes). Separately for each admixture segment of a haplotype, alleles are imputed using reference haplotypes from the same population as the inferred source. Within a source population, a haplotype might have alleles imputed from multiple reference haplotypes, as depicted on the left with both haplotypes from the same (blue) source population serving as imputation templates. If admixture estimates for a segment are uncertain, then conditional imputations at a site given each of the possible source populations for the segment can be weighted by the probabilities of those sources.
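The weighting described in the last sentence of this caption can be sketched as a mixture over possible source populations: the probability of an allele at a site is the sum, over sources, of the source probability times the conditional imputed probability given that source. The function name, the dictionary layout and the example numbers below are all hypothetical; the caption does not specify an implementation.

```python
from typing import Mapping


def weighted_allele_probability(
    source_probs: Mapping[str, float],
    conditional_probs: Mapping[str, float],
) -> float:
    """Combine per-source imputed allele probabilities, weighted by source probability.

    source_probs: probability that the admixture segment derives from each source.
    conditional_probs: imputed probability of the allele given each source.
    """
    if abs(sum(source_probs.values()) - 1.0) > 1e-9:
        raise ValueError("source probabilities must sum to 1")
    missing = set(source_probs) - set(conditional_probs)
    if missing:
        raise KeyError(f"no conditional imputation for sources: {sorted(missing)}")
    return sum(w * conditional_probs[s] for s, w in source_probs.items())


# Made-up example: the segment is probably from the "blue" source population.
segment_sources = {"blue": 0.8, "red": 0.2}
imputed_given_source = {"blue": 0.9, "red": 0.3}

allele_prob = weighted_allele_probability(segment_sources, imputed_given_source)
```

When the admixture estimate for a segment is certain, one source has probability 1 and the result reduces to the single-source imputation described earlier in the caption.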
