PLoS One. 2010 Sep 8;5(9):e12600.
doi: 10.1371/journal.pone.0012600.

Multiethnic genetic association studies improve power for locus discovery

Sara L Pulit et al. PLoS One.

Abstract

To date, genome-wide association studies have focused almost exclusively on populations of European ancestry. These studies continue with the advent of next-generation sequencing, designed to systematically catalog and test low-frequency variation for a role in disease. A complementary approach would be to focus further efforts on cohorts of multiple ethnicities. This leverages the idea that population genetic drift may have elevated some variants to higher allele frequency in different populations, boosting statistical power to detect an association. Based on empirical allele frequency distributions from eleven populations represented in HapMap Phase 3 and the 1000 Genomes Project, we simulate a range of genetic models to quantify the power of association studies in multiple ethnicities relative to studies that exclusively focus on samples of European ancestry. In each of these simulations, a first phase of GWAS in exclusively European samples is followed by a second GWAS phase in any of the other populations (including a multiethnic design). We find that nontrivial power gains can be achieved by conducting future whole-genome studies in worldwide populations, where, in particular, African populations contribute the largest relative power gains for low-frequency alleles (<5%) of moderate effect that suffer from low power in samples of European descent. Our results emphasize the importance of broadening genetic studies to worldwide populations to ensure efficient discovery of genetic loci contributing to phenotypic trait variability, especially for those traits for which large numbers of samples of European ancestry have already been collected and tested.
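The dependence of power on allele frequency that motivates this design can be illustrated with a short calculation. The sketch below is not the simulation procedure used in the paper; it is a textbook normal approximation for a two-sided allelic case-control test, assuming a multiplicative genotype relative risk (GRR), a rare disease (so control allele frequencies match the population), and a genome-wide significance threshold of 5e-8. The function names, the equal split of cases and controls, and the threshold are all illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def case_allele_freq(p, grr):
    """Risk-allele frequency in cases under a multiplicative model.

    Assumes a rare disease, so the control frequency p approximates
    the population frequency.
    """
    return p * grr / (p * grr + (1.0 - p))


def allelic_test_power(p, grr, n_cases, n_controls, alpha=5e-8):
    """Approximate power of a two-sided allelic test (normal approximation)."""
    q = case_allele_freq(p, grr)
    # Each individual contributes two alleles, hence the factor 2.
    se = sqrt(q * (1 - q) / (2 * n_cases) + p * (1 - p) / (2 * n_controls))
    ncp = (q - p) / se  # expected z-score under the alternative
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(ncp - z_crit) + STD_NORMAL.cdf(-ncp - z_crit)


if __name__ == "__main__":
    # Hypothetical setting: 40,000 cases and 40,000 controls, GRR of 1.2.
    for p in (0.01, 0.02, 0.05, 0.10, 0.20):
        power = allelic_test_power(p, grr=1.2, n_cases=40_000, n_controls=40_000)
        print(f"allele frequency {p:.2f}: power {power:.3f}")
```

Under these assumptions, power rises steeply as the risk allele becomes more common at a fixed GRR, which is why a variant that drift has raised to a higher frequency in another population can be easier to detect there.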

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Power to detect association for lower-frequency alleles (≤5%) in CEU based on HapMap 3 data.
Power is given for various individual population panels (CEU, TSI, YRI, MKK, LWK, and ASW), a panel with major continental representation (CEU+CHB+YRI) and a cosmopolitan panel with major continental representation and admixed populations (GIH, MXL, and ASW) interrogated in phase 2, aggregated over those alleles that have lower frequency (1–5%) in CEU.
Figure 2. Power as a function of allele frequency (≤5%) in CEU based on HapMap 3 data.
For a sample size of 80,000 and a modest effect size (GRR of 1.1 and 1.2), power is given for CEU, CHB, YRI, and two multiethnic panels (“major continental”, CEU+CHB+YRI, and “cosmopolitan”, CEU+CHB+YRI+ASW+GIH+MXL) in phase 2. Including non-European samples in phase 2 improves power to detect an association for alleles that have lower frequency in CEU.
Figure 3. Power to detect association for lower-frequency alleles (≤5%) in CEU using 1000 Genomes Project data.
Power is given for three individual panels (CEU, CHB+JPT, YRI), and a multiethnic panel (CEU+CHB+JPT+YRI) in phase 2, aggregated over those alleles that have lower frequency (1–5%) in CEU.
Figure 4. Power as a function of allele frequency (≤5%) in CEU using 1000 Genomes Project data.
For a sample size of 80,000 and modest effect size (GRR of 1.1 and 1.2), power is given for three individual panels (CEU, CHB+JPT, YRI) and a multiethnic panel in phase 2. Including non-European samples in phase 2 improves power to detect an association for alleles that have lower frequency in CEU.
Figure 5. The relationship between allele frequency differences between CEU and YRI and power.
We plot the histogram of all SNPs in the 1000 Genomes Project data as a function of the allele frequency difference between CEU and YRI (excluding SNPs monomorphic in both CEU and YRI). The histogram is colour-coded by the estimated change in power from performing phase 2 in YRI instead of CEU, assuming a total sample size of 20,000 (10,000 in CEU in phase 1, and 10,000 in YRI in phase 2) and a GRR of 1.2. Allele frequency differences from +15% to +40% in YRI result in a positive gain in power (in red), which is compensated by SNPs that are common in CEU (in blue). We divide the histogram into 4 categories: (1) SNPs with at least 80% power in both scenarios (CEU in phase 2 or YRI in phase 2) (65.6% of all SNPs considered), (2) SNPs with at least 80% power to detect an association only in the European GWAS (CEU in phase 2) (6.5% of all SNPs considered), (3) SNPs with at least 80% power only in the African GWAS (YRI in phase 2) (9.3%), and (4) SNPs that do not reach 80% power in either of these two scenarios (18.6%). As alleles of higher frequency in CEU are mostly saturated for power, including additional European samples in GWAS will only marginally increase power, whereas alleles of lower frequency in CEU may substantially benefit in terms of power from elevated frequencies in African populations.
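The comparison in Figure 5 (phase 2 in CEU versus phase 2 in YRI) can be approximated with a small two-phase calculation. The sketch below is an illustration under stated assumptions rather than the authors' method: it assumes each phase is analysed on the log odds ratio scale with a Woolf-type standard error, that the allelic odds ratio equals the GRR in every population (multiplicative model, rare disease), and that the two phases are combined by fixed-effects inverse-variance meta-analysis, in which case the squared non-centrality parameters add. The allele frequencies, the even case-control split, and all function names are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def phase_ncp_squared(p, grr, n_cases, n_controls):
    """Squared non-centrality of the log-odds-ratio test in one phase."""
    q = p * grr / (p * grr + (1.0 - p))  # case allele frequency
    # Woolf variance of the log odds ratio from expected allele counts.
    var = (1 / (2 * n_cases * q) + 1 / (2 * n_cases * (1 - q))
           + 1 / (2 * n_controls * p) + 1 / (2 * n_controls * (1 - p)))
    return log(grr) ** 2 / var


def two_phase_power(phases, grr, alpha=5e-8):
    """Power of a fixed-effects meta-analysis over several phases.

    phases: iterable of (allele_frequency, n_cases, n_controls) tuples.
    """
    ncp = sqrt(sum(phase_ncp_squared(p, grr, n_ca, n_co)
                   for p, n_ca, n_co in phases))
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(ncp - z_crit) + STD_NORMAL.cdf(-ncp - z_crit)


if __name__ == "__main__":
    # Hypothetical allele: 2% in CEU, 20% in YRI; 10,000 samples per phase.
    ceu = (0.02, 5_000, 5_000)
    yri = (0.20, 5_000, 5_000)
    print("phase 2 in CEU:", round(two_phase_power([ceu, ceu], grr=1.2), 4))
    print("phase 2 in YRI:", round(two_phase_power([ceu, yri], grr=1.2), 4))
```

Classifying each SNP by whether either scenario reaches 80% power would then reproduce the kind of four-way split described in the caption, although the actual percentages depend on the empirical frequency data and on the authors' simulation details.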
