Multicenter Study
PLoS One. 2008 Jan 2;3(1):e1382.
doi: 10.1371/journal.pone.0001382.

Correction of population stratification in large multi-ethnic association studies

David Serre et al. PLoS One. 2008.

Abstract

Background: The vast majority of genetic risk factors for complex diseases have, taken individually, a small effect on the end phenotype. Population-based association studies therefore need very large sample sizes to detect significant differences between affected and non-affected individuals. Including thousands of affected individuals in a study requires recruitment in numerous centers, possibly from different geographic regions. Unfortunately such a recruitment strategy is likely to complicate the study design and to generate concerns regarding population stratification.

Methodology/principal findings: We analyzed 9,751 individuals representing three main ethnic groups - Europeans, Arabs and South Asians - that had been enrolled from 154 centers involving 52 countries for a global case/control study of acute myocardial infarction. All individuals were genotyped at 103 candidate genes using 1,536 SNPs selected with a tagging strategy that captures most of the genetic diversity in different populations. We show that relying solely on self-reported ethnicity is not sufficient to exclude population stratification and we present additional methods to identify and correct for stratification.

Conclusions/significance: Our results highlight the importance of carefully addressing population stratification and of carefully "cleaning" the sample prior to analyses to obtain stronger signals of association and to avoid spurious results.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Distribution of pair-wise allele sharing among the INTERHEART European individuals.
The graph shows the QQ plot of the distribution of all pair-wise measures of allele sharing against a normal distribution (the red line displays the expectation). The green line shows the empirical cut-off used to identify related individuals (corresponding to an allele sharing larger than 83%). The deviation on the left-hand side of the graph (i.e. low allele sharing) corresponds to pairs of individuals originating from different sub-populations.
Figure 2. Genetic clustering of the INTERHEART individuals inferred by STRUCTURE.
“European” (blue dots), “Arabs” (green dots) and “South Asian” (pink dots) individuals are displayed according to their coefficients of ancestry in three populations (K = 3) as estimated by STRUCTURE using 127 SNPs. The coefficients of ancestry, displayed separately for each population sample, were inferred from a single analysis (i.e. all individuals combined) and are represented using the same axes. See also Supplemental Figure S2 for the distribution of the coefficients of ancestry.
Figure 3. Distribution of the p-values of the associations between genotypes at 1,453 SNPs and ApoB level in South-Asians.
The plot shows the observed distribution of the p-values (y-axis) against the expectation under a model without any association (grey crosses and x-axis). Both axes are on a logarithmic scale. Red crosses correspond to the association between ApoB and the genotypes at one SNP without any correction. Blue crosses correspond to the same tests with recruitment centers included as additional covariates.
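
The two quantities that the captions of Figures 1 and 3 rely on, pair-wise allele sharing and the observed-versus-expected p-value comparison, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The genotype coding (0, 1 or 2 copies of one allele, `None` for missing), the identity-by-state definition of allele sharing, the uniform expectation `(rank - 0.5) / n` and all function names are assumptions made for illustration. Only the 83% cut-off and the logarithmic axes come from the captions.

```python
import math
from itertools import combinations


def allele_sharing(g1, g2):
    """Fraction of alleles two individuals share identical-by-state.

    Assumed coding: each genotype is 0, 1 or 2 copies of one allele,
    None marks a missing call. SNPs missing in either individual are skipped.
    """
    shared = 0
    total = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue
        shared += 2 - abs(a - b)  # 2, 1 or 0 alleles in common at this SNP
        total += 2
    if total == 0:
        raise ValueError("no SNP genotyped in both individuals")
    return shared / total


def related_pairs(genotypes, cutoff=0.83):
    """Return the pairs of individual IDs whose allele sharing exceeds cutoff.

    `genotypes` maps an individual ID to its list of genotype codes.
    The default cutoff mirrors the 83% threshold in the Figure 1 caption.
    """
    return [
        (id1, id2)
        for (id1, g1), (id2, g2) in combinations(genotypes.items(), 2)
        if allele_sharing(g1, g2) > cutoff
    ]


def qq_points(p_values):
    """Pair each observed p-value with its expectation under no association.

    Returns (expected, observed) tuples on a -log10 scale, sorted from the
    smallest p-value upward. P-values must be strictly positive.
    """
    n = len(p_values)
    observed = sorted(p_values)
    expected = [(rank - 0.5) / n for rank in range(1, n + 1)]
    return [(-math.log10(e), -math.log10(o)) for e, o in zip(expected, observed)]
```

Under these assumptions, running `qq_points` once on uncorrected p-values and once on p-values from tests that include recruitment center as a covariate would give the two series that Figure 3 contrasts. Points lying well above the diagonal across most of the range would suggest inflation consistent with stratification.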
