Population Stratification in the Context of Diverse Epidemiologic Surveys Sans Genome-Wide Data
- PMID: 27200085
- PMCID: PMC4858524
- DOI: 10.3389/fgene.2016.00076
Abstract
Population stratification, or confounding by genetic ancestry, is a potential cause of false associations in genetic association studies. Estimation of and adjustment for genetic ancestry has become common practice, thanks in part to the availability of ancestry informative markers on genome-wide association study (GWAS) arrays. Although array data are now widespread, they are not ubiquitous: several large epidemiologic and clinic-based studies lack genome-wide data. One such study lacking genome-wide data accessible to investigators is the National Health and Nutrition Examination Surveys (NHANES), population-based cross-sectional surveys of Americans conducted by the Centers for Disease Control and Prevention and linked to demographic, health, and lifestyle data. DNA samples (n = 14,998) were extracted from biospecimens from consented NHANES participants collected in 1991-1994 (NHANES III, phase 2) and 1999-2002 and represent three major self-identified racial/ethnic groups: non-Hispanic whites (n = 6,634), non-Hispanic blacks (n = 3,458), and Mexican Americans (n = 3,950). As the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) study, we genotyped candidate gene and GWAS-identified index variants in NHANES as part of the larger Population Architecture using Genomics and Epidemiology I study for collaborative genetic association studies. To enable basic quality control in NHANES sans genome-wide data, such as estimation of genetic ancestry to control for population stratification, we outline here strategies that use limited genetic data to identify the markers optimal for characterizing genetic ancestry. From among 411 and 295 autosomal SNPs available in NHANES III and NHANES 1999-2002, respectively, we demonstrate that markers with ancestry information can be identified to estimate global ancestry. Despite limited resolution, global genetic ancestry is highly correlated with self-identified race for the majority of participants, although less so for ethnicity. Overall, the strategies outlined here for a large epidemiologic study can be applied to other datasets that are accessible for genotype-phenotype studies but lack genome-wide data.
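The abstract does not spell out how the ancestry informative markers were chosen. As an illustration only, one common way to screen a small SNP panel is to rank markers by the absolute allele frequency difference (delta) between two reference populations. The sketch below assumes genotypes coded as 0/1/2 copies of a counted allele; the SNP identifiers, genotype values, and function names are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Optional, Tuple

Genotypes = List[Optional[int]]  # 0/1/2 copies of the counted allele; None = missing


def allele_freq(genotypes: Genotypes) -> float:
    """Frequency of the counted allele among called (non-missing) genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes for this SNP")
    return sum(called) / (2 * len(called))


def rank_by_delta(
    pop_a: Dict[str, Genotypes], pop_b: Dict[str, Genotypes]
) -> List[Tuple[str, float]]:
    """Rank SNPs shared by both populations by |p_a - p_b|, largest first."""
    shared = set(pop_a) & set(pop_b)
    deltas = [
        (snp, abs(allele_freq(pop_a[snp]) - allele_freq(pop_b[snp])))
        for snp in shared
    ]
    # Sort by descending delta; break ties by SNP name for a stable order.
    return sorted(deltas, key=lambda item: (-item[1], item[0]))


# Hypothetical toy data: three SNPs, four individuals per reference population.
pop_a = {"snp1": [2, 2, 1, 2], "snp2": [1, 1, 0, 1], "snp3": [0, 1, 2, 1]}
pop_b = {"snp1": [0, 0, 1, 0], "snp2": [1, 0, 1, 1], "snp3": [1, 1, 1, 1]}

ranked = rank_by_delta(pop_a, pop_b)
for snp, delta in ranked:
    print(f"{snp}\tdelta={delta:.3f}")
```

In this toy example only `snp1` differs in frequency between the two groups, so it ranks first. A real analysis would use much larger reference samples and would likely weigh other criteria as well, such as linkage disequilibrium between markers.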
Keywords: EAGLE; NHANES; cross-sectional; epidemiology; genetic epidemiology; global genetic ancestry; population stratification.