Cryptic relatedness in epidemiologic collections accessed for genetic association studies: experiences from the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) study and the National Health and Nutrition Examination Surveys (NHANES)
- PMID: 26579192
- PMCID: PMC4620157
- DOI: 10.3389/fgene.2015.00317
Abstract
Epidemiologic collections have been a major resource for genotype-phenotype studies of complex disease given their large sample size, racial/ethnic diversity, and breadth and depth of phenotypes, traits, and exposures. A major disadvantage of these collections is that they often survey households and communities without collecting extensive pedigree data. Failure to account for substantial relatedness can lead to inflated estimates and spurious associations. To examine the extent of cryptic relatedness in an epidemiologic collection, we as the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) study accessed the National Health and Nutrition Examination Surveys (NHANES) linked to DNA samples ("Genetic NHANES") from NHANES III and NHANES 1999-2002. NHANES are population-based cross-sectional surveys conducted by the National Center for Health Statistics at the Centers for Disease Control and Prevention. Genome-wide genetic data are not yet available in NHANES, and current data use agreements prohibit the generation of GWAS-level data in NHANES samples due to issues in maintaining confidentiality, among other ethical concerns. To date, only hundreds of single nucleotide polymorphisms (SNPs) genotyped in a variety of candidate genes are available for analysis in NHANES. We performed identity-by-descent (IBD) estimates in three self-identified subpopulations of Genetic NHANES (non-Hispanic white, non-Hispanic black, and Mexican American) using PLINK software to identify potential familial relationships among presumed unrelated subjects. We then compared the PLINK-identified relationships to those identified by an alternative method implemented in Kinship-based INference for Genome-wide association studies (KING). Overall, both methods identified familial relationships in NHANES III and NHANES 1999-2002 for all three subpopulations, but little concordance was observed between the two methods, due in large part to the limited SNP data available in Genetic NHANES. Despite the lack of genome-wide data, our results suggest the presence of cryptic relatedness in this epidemiologic collection and highlight the limitations of restricted datasets such as NHANES in the context of modern-day genetic epidemiology studies.
Keywords: EAGLE; NHANES; cross-sectional; cryptic relatedness; epidemiology; genetic association study; genetic epidemiology.
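The comparison of PLINK- and KING-identified relationships described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes that each pair of subjects already has a PLINK proportion-IBD estimate (PI_HAT) and a KING kinship coefficient, that PI_HAT is roughly twice the kinship coefficient for non-inbred pairs, and that the kinship cutoffs are the conventional ones given in the KING documentation. The function names, the `pairs` data, and the definition of concordance used here are hypothetical and serve only as an illustration.

```python
from typing import Dict, Optional, Tuple

# Conventional kinship-coefficient cutoffs from the KING documentation
# (lower bound of each class, checked from closest to most distant).
KING_CUTOFFS = [
    (0.354, "duplicate/MZ twin"),
    (0.177, "first-degree"),
    (0.0884, "second-degree"),
    (0.0442, "third-degree"),
]


def classify_kinship(phi: float) -> str:
    """Map a kinship coefficient to a relationship class."""
    for cutoff, label in KING_CUTOFFS:
        if phi > cutoff:
            return label
    return "unrelated"


def classify_pi_hat(pi_hat: float) -> str:
    """Classify a PLINK PI_HAT value.

    PI_HAT = P(IBD=2) + 0.5 * P(IBD=1), which equals twice the kinship
    coefficient for non-inbred pairs, so it is halved before classification.
    """
    return classify_kinship(pi_hat / 2.0)


def concordance(
    pairs: Dict[Tuple[str, str], Tuple[float, float]]
) -> Optional[float]:
    """Fraction of pairs flagged as related by either method on which both
    methods assign the same class (an assumed definition for this sketch).
    Returns None when neither method flags any pair."""
    flagged = 0
    agree = 0
    for pi_hat, phi in pairs.values():
        plink_label = classify_pi_hat(pi_hat)
        king_label = classify_kinship(phi)
        if plink_label == "unrelated" and king_label == "unrelated":
            continue
        flagged += 1
        if plink_label == king_label:
            agree += 1
    return agree / flagged if flagged else None


# Hypothetical pairs: (subject 1, subject 2) -> (PLINK PI_HAT, KING kinship).
pairs = {
    ("A", "B"): (0.52, 0.24),
    ("A", "C"): (0.27, 0.02),
    ("B", "D"): (0.03, 0.01),
    ("C", "D"): (1.00, 0.49),
}

if __name__ == "__main__":
    print(concordance(pairs))
```

With only a few hundred candidate-gene SNPs, both estimates carry wide sampling error, so pairs can easily land in different classes under the two methods; a low value from a comparison like this would be in line with the limited concordance the abstract reports.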