Front Genet. 2015 Oct 26:6:317.
doi: 10.3389/fgene.2015.00317. eCollection 2015.

Cryptic relatedness in epidemiologic collections accessed for genetic association studies: experiences from the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) study and the National Health and Nutrition Examination Surveys (NHANES)

Jennifer Malinowski et al. Front Genet. 2015.

Abstract

Epidemiologic collections have been a major resource for genotype-phenotype studies of complex disease given their large sample sizes, racial/ethnic diversity, and breadth and depth of phenotypes, traits, and exposures. A major disadvantage of these collections is that they often survey households and communities without collecting extensive pedigree data. Failure to account for substantial relatedness can lead to inflated estimates and spurious associations. To examine the extent of cryptic relatedness in an epidemiologic collection, we, as the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) study, accessed the National Health and Nutrition Examination Surveys (NHANES) linked to DNA samples ("Genetic NHANES") from NHANES III and NHANES 1999-2002. NHANES are population-based cross-sectional surveys conducted by the National Center for Health Statistics at the Centers for Disease Control and Prevention. Genome-wide genetic data are not yet available in NHANES, and current data use agreements prohibit the generation of GWAS-level data in NHANES samples due to issues in maintaining confidentiality, among other ethical concerns. To date, only hundreds of single nucleotide polymorphisms (SNPs) genotyped in a variety of candidate genes are available for analysis in NHANES. We performed identity-by-descent (IBD) estimation in three self-identified subpopulations of Genetic NHANES (non-Hispanic white, non-Hispanic black, and Mexican American) using PLINK software to identify potential familial relationships among presumed unrelated subjects. We then compared the PLINK-identified relationships to those identified by an alternative method implemented in Kinship-based INference for Genome-wide association studies (KING). Overall, both methods identified familial relationships in NHANES III and NHANES 1999-2002 for all three subpopulations, but little concordance was observed between the two methods, due in major part to the limited SNP data available in Genetic NHANES. Despite the lack of genome-wide data, our results suggest the presence of cryptic relatedness in this epidemiologic collection and highlight the limitations of restricted datasets such as NHANES in the context of modern-day genetic epidemiology studies.
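
As a rough illustration of how pairwise IBD estimates map onto familial relationships, the sketch below takes the three IBD-sharing probabilities for a pair of subjects (the probability of sharing 0, 1, or 2 alleles identical by descent, which PLINK reports as Z0, Z1, and Z2) and computes the proportion of IBD sharing, PI_HAT = P(IBD=2) + 0.5 * P(IBD=1). The table of expected values is the standard textbook one for outbred pedigrees. The nearest-class assignment in `nearest_relationship` is a hypothetical simplification for illustration only; it is not PLINK's procedure, and the abstract does not state which thresholds the authors applied.

```python
# Expected IBD-sharing probabilities (Z0, Z1, Z2) for common relationship
# classes in an outbred population. These are textbook expectations, not
# values taken from the study.
EXPECTED_IBD = {
    "MZ twin / duplicate": (0.0, 0.0, 1.0),
    "parent-offspring": (0.0, 1.0, 0.0),
    "full sibling": (0.25, 0.5, 0.25),
    "second degree": (0.5, 0.5, 0.0),
    "unrelated": (1.0, 0.0, 0.0),
}


def pi_hat(z0, z1, z2):
    """Proportion of alleles shared IBD: P(IBD=2) + 0.5 * P(IBD=1)."""
    return z2 + 0.5 * z1


def nearest_relationship(z0, z1, z2):
    """Hypothetical helper: pick the class whose expected (Z0, Z1, Z2)
    is closest (squared Euclidean distance) to the observed estimates."""
    observed = (z0, z1, z2)

    def sq_dist(expected):
        return sum((o - e) ** 2 for o, e in zip(observed, expected))

    return min(EXPECTED_IBD, key=lambda name: sq_dist(EXPECTED_IBD[name]))


if __name__ == "__main__":
    # A made-up pair of estimates close to the parent-offspring expectation.
    z = (0.02, 0.95, 0.03)
    print(pi_hat(*z), nearest_relationship(*z))
```

With only hundreds of SNPs, as in Genetic NHANES, the estimated sharing probabilities are noisy, so assignments of this kind are least reliable for the more distant classes.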

Keywords: EAGLE; NHANES; cross-sectional; cryptic relatedness; epidemiology; genetic association study; genetic epidemiology.

Figures

FIGURE 1
Percent concordance of familial relationships identified by KING compared with PLINK by survey. KING relationships were considered concordant if also identified by PLINK (expressed as percent on the y-axis). We only consider close relationships here (monozygotic or MZ twins, duplicate samples, and parent-offspring) in estimating concordance given the limited SNP data and the lack of resolution expected with these data. The data are displayed by Genetic NHANES (x-axis) and stratified by estimated familial relationship and race/ethnicity.
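
The concordance measure described in the caption can be sketched as follows: among the pairs that KING flags as closely related, count the share that PLINK also flags. This is a minimal sketch under stated assumptions. The function name, the subject identifiers, and the choice to match on the pair alone (ignoring which relationship class each tool assigned) are illustrative assumptions, not details given in the source.

```python
def percent_concordance(king_pairs, plink_pairs):
    """Percent of KING-identified pairs that PLINK also identified.

    Each pair is a 2-tuple of subject IDs; order within a pair is ignored.
    Returns None when KING identified no pairs (percentage undefined).
    """
    king = {frozenset(pair) for pair in king_pairs}
    plink = {frozenset(pair) for pair in plink_pairs}
    if not king:
        return None
    return 100.0 * len(king & plink) / len(king)


if __name__ == "__main__":
    # Made-up subject IDs, for illustration only.
    king = [("A", "B"), ("C", "D"), ("E", "F"), ("G", "H")]
    plink = [("B", "A"), ("C", "D"), ("X", "Y")]
    print(percent_concordance(king, plink))
```

Using KING as the denominator matches the caption's wording ("KING relationships were considered concordant if also identified by PLINK"); pairs found only by PLINK do not lower the percentage under this definition.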
