Hum Genet. 2012 Feb;131(2):275-87.
doi: 10.1007/s00439-011-1071-0. Epub 2011 Jul 30.

Analysis of family- and population-based samples in cohort genome-wide association studies

Ani Manichaikul et al. Hum Genet. 2012 Feb.

Abstract

Cohort studies typically sample unrelated individuals from a population, although family members of index cases may also be recruited to investigate shared familial risk factors. Recruitment of family members may be incomplete or ancillary to the main cohort, resulting in a mixed sample of independent family units, including unrelated singletons and multiplex families. Multiple methods are available to perform genome-wide association (GWA) analysis of binary or continuous traits in families, but it is unclear whether methods known to perform well on ascertained pedigrees, sibships, or trios are appropriate for analysis of a mixed sample of unrelated cohort members and families. We present simulation studies based on Multi-Ethnic Study of Atherosclerosis (MESA) pedigree structures to compare the performance of several popular methods of GWA analysis for both quantitative and dichotomous traits in cohort studies. We evaluate approaches suitable for analysis of families, and combine the best performing methods with population-based samples either by meta-analysis or by pooled analysis of family- and population-based samples (mega-analysis), comparing type I error and power. We further assess practical considerations, such as availability of software and ability to incorporate covariates in statistical modeling, and demonstrate our recommended approaches through quantitative and binary trait analysis of HDL cholesterol (HDL-C) in 2,553 MESA family- and population-based African-American samples. Our results suggest that linear modeling approaches that accommodate family-induced phenotypic correlation (e.g., a variance-component model for quantitative traits or generalized estimating equations for dichotomous traits) perform best in the context of combined family- and population-based cohort GWAS.
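
The meta-analysis route mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which weighting scheme the authors used, so the fixed-effect inverse-variance method below is an assumption chosen because it is a common default; the function name and the input numbers are hypothetical and are not taken from the paper.

```python
import math


def inverse_variance_meta(estimates):
    """Combine per-sample SNP effect estimates with fixed-effect
    inverse-variance weighting (an assumed scheme, not confirmed by the source).

    estimates: list of (beta, standard_error) pairs, e.g. one pair from the
    family-based analysis and one from the population-based analysis.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


# Hypothetical effect estimates for a single SNP (illustrative values only).
family_result = (0.10, 0.05)
population_result = (0.20, 0.05)
beta, se, z, p = inverse_variance_meta([family_result, population_result])
```

A mega-analysis would instead fit one model to the pooled individual-level data, which requires a method that accounts for within-family correlation, such as the variance-component or GEE approaches the abstract recommends.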

Figures

Figure 1. Comparison of type I error rate and power for quantitative trait analysis when the minor allele frequency (MAF) is 0.3
(A) Type I error rate at significance level 0.01, (B) type I error rate at significance level 0.001, and (C) power in quantitative trait analysis of 687 multiplex families. (D) Type I error rate at significance level 0.01, (E) type I error rate at significance level 0.001, and (F) power in quantitative trait analysis of 687 multiplex families and 5,922 singletons, with results for analysis of 5,922 singletons alone shown for reference. Uncertainty in point estimates of type I error rates is depicted through 95% confidence intervals constructed by inverting an exact binomial test (Clopper and Pearson 1934).
Figure 2. Comparison of type I error rate and power for binary trait analysis when the minor allele frequency (MAF) is 0.3
(A) Type I error rate at significance level 0.01, (B) type I error rate at significance level 0.001, and (C) power in binary trait analysis of 687 multiplex families. (D) Type I error rate at significance level 0.01, (E) type I error rate at significance level 0.001, and (F) power in binary trait analysis of 687 multiplex families and 5,922 singletons, with results for analysis of 5,922 singletons alone shown for reference. Uncertainty in point estimates of type I error rates is depicted through 95% confidence intervals constructed by inverting an exact binomial test (Clopper and Pearson 1934).
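
Both figure captions report 95% confidence intervals obtained by inverting an exact binomial test (Clopper and Pearson 1934). A minimal sketch of that construction follows, using only the Python standard library and solving for each bound by bisection. The function names and the replicate counts in the usage lines are hypothetical; the page does not state how many simulation replicates were run.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1)
                   - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)


def _bisect(func, target, increasing):
    """Find p in (0, 1) with func(p) == target for a monotone func."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion k / n."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2.0, increasing=True)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p), alpha / 2.0, increasing=False)
    return lower, upper


# Hypothetical example: 12 rejections in 1,000 null replicates at level 0.01.
lower, upper = clopper_pearson(12, 1000)
```

If the nominal significance level falls inside the resulting interval, the observed type I error rate is consistent with the test being well calibrated at that level.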
