PLoS Genet. 2012;8(11):e1003032.
doi: 10.1371/journal.pgen.1003032. Epub 2012 Nov 8.

Informed conditioning on clinical covariates increases power in case-control association studies

Noah Zaitlen et al. PLoS Genet. 2012.

Abstract

Genetic case-control association studies often include data on clinical covariates, such as body mass index (BMI), smoking status, or age, that may modify the underlying genetic risk of case or control samples. For example, in type 2 diabetes, odds ratios for established variants estimated from low-BMI cases are larger than those estimated from high-BMI cases. An unanswered question is how to use this information to maximize statistical power in case-control studies that ascertain individuals on the basis of phenotype (case-control ascertainment) or phenotype and clinical covariates (case-control-covariate ascertainment). While current approaches improve power in studies with random ascertainment, they often lose power under case-control ascertainment and fail to capture available power increases under case-control-covariate ascertainment. We show that an informed conditioning approach, based on the liability threshold model with parameters informed by external epidemiological information, fully accounts for disease prevalence and non-random ascertainment of phenotype as well as covariates and provides a substantial increase in power while maintaining a properly controlled false-positive rate. Our method outperforms standard case-control association tests with or without covariates, tests of gene × covariate interaction, and previously proposed tests for dealing with covariates in ascertained data, with especially large improvements in the case of case-control-covariate ascertainment. We investigate empirical case-control studies of type 2 diabetes, prostate cancer, lung cancer, breast cancer, rheumatoid arthritis, age-related macular degeneration, and end-stage kidney disease over a total of 89,726 samples. In these datasets, informed conditioning outperforms logistic regression for 115 of the 157 known associated variants investigated (P-value = 1 × 10⁻⁹). The improvement varied across diseases with a 16% median increase in χ² test statistics and a commensurate increase in power. This suggests that applying our method to existing and future association studies of these diseases may identify novel disease loci.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Illustration of liability threshold model: simulated T2D example.
The posterior mean of ε for low-BMI and high-BMI cases is the expected value of ε given that it exceeds c(tformula image)+m. High-BMI cases have a lower posterior mean relative to low-BMI cases since they require a smaller contribution from genetics to exceed the threshold in the liability threshold model.
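
The Figure 1 caption describes a truncated-normal calculation: the expected value of ε given that it exceeds a covariate-dependent cutoff. A minimal sketch of that calculation follows. It assumes ε is standard normal, which the text here does not state, and the two cutoff values are hypothetical numbers chosen only to show the direction of the effect. They are not parameters from the paper.

```python
import math


def normal_pdf(x: float) -> float:
    """Density of the standard normal distribution at x."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_sf(x: float) -> float:
    """Upper-tail probability P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def posterior_mean_above(cutoff: float) -> float:
    """E[eps | eps > cutoff], assuming eps ~ N(0, 1) (an assumption)."""
    return normal_pdf(cutoff) / normal_sf(cutoff)


# Hypothetical cutoffs: a high-BMI case already has liability contributed
# by the covariate, so eps has to clear a lower cutoff than for a
# low-BMI case.
low_bmi_cutoff = 2.0
high_bmi_cutoff = 1.0

print("low-BMI case: ", posterior_mean_above(low_bmi_cutoff))
print("high-BMI case:", posterior_mean_above(high_bmi_cutoff))
```

Under these assumptions the lower cutoff gives the smaller posterior mean, which is the ordering the caption describes for high-BMI versus low-BMI cases.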
Figure 2. Power calculations for LogR, G+GxE, and LT approaches in simulated data.
For each statistic we display power to attain P < 5 × 10⁻⁸ based on 1,000,000 simulations of 3,000 cases and 3,000 controls, for various effect sizes γ. The increase in power (ratio of y-axis values) for LT versus LogR is 22.8% for γ = 0.1, and 23.0% when computing average power across all values of γ. For γ = 0, the power was 5.0% for all statistics when the P-value threshold is 0.05. G+GxE performs worse due to an extra degree of freedom.
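
To see how a larger χ² statistic translates into power at a genome-wide threshold, the sketch below computes the analytic power of a two-sided 1-degree-of-freedom test from its noncentrality parameter. It is not the simulation procedure used for Figure 2. The baseline noncentrality of 30 is a hypothetical value, and treating the 16% median increase in χ² reported in the abstract as a 16% scaling of the noncentrality is an illustrative assumption.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def power_1df(noncentrality: float, alpha: float) -> float:
    """Power of a two-sided 1-df chi-square test at level alpha.

    The statistic is Z**2 with Z ~ N(sqrt(noncentrality), 1), so the test
    rejects when |Z| exceeds the two-sided normal critical value.
    """
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = math.sqrt(noncentrality)
    upper = 1.0 - STD_NORMAL.cdf(z_crit - shift)
    lower = STD_NORMAL.cdf(-z_crit - shift)
    return upper + lower


baseline_ncp = 30.0                 # hypothetical noncentrality
improved_ncp = 1.16 * baseline_ncp  # assumed 16% larger statistic

print("baseline power:", power_1df(baseline_ncp, 5e-8))
print("improved power:", power_1df(improved_ncp, 5e-8))
print("null power at 0.05:", power_1df(0.0, 0.05))
```

With a noncentrality of zero the function returns the nominal level, which corresponds to the 5.0% figure quoted for γ = 0 at a P-value threshold of 0.05.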


Publication types