PLoS Genet. 2012;8(11):e1003096.
doi: 10.1371/journal.pgen.1003096. Epub 2012 Nov 8.

The Covariate's Dilemma

Joel Mefford et al. PLoS Genet. 2012.
No abstract available

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Impact of—and approaches to—including covariates in the analysis of gene–trait associations.
(a) The covariate C is a confounder: it is associated with both the trait D and the gene G but is not an intermediate on the causal path of interest between G and D. The G–D association should be assessed while controlling for C. Omitting C from the analysis of the G–D association can lead to misattribution of a C–D effect to G, and so to false discovery or biased estimates of a G–D effect.
(b) The covariate C is independently associated with the trait D but not with gene G (so C is not a confounder). If the trait is quantitative or the study subjects are randomly ascertained, including C in a linear or logistic regression model will increase power to detect the G–D association.
(c) If the trait is binary and the subjects are ascertained based on case-control status, the probability of selection (S) depends on G and C and induces a correlation between them. Including C in a logistic regression model can then inflate the standard error of the G–D association, reducing power. Omitting C provides the most potential gain in power when C has a strong effect on D and when D is less common.
(d) In Zaitlen et al.'s new approach for evaluating G–D associations with case-control data, a risk model for D is developed from external information about the C–D association and from observed C and D levels. Residuals from this model, R, distinguish high- and low-risk cases and controls. Testing for G–R associations then assesses genetic effects unexplained by C in a potentially more powerful manner than conventional logistic regression.
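
The mechanism in panel (c) can be illustrated with a small simulation. The sketch below is not taken from the article: the allele frequency, the effect sizes, the intercept, the sample sizes, and the function names `pearson` and `simulate` are all assumptions chosen only to make the effect visible. It generates a population in which G and C are independent, assigns a binary trait D from a logistic model that depends on both, and then compares the G–C correlation in the whole population with the correlation in a pooled sample of equal numbers of cases and controls.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(seed=0, n_pop=200_000, n_per_group=3_000,
             maf=0.3, b0=-5.0, b_g=0.7, b_c=1.5):
    """Return (population G-C correlation, case-control sample G-C correlation).

    All default values are illustrative assumptions, not estimates from the paper.
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n_pop):
        # G: additive genotype (0, 1, or 2 copies of the risk allele).
        g = (rng.random() < maf) + (rng.random() < maf)
        # C: standard-normal covariate drawn independently of G.
        c = rng.gauss(0.0, 1.0)
        # D: binary trait from a logistic model with main effects of G and C.
        risk = 1.0 / (1.0 + math.exp(-(b0 + b_g * g + b_c * c)))
        d = rng.random() < risk
        people.append((g, c, d))

    cases = [person for person in people if person[2]]
    controls = [person for person in people if not person[2]]

    # Case-control ascertainment: selection depends on D, and hence on G and C.
    sample = rng.sample(cases, n_per_group) + rng.sample(controls, n_per_group)

    pop_corr = pearson([p[0] for p in people], [p[1] for p in people])
    sample_corr = pearson([p[0] for p in sample], [p[1] for p in sample])
    return pop_corr, sample_corr


if __name__ == "__main__":
    pop_corr, sample_corr = simulate(seed=1)
    print(f"G-C correlation in the population:     {pop_corr:+.3f}")
    print(f"G-C correlation in the selected sample: {sample_corr:+.3f}")
```

Under these assumed settings the population correlation should be close to zero, while the pooled case-control sample should show a clearly positive G–C correlation, because cases are enriched for both the risk allele and high values of C. This only demonstrates the induced correlation; it does not fit the logistic regression models or measure the resulting loss of power.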

References

    1. Pirinen M, Donnelly P, Spencer CCA (2012) Including known covariates can reduce power to detect genetic effects in case-control studies. Nat Genet 44: 848–851.
    2. Robinson LD, Jewell NP (1991) Some surprising results about covariate adjustment in logistic regression models. Int Stat Rev 59: 227–240.
    3. Neuhaus JM, Jewell NP (1993) A geometrical approach to assess bias due to omitted covariates in generalized linear models. Biometrika 80: 807–815.
    4. Neuhaus JM (1998) Estimation efficiency with omitted covariates in generalized linear models. J Am Stat Assoc 93: 1124–1129.
    5. Kuo CL, Feingold E (2010) What's the best statistic for a simple test of genetic association in a case-control study? Genet Epidemiol 34: 246–253.
