Bias due to two-stage residual-outcome regression analysis in genetic association studies
- PMID: 21769934
- PMCID: PMC3201714
- DOI: 10.1002/gepi.20607
Abstract
Association studies of risk factors and complex diseases require careful assessment of potential confounding factors. Two-stage regression analysis, sometimes referred to as residual- or adjusted-outcome analysis, has been increasingly used in association studies of single nucleotide polymorphisms (SNPs) and quantitative traits. In this analysis, a residual-outcome is first calculated from a regression of the outcome variable on covariates; the relationship between this adjusted-outcome and the SNP is then evaluated by a simple linear regression of the adjusted-outcome on the SNP. In this article, we examine the performance of this two-stage analysis as compared with multiple linear regression (MLR) analysis. Our findings show that when a SNP and a covariate are correlated, the two-stage approach yields a biased estimate of the genotypic effect and a loss of power. Bias is always toward the null and increases with the squared correlation between the SNP and the covariate (r²). For example, for r² = 0, 0.1, and 0.5, two-stage analysis results in, respectively, 0, 10, and 50% attenuation in the SNP effect. As expected, MLR was always unbiased. Since individual SNPs often show little or no correlation with covariates, a two-stage analysis is expected to perform as well as MLR in many genetic studies; however, it produces considerably different results from MLR and may lead to incorrect conclusions when independent variables are highly correlated. While a useful alternative to MLR under r² = 0, the two-stage approach has serious limitations. Its use as a simple substitute for MLR should be avoided.
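The attenuation described in the abstract can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the authors' code: the function name `simulate`, the effect sizes, the sample size, and the use of a continuous standard-normal variable as a stand-in for SNP genotype are all hypothetical choices made for illustration. It fits MLR and the two-stage residual-outcome regression on the same simulated data, where the SNP stand-in and the covariate have a chosen squared correlation.

```python
import numpy as np


def simulate(r2, beta=1.0, gamma=1.0, n=200_000, seed=0):
    """Return (MLR slope, two-stage slope) for the SNP stand-in x.

    r2 is the target squared correlation between x and the covariate c.
    All parameter values here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    r = np.sqrt(r2)

    # Covariate c and a continuous stand-in x for the SNP, corr(x, c) ~ r.
    c = rng.standard_normal(n)
    x = r * c + np.sqrt(1.0 - r2) * rng.standard_normal(n)

    # Outcome depends on both x (true effect beta) and c (effect gamma).
    y = beta * x + gamma * c + rng.standard_normal(n)

    ones = np.ones(n)

    # MLR: regress y on x and c jointly.
    coef_mlr, *_ = np.linalg.lstsq(np.column_stack([ones, x, c]), y, rcond=None)

    # Stage 1: regress y on the covariate only and keep the residuals.
    design_c = np.column_stack([ones, c])
    coef_c, *_ = np.linalg.lstsq(design_c, y, rcond=None)
    residual_outcome = y - design_c @ coef_c

    # Stage 2: simple linear regression of the residual-outcome on x.
    coef_two, *_ = np.linalg.lstsq(
        np.column_stack([ones, x]), residual_outcome, rcond=None
    )

    return coef_mlr[1], coef_two[1]


if __name__ == "__main__":
    for r2 in (0.0, 0.1, 0.5):
        b_mlr, b_two = simulate(r2)
        print(f"r2={r2:.1f}  MLR={b_mlr:.3f}  two-stage={b_two:.3f}")
```

In this linear setup the two-stage slope should come out close to the true effect multiplied by one minus the squared correlation, while the MLR slope stays close to the true effect, which matches the 0, 10, and 50% attenuation figures quoted above.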
© 2011 Wiley Periodicals, Inc.
Comment in
- Loss of power in two-stage residual-outcome regression analysis in genetic association studies. Genet Epidemiol. 2012 Dec;36(8):890-4. doi: 10.1002/gepi.21671. Epub 2012 Aug 31. PMID: 22941732.
