The number of subjects per variable required in linear regression analyses
- PMID: 25704724
- DOI: 10.1016/j.jclinepi.2014.12.014
Abstract
Objectives: To determine the number of independent variables that can be included in a linear regression model.
Study design and setting: We used a series of Monte Carlo simulations to examine the impact of the number of subjects per variable (SPV) on the accuracy of estimated regression coefficients and standard errors, on the empirical coverage of estimated confidence intervals, and on the accuracy of the estimated R² of the fitted model.
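To make the simulation idea concrete, the following is a minimal sketch of one such Monte Carlo experiment. It is not the authors' design: the data-generating process (independent standard normal predictors, equal true slopes `beta`, normal errors with standard deviation `sigma`), the function names, and all default values are assumptions chosen for illustration. It uses only the Python standard library and reports the relative bias of the slope estimates together with the average R² and adjusted R² across replicates.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, ys):
    """OLS coefficients via the normal equations; rows include the intercept column."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def simulate(n_vars=5, spv=2, n_reps=2000, beta=0.5, sigma=1.0, seed=0):
    """Hypothetical setup: n = n_vars * spv subjects, true intercept 0, all slopes = beta.

    Requires n_vars * spv > n_vars + 1 so that residual degrees of freedom are positive.
    """
    rng = random.Random(seed)
    n = n_vars * spv
    k = n_vars + 1  # slopes plus intercept
    slope_sum = r2_sum = adj_sum = 0.0
    for _ in range(n_reps):
        rows = [[1.0] + [rng.gauss(0.0, 1.0) for _ in range(n_vars)] for _ in range(n)]
        ys = [beta * sum(r[1:]) + rng.gauss(0.0, sigma) for r in rows]
        coefs = fit_ols(rows, ys)
        fitted = [sum(c * v for c, v in zip(coefs, r)) for r in rows]
        y_bar = sum(ys) / n
        ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
        ss_tot = sum((y - y_bar) ** 2 for y in ys)
        r2 = 1.0 - ss_res / ss_tot
        slope_sum += sum(coefs[1:]) / n_vars  # average estimated slope in this replicate
        r2_sum += r2
        adj_sum += 1.0 - (1.0 - r2) * (n - 1) / (n - k)
    mean_slope = slope_sum / n_reps
    explained = n_vars * beta ** 2  # variance explained under the assumed model
    return {
        "relative_bias": (mean_slope - beta) / beta,
        "mean_r2": r2_sum / n_reps,
        "mean_adj_r2": adj_sum / n_reps,
        "population_r2": explained / (explained + sigma ** 2),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Raising `spv` in this sketch is one way to see how the gap between the average sample R² and the population value changes with sample size; the specific numbers depend entirely on the assumed setup.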
Results: A minimum of approximately two SPV tended to result in estimation of regression coefficients with relative bias of less than 10%. Furthermore, with this minimum number of SPV, the standard errors of the regression coefficients were accurately estimated and estimated confidence intervals had approximately the advertised coverage rates. A much higher number of SPV was necessary to minimize bias in estimating the model R², although adjusted R² estimates behaved well. The bias in estimating the model R² was inversely proportional to the proportion of variation explained by the population regression model.
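As a worked illustration of why the unadjusted statistic needs more subjects, consider the standard textbook definitions (the symbols n for the number of subjects and p for the number of predictors are assumptions here, not notation taken from the abstract). For a model with an intercept and normally distributed errors:

```latex
\[
R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1},
\qquad
\mathbb{E}\!\left[R^2\right] = \frac{p}{n - 1}
\quad \text{when the population } R^2 = 0 .
\]
```

With two SPV, n = 2p, so the expected sample R² under a null population model is p/(2p - 1), slightly above 0.5; for p = 5 and n = 10 it is 5/9, roughly 0.56. Substituting that expectation into the adjustment gives 1 - (n - p - 1)/(n - p - 1) = 0, which is consistent with the adjusted estimate behaving well in this special case.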
Conclusion: Linear regression models require only two SPV for adequate estimation of regression coefficients, standard errors, and confidence intervals.
Keywords: Bias; Explained variation; Linear regression; Monte Carlo simulations; Regression; Statistical methods.
Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.