Logistic regression: a brief primer
- PMID: 21996075
- DOI: 10.1111/j.1553-2712.2011.01185.x
Abstract
Regression techniques are versatile in their application to medical research because they can measure associations, predict outcomes, and control for confounding variable effects. As one such technique, logistic regression is an efficient and powerful way to analyze the effect of a group of independent variables on a binary outcome by quantifying each independent variable's unique contribution. Using components of linear regression reflected in the logit scale, logistic regression iteratively identifies the strongest linear combination of variables with the greatest probability of detecting the observed outcome. Important considerations when conducting logistic regression include selecting independent variables, ensuring that relevant assumptions are met, and choosing an appropriate model building strategy. For independent variable selection, one should be guided by such factors as accepted theory, previous empirical investigations, clinical considerations, and univariate statistical analyses, with acknowledgement of potential confounding variables that should be accounted for. Basic assumptions that must be met for logistic regression include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers. Additionally, there should be an adequate number of events per independent variable to avoid an overfit model, with commonly recommended minimum "rules of thumb" ranging from 10 to 20 events per covariate. Regarding model building strategies, the three general types are direct/standard, sequential/hierarchical, and stepwise/statistical, with each having a different emphasis and purpose. Before reaching definitive conclusions from the results of any of these methods, one should formally quantify the model's internal validity (i.e., replicability within the same data set) and external validity (i.e., generalizability beyond the current sample). 
The resulting logistic regression model's overall fit to the sample data is assessed using various goodness-of-fit measures, with better fit characterized by a smaller difference between observed and model-predicted values. Use of diagnostic statistics is also recommended to further assess the adequacy of the model. Finally, results for independent variables are typically reported as odds ratios (ORs) with 95% confidence intervals (CIs).
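As an illustration of the steps summarized above, the following is a minimal sketch, not the authors' procedure. It fits a logistic model with a single binary predictor by Newton-Raphson iteration, then reports the odds ratio with a Wald 95% CI. The data are a hypothetical 2x2 table, and the function names (`fit_logistic`, `events_per_variable`) are invented for this example. The events-per-variable helper assumes the common convention of counting the less frequent outcome class, which the abstract does not specify.

```python
import math


def _score_and_information(b0, b1, x, y):
    """Return the log-likelihood gradient and information matrix entries."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))  # predicted probability
        w = p * (1.0 - p)                            # Bernoulli variance
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return g0, g1, h00, h01, h11


def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Fit logit(p) = b0 + b1*x; return (b0, b1, standard error of b1)."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0, g1, h00, h01, h11 = _score_and_information(b0, b1, x, y)
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information * delta = gradient).
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    # Variance of b1 is the matching diagonal entry of the inverse information.
    _, _, h00, h01, h11 = _score_and_information(b0, b1, x, y)
    se_b1 = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    return b0, b1, se_b1


def events_per_variable(y, n_predictors):
    """Assumed convention: less frequent outcome count per predictor."""
    events = sum(y)
    return min(events, len(y) - events) / n_predictors


# Hypothetical data: 50 exposed (30 events), 50 unexposed (15 events).
x = [1] * 50 + [0] * 50
y = [1] * 30 + [0] * 20 + [1] * 15 + [0] * 35

b0, b1, se_b1 = fit_logistic(x, y)
odds_ratio = math.exp(b1)
ci_low = math.exp(b1 - 1.96 * se_b1)
ci_high = math.exp(b1 + 1.96 * se_b1)
epv = events_per_variable(y, n_predictors=1)

print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
print(f"Events per variable = {epv:.0f}")
```

With one binary predictor the fitted odds ratio should match the cross-product ratio of the 2x2 table, here (30 x 35) / (20 x 15) = 3.5, which gives a quick sanity check on the fit. In practice an established statistical package would be used instead of a hand-written solver, and the assumptions listed in the abstract would still need to be checked separately.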
© 2011 by the Society for Academic Emergency Medicine.