Variable selection strategies and its importance in clinical prediction modelling
- PMID: 32148735
- PMCID: PMC7032893
- DOI: 10.1136/fmch-2019-000262
Abstract
Clinical prediction models are used frequently in clinical practice to identify patients who are at risk of developing an adverse outcome so that preventive measures can be initiated. A prediction model can be developed in a number of ways; however, an appropriate variable selection strategy needs to be followed in all cases. Our purpose is to introduce readers to the concept of variable selection in prediction modelling, including the importance of variable selection and variable reduction strategies. We will discuss the various variable selection techniques that can be applied during prediction model building (backward elimination, forward selection, stepwise selection and all possible subset selection), and the stopping rule/selection criteria in variable selection (p values, Akaike information criterion, Bayesian information criterion and Mallows' Cp statistic). This paper focuses on the importance of including appropriate variables, following the proper steps, and adopting the proper methods when selecting variables for prediction models.
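To make the techniques named in the abstract concrete, the following is a minimal sketch of backward elimination with the Akaike or Bayesian information criterion as the stopping rule. It is an illustration under stated assumptions, not the authors' procedure: it uses an ordinary least squares model with Gaussian errors for simplicity (clinical models are often logistic), the data are simulated, and the variable names (`age`, `sbp`, `noise1`, `noise2`) and function names are hypothetical.

```python
import numpy as np

def gaussian_aic_bic(design, y):
    """Fit ordinary least squares and return (AIC, BIC), each up to an additive constant."""
    n, k = design.shape
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    # With Gaussian errors, -2 * log-likelihood equals n * log(RSS / n) plus a constant.
    neg2ll = n * np.log(rss / n)
    return neg2ll + 2 * k, neg2ll + k * np.log(n)

def backward_elimination(X, y, names, criterion="aic"):
    """Start from the full model and drop one variable at a time while the criterion improves."""
    idx = 0 if criterion == "aic" else 1
    intercept = np.ones((X.shape[0], 1))
    selected = list(range(X.shape[1]))

    def score(cols):
        design = np.hstack([intercept, X[:, cols]])
        return gaussian_aic_bic(design, y)[idx]

    best = score(selected)
    # This sketch always keeps at least one predictor.
    while len(selected) > 1:
        candidates = [(score([c for c in selected if c != drop]), drop) for drop in selected]
        cand_score, drop = min(candidates)
        if cand_score >= best:
            break  # stopping rule: no single removal lowers the criterion
        selected.remove(drop)
        best = cand_score
    return [names[c] for c in selected], best

# Simulated (hypothetical) data: the outcome depends on the first two columns only.
rng = np.random.default_rng(0)
names = ["age", "sbp", "noise1", "noise2"]
X = rng.normal(size=(300, 4))
y = 2.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(size=300)

selected, final_score = backward_elimination(X, y, names, criterion="aic")
print(selected, round(final_score, 2))
```

Passing `criterion="bic"` applies the heavier `log(n)` penalty per parameter, which tends to retain fewer variables than AIC. Forward selection follows the same pattern in reverse: it starts from the intercept-only model and adds the variable that lowers the criterion most.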
Keywords: epidemiology.
© Author(s) (or their employer(s)) 2020. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.
Conflict of interest statement
Competing interests: None declared.