Prediction-based variable selection for component-wise gradient boosting
- PMID: 38000054
- DOI: 10.1515/ijb-2023-0052
Abstract
Model-based component-wise gradient boosting is a popular tool for data-driven variable selection. To improve its prediction and selection qualities even further, several modifications of the original algorithm have been developed that mainly focus on different stopping criteria, leaving the actual variable selection mechanism untouched. We investigate different prediction-based mechanisms for the variable selection step in model-based component-wise gradient boosting. These approaches include Akaike's Information Criterion (AIC) as well as a selection rule relying on the component-wise test error computed via cross-validation. We implemented the AIC and cross-validation routines for Generalized Linear Models and evaluated them regarding their variable selection properties and predictive performance. An extensive simulation study revealed improved selection properties, and the prediction error could be lowered in a real-world application with age-standardized COVID-19 incidence rates.
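The cross-validation-based selection rule described in the abstract can be illustrated with a minimal sketch: in component-wise L2 gradient boosting, each iteration normally picks the covariate whose univariate base-learner best fits the current residuals, whereas here the covariate is chosen by the lowest k-fold cross-validated test error of that univariate fit. This is an assumption-laden illustration of the general idea, not the authors' implementation (names such as `cv_componentwise_boosting` are hypothetical).

```python
import numpy as np

def cv_componentwise_boosting(X, y, n_iter=100, nu=0.1, k=5, seed=0):
    """Sketch of component-wise L2 gradient boosting in which the
    base-learner (a single covariate) is selected by its k-fold
    cross-validated test error on the current residuals, rather than
    by its in-sample residual fit. Assumes roughly centred X."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    folds = rng.integers(0, k, size=n)   # fixed random fold assignment
    coef = np.zeros(p)
    intercept = y.mean()
    fit = np.full(n, intercept)
    for _ in range(n_iter):
        u = y - fit                      # negative gradient for L2 loss
        cv_err = np.empty(p)
        beta_full = np.empty(p)
        for j in range(p):
            xj = X[:, j]
            # full-data least-squares slope of the univariate base-learner
            beta_full[j] = xj @ u / (xj @ xj)
            err = 0.0
            for f in range(k):
                tr, te = folds != f, folds == f
                b = xj[tr] @ u[tr] / (xj[tr] @ xj[tr])
                err += np.sum((u[te] - b * xj[te]) ** 2)
            cv_err[j] = err / n
        # prediction-based selection: lowest component-wise CV test error
        j_star = int(np.argmin(cv_err))
        coef[j_star] += nu * beta_full[j_star]
        fit += nu * beta_full[j_star] * X[:, j_star]
    return intercept, coef
```

With a small step length `nu` and early stopping, only covariates that repeatedly win the CV comparison accumulate non-zero coefficients, which is the sparsity mechanism the paper evaluates against the usual residual-fit selection.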
Keywords: gradient boosting; high-dimensional data; prediction analysis; sparse models; variable selection.
© 2023 Walter de Gruyter GmbH, Berlin/Boston.
Similar articles
- The importance of knowing when to stop: a sequential stopping rule for component-wise gradient boosting. Methods Inf Med. 2012;51(2):178–86. doi: 10.3414/ME11-02-0030. PMID: 22344292
- Randomized boosting with multivariable base-learners for high-dimensional variable selection and prediction. BMC Bioinformatics. 2021;22(1):441. doi: 10.1186/s12859-021-04340-z. PMID: 34530737. Free PMC article.
- Deselection of base-learners for statistical boosting, with an application to distributional regression. Stat Methods Med Res. 2022;31(2):207–224. doi: 10.1177/09622802211051088. PMID: 34882438
- Extending statistical boosting: an overview of recent methodological developments. Methods Inf Med. 2014;53(6):428–35. doi: 10.3414/ME13-01-0123. PMID: 25112429. Review.
- An Update on Statistical Boosting in Biomedicine. Comput Math Methods Med. 2017;2017:6083072. doi: 10.1155/2017/6083072. PMID: 28831290. Free PMC article. Review.