Robust statistical boosting with quantile-based adaptive loss functions
- PMID: 35950232
- DOI: 10.1515/ijb-2021-0127
Abstract
We combine robust loss functions with statistical boosting algorithms in an adaptive way to perform variable selection and predictive modelling for potentially high-dimensional biomedical data. To achieve robustness against outliers in the outcome variable (vertical outliers), we consider different composite robust loss functions together with base-learners for linear regression. For composite loss functions, such as the Huber loss and the Bisquare loss, a threshold parameter has to be specified that controls the robustness. In the context of boosting algorithms, we propose an approach that adapts the threshold parameter of composite robust losses in each iteration to the current sizes of residuals, based on a fixed quantile level. We compared the performance of our approach with that of classical M-regression, boosting with standard loss functions, and the lasso with respect to prediction accuracy and variable selection in different simulated settings: the adaptive Huber and Bisquare losses led to a better performance when the outcome contained outliers or was affected by specific types of corruption. For non-corrupted data, our approach yielded a similar performance to boosting with the efficient L2 loss or the lasso. Also in the analysis of skewed KRT19 protein expression data based on gene expression measurements from human cancer cell lines (NCI-60 cell line panel), boosting with the new adaptive loss functions performed favourably compared to standard loss functions or competing robust approaches regarding prediction accuracy and resulted in very sparse models.
Keywords: Bisquare loss; Huber loss; gradient boosting; robust regression.
© 2022 Walter de Gruyter GmbH, Berlin/Boston.
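The abstract describes the quantile-based adaptation only in words. As a minimal sketch of the idea (assuming a Python/NumPy reimplementation rather than the authors' code, with illustrative choices for the quantile level `tau`, the step length `nu`, the number of iterations, and the toy data), component-wise boosting with an adaptive Huber loss could look like this:

```python
# Sketch: component-wise gradient boosting with an adaptive Huber loss.
# The quantile level `tau`, step length `nu`, iteration count and toy data are
# illustrative assumptions, not the settings used in the paper.
import numpy as np


def huber_gradient(residuals, k):
    """Negative gradient of the Huber loss: r where |r| <= k, k*sign(r) otherwise."""
    return np.where(np.abs(residuals) <= k, residuals, k * np.sign(residuals))


def adaptive_huber_boost(X, y, n_iter=200, nu=0.1, tau=0.8):
    """Component-wise boosting with simple linear base-learners.

    In each iteration the Huber threshold k is re-set to the tau-quantile of the
    current absolute residuals, so the loss adapts to the residual scale.
    """
    n, p = X.shape
    coef = np.zeros(p)
    intercept = y.mean()          # simple offset for the sketch (a median would be more robust)
    fit = np.full(n, intercept)

    for _ in range(n_iter):
        residuals = y - fit
        k = np.quantile(np.abs(residuals), tau)   # adaptive threshold
        u = huber_gradient(residuals, k)          # pseudo-residuals (negative gradient)

        # Fit each univariate least-squares base-learner to u, keep the best-fitting one.
        best_j, best_beta, best_rss = 0, 0.0, np.inf
        for j in range(p):
            xj = X[:, j]
            beta = xj @ u / (xj @ xj)
            rss = np.sum((u - beta * xj) ** 2)
            if rss < best_rss:
                best_j, best_beta, best_rss = j, beta, rss

        # Update only the selected component (this yields implicit variable selection).
        coef[best_j] += nu * best_beta
        fit += nu * best_beta * X[:, best_j]

    return intercept, coef


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, p = 100, 20
    X = rng.standard_normal((n, p))
    y = 2 * X[:, 0] - 1.5 * X[:, 3] + rng.standard_normal(n)
    y[:5] += 20                                  # vertical outliers in the outcome
    intercept, coef = adaptive_huber_boost(X, y)
    print(np.round(coef, 2))                     # the two informative coefficients should dominate
```

Re-estimating k as a quantile of the current absolute residuals is the adaptation described in the abstract; with `tau` close to 1 the Huber loss behaves increasingly like the non-robust L2 loss, while smaller values downweight large residuals more aggressively.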
Similar articles
- Randomized boosting with multivariable base-learners for high-dimensional variable selection and prediction. BMC Bioinformatics. 2021 Sep 16;22(1):441. doi: 10.1186/s12859-021-04340-z. PMID: 34530737. Free PMC article.
- Robust loss functions for boosting. Neural Comput. 2007 Aug;19(8):2183-244. doi: 10.1162/neco.2007.19.8.2183. PMID: 17571942.
- The importance of knowing when to stop. A sequential stopping rule for component-wise gradient boosting. Methods Inf Med. 2012;51(2):178-86. doi: 10.3414/ME11-02-0030. Epub 2012 Feb 20. PMID: 22344292.
- An Update on Statistical Boosting in Biomedicine. Comput Math Methods Med. 2017;2017:6083072. doi: 10.1155/2017/6083072. Epub 2017 Aug 2. PMID: 28831290. Free PMC article. Review.
- Extending statistical boosting. An overview of recent methodological developments. Methods Inf Med. 2014;53(6):428-35. doi: 10.3414/ME13-01-0123. Epub 2014 Aug 12. PMID: 25112429. Review.