Resampling to address inequities in predictive modeling of suicide deaths
- PMID: 35396246
- PMCID: PMC8996002
- DOI: 10.1136/bmjhci-2021-100456
Abstract
Objective: Improve methodology for equitable suicide death prediction when using sensitive predictors, such as race/ethnicity, for machine learning and statistical methods.
Methods: Train four predictive models (logistic regression, naive Bayes, gradient boosting (XGBoost) and random forests) using three resampling techniques (Blind, Separate, Equity) on emergency department (ED) administrative patient records. The Blind method resamples without considering racial/ethnic group. By contrast, the Separate method trains a separate model for each group, and the Equity method builds a training set that is balanced both by racial/ethnic group and by class.
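The difference between Blind and Equity resampling can be illustrated with a minimal sketch. The abstract does not state how the resampling was implemented, so the code below assumes oversampling with replacement up to the largest cell. The function name `resample_balanced`, the column names `group` and `died`, and the toy data are all hypothetical.

```python
import pandas as pd

def resample_balanced(df, by, seed=0):
    """Resample so that every cell defined by the columns in `by` has the same size.

    Assumption (not from the source): smaller cells are oversampled with
    replacement up to the size of the largest cell.
    """
    cells = [cell for _, cell in df.groupby(by)]
    target = max(len(cell) for cell in cells)
    balanced = [
        # Sample with replacement only when the cell is smaller than the target.
        cell.sample(n=target, replace=len(cell) < target, random_state=seed)
        for cell in cells
    ]
    # Shuffle so rows from one cell are not contiguous in the training set.
    return pd.concat(balanced).sample(frac=1.0, random_state=seed).reset_index(drop=True)

# Hypothetical toy records: group A is larger, and deaths (died == 1) are rare.
toy = pd.DataFrame({
    "group": ["A"] * 8 + ["B"] * 4,
    "died": [0] * 6 + [1] * 2 + [0] * 3 + [1] * 1,
    "x": range(12),
})

# Blind: balance by class only, ignoring racial/ethnic group.
blind = resample_balanced(toy, ["died"])

# Equity: balance jointly by racial/ethnic group and class.
equity = resample_balanced(toy, ["group", "died"])

blind_sizes = blind.groupby("died").size()
equity_sizes = equity.groupby(["group", "died"]).size()
print(blind_sizes)
print(equity_sizes)
```

Under the same assumptions, the Separate method would correspond to splitting the records by `group`, calling `resample_balanced(subset, ["died"])` on each subset, and fitting one model per group.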
Results: Using the Blind method, the range in the models' sensitivity for predicting suicide death across racial/ethnic groups (a measure of prediction inequity) was 0.47 for logistic regression, 0.37 for naive Bayes, 0.56 for XGBoost and 0.58 for random forest. By building separate models for different racial/ethnic groups or using the Equity method on the training set, we decreased the range in performance to 0.16, 0.13, 0.19 and 0.20, respectively, with the Separate method, and to 0.14, 0.12, 0.24 and 0.13 with the Equity method. XGBoost had the highest overall area under the curve (AUC), ranging from 0.69 to 0.79.
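The inequity measure reported here, the range in sensitivity across racial/ethnic groups, can be computed as in the sketch below. It assumes the range is the highest per-group sensitivity minus the lowest, which is the usual reading but is not spelled out in the abstract. The function names and the toy labels are hypothetical.

```python
import numpy as np

def sensitivity(y_true, y_pred):
    """True positive rate: share of actual positives that were predicted positive."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    positives = y_true == 1
    if positives.sum() == 0:
        return float("nan")  # undefined when a group has no positive cases
    return float((y_pred[positives] == 1).mean())

def sensitivity_range(y_true, y_pred, groups):
    """Per-group sensitivity and the max-minus-min gap across groups."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    groups = np.asarray(groups)
    per_group = {
        str(g): sensitivity(y_true[groups == g], y_pred[groups == g])
        for g in np.unique(groups)
    }
    # Ignore groups where sensitivity is undefined.
    defined = [v for v in per_group.values() if not np.isnan(v)]
    return per_group, max(defined) - min(defined)

# Hypothetical toy predictions for two groups.
groups = ["A"] * 5 + ["B"] * 4
y_true = [1, 1, 1, 1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1, 0, 0, 1]

per_group, gap = sensitivity_range(y_true, y_pred, groups)
print(per_group, gap)
```

A smaller gap indicates that the model detects suicide deaths at more similar rates across groups, which is the sense in which the Separate and Equity methods reduce inequity relative to Blind.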
Discussion: We increased performance equity between different racial/ethnic groups and showed that imbalanced training sets lead to models with poor predictive equity. These methods achieve AUC scores comparable to other work in the field while using only single ED administrative record data.
Conclusion: We propose two methods to improve equity of suicide death prediction among different racial/ethnic groups. These methods may be applied to other sensitive characteristics to improve equity in machine learning with healthcare applications.
Keywords: Data Science; Decision Trees; Machine Learning.
© Author(s) (or their employer(s)) 2022. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.
Conflict of interest statement
Competing interests: None declared.