2024 Aug;156:104677.
doi: 10.1016/j.jbi.2024.104677. Epub 2024 Jun 13.

Assessing fairness in machine learning models: A study of racial bias using matched counterparts in mortality prediction for patients with chronic diseases


Yifei Wang et al. J Biomed Inform. 2024 Aug.

Abstract

Objective: Existing approaches to fairness evaluation often overlook systematic differences in the social determinants of health, like demographics and socioeconomics, among comparison groups, potentially leading to inaccurate or even contradictory conclusions. This study aims to evaluate racial disparities in predicting mortality among patients with chronic diseases using a fairness detection method that considers systematic differences.

Methods: We created five datasets from Mass General Brigham's electronic health records (EHR), each focusing on a different chronic condition: congestive heart failure (CHF), chronic kidney disease (CKD), chronic obstructive pulmonary disease (COPD), chronic liver disease (CLD), and dementia. For each dataset, we developed separate machine learning models to predict 1-year mortality and examined racial disparities by comparing prediction performances between Black and White individuals. We compared racial fairness evaluation between the overall Black and White individuals versus their counterparts who were Black and matched White individuals identified by propensity score matching, where the systematic differences were mitigated.
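The matching step described above can be sketched as 1:1 nearest-neighbor propensity score matching without replacement. This is a minimal illustration, not the authors' implementation; the function name, greedy matching strategy, and synthetic covariates are all assumptions for demonstration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def match_counterparts(X, treated):
    """1:1 nearest-neighbor propensity score matching without replacement.

    X       : (n, d) covariates (e.g. age, BMI, comorbidity index)
    treated : boolean mask, True for the group to be matched (e.g. Black patients)
    Returns (treated indices, matched control indices), aligned pairwise.
    """
    # Propensity score: P(group membership | covariates) from a logistic model
    ps = LogisticRegression(max_iter=1000).fit(X, treated).predict_proba(X)[:, 1]
    t_idx = np.flatnonzero(treated)
    c_idx = np.flatnonzero(~treated)
    # Rank all controls by propensity-score distance to each treated unit
    nn = NearestNeighbors(n_neighbors=len(c_idx)).fit(ps[c_idx, None])
    _, order = nn.kneighbors(ps[t_idx, None])
    used, matches = set(), []
    for row in order:  # greedily take the closest not-yet-used control
        for j in row:
            if j not in used:
                used.add(j)
                matches.append(c_idx[j])
                break
    return t_idx, np.array(matches)

# Synthetic demo: 200 controls, 50 treated with shifted covariates
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 3)), rng.normal(0.5, 1.0, (50, 3))])
treated = np.array([False] * 200 + [True] * 50)
t_idx, m_idx = match_counterparts(X, treated)
```

After matching, covariate balance would be re-checked (as the Results section does) before comparing model performance between the two matched groups.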

Results: We identified significant differences between the overall Black and White individuals in age, gender, marital status, education level, smoking status, health insurance type, body mass index, and Charlson comorbidity index (p < 0.001). Among the matched Black and White subpopulations identified through propensity score matching, significant differences persisted only for particular covariates, and at weaker significance levels: insurance type in the CHF cohort (p = 0.043), insurance type (p = 0.005) and education level (p = 0.016) in the CKD cohort, and body mass index in the dementia cohort (p = 0.041); no other covariates differed significantly. When examining mortality prediction models across the five study cohorts, we compared fairness evaluations before and after mitigating systematic differences. We found significant differences in the CHF cohort, with p-values of 0.021 and 0.001 for F1 measure and sensitivity with the AdaBoost model, and 0.014 and 0.003 for F1 measure and sensitivity with the MLP model, respectively.

Discussion and conclusion: This study contributes to research on fairness assessment by focusing on the examination of systematic disparities and underscores the potential for revealing racial bias in machine learning models used in clinical settings.

Keywords: Chronic Disease; Electronic Health Records; Fairness Analysis; Machine Learning; Mortality Prediction; Racism.


Conflict of interest statement

Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Declaration of generative AI and AI-assisted technologies in the writing process: During the preparation of this work the author(s) used ChatGPT to improve writing. After using this tool/service, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the publication.

Figures

Figure 1.
Training-validation data splitting method with counterparts. The total population is initially divided into three distinct parts: Black, White-matched, and Others. The Black and White-matched groups serve as counterparts. The “Others” group comprises the remaining population, including non-matched White individuals and those of races other than Black or White. The splitting method for the matched White population mirrors that of the Black population to maintain counterpart matching; the “Others” group is split at random. These segments, labeled B1, M1, and O1, are concatenated to form the final training set.
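The splitting scheme in the caption above can be sketched as follows: the Black split is mirrored onto the matched White counterparts so each pair lands in the same fold, while "Others" are split independently. This is an illustrative sketch under stated assumptions; the function name, 80/20 fraction, and index layout are hypothetical, not the paper's code.

```python
import numpy as np

def counterpart_split(black_idx, matched_white_idx, other_idx,
                      train_frac=0.8, seed=0):
    """Train/validation split that keeps matched Black/White pairs together.

    black_idx and matched_white_idx are aligned: matched_white_idx[i] is
    the propensity-matched counterpart of black_idx[i].
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(black_idx))      # shuffle pair positions
    n_tr = int(train_frac * len(black_idx))
    tr, va = order[:n_tr], order[n_tr:]
    # Mirror the Black split onto the matched White counterparts (B1 / M1)
    b1, m1 = black_idx[tr], matched_white_idx[tr]
    b2, m2 = black_idx[va], matched_white_idx[va]
    # "Others" (O1) are split independently at random
    o = rng.permutation(other_idx)
    n_o = int(train_frac * len(o))
    train = np.concatenate([b1, m1, o[:n_o]])
    val = np.concatenate([b2, m2, o[n_o:]])
    return train, val

# Demo with hypothetical patient indices
b = np.arange(0, 50)        # Black patients
w = np.arange(100, 150)     # their matched White counterparts, pairwise aligned
o = np.arange(200, 300)     # everyone else
train, val = counterpart_split(b, w, o)
```

Keeping each matched pair in the same fold ensures the fairness comparison on the validation set is still between true counterparts.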
Figure 2.
Performance comparison on Black, White-matched, White-total, and the total population. Five datasets are studied, each focusing on a different chronic condition: congestive heart failure (CHF), chronic kidney disease (CKD), chronic obstructive pulmonary disease (COPD), chronic liver disease (CLD), and dementia. For each chronic disease group, four types of 1-year mortality prediction models are compared, including Logistic Regression (LR), Random Forest (RF), AdaBoost (Ada), and Multilayer Perceptron (MLP). The model performance is assessed using sensitivity, F1 score, and AUROC.
Figure 3.
Comparison of fairness evaluation methods on CHF and Dementia Cohorts. For each type of disease, two models (Ada and MLP) and two fairness assessments (Δf1 and ΔSe) are considered. Each scatter plot compares the results of fairness evaluation between counterparts (i.e., Black vs. White-matched; x-axis) and fairness evaluation between overall groups (i.e., Black vs. White-total; y-axis) from five repeated experiments. Each point on the scatter plot represents a pair of results from the same experiment. P-values are from paired t-tests conducted to assess the significance of differences between the two fairness evaluation methods.
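The Δf1 and ΔSe gaps and the paired t-test described in this caption can be sketched as below. This is a minimal illustration, not the authors' code; the toy labels and the gap values for the two evaluation methods are made-up numbers chosen only to show the mechanics.

```python
import numpy as np
from scipy import stats
from sklearn.metrics import f1_score, recall_score

def fairness_gaps(y_true, y_pred, group):
    """Δf1 and ΔSe between two groups (0 = Black, 1 = comparison group)."""
    a, b = group == 0, group == 1
    d_f1 = f1_score(y_true[a], y_pred[a]) - f1_score(y_true[b], y_pred[b])
    d_se = recall_score(y_true[a], y_pred[a]) - recall_score(y_true[b], y_pred[b])
    return d_f1, d_se

# Toy predictions for two groups of four patients each
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 1])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
d_f1, d_se = fairness_gaps(y_true, y_pred, group)

# Paired t-test over repeated experiments: gap measured against matched
# counterparts vs. gap measured against the overall comparison group
# (illustrative values only, not results from the paper)
gaps_matched = np.array([0.02, 0.03, 0.01, 0.04, 0.02])
gaps_overall = np.array([0.08, 0.10, 0.07, 0.11, 0.09])
t_stat, p_value = stats.ttest_rel(gaps_matched, gaps_overall)
```

A significant p-value here would mean the two evaluation methods disagree, i.e. the apparent bias changes once systematic covariate differences are mitigated.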

