Observational Study

Lancet Digit Health. 2021 Apr;3(4):e241-e249.

doi: 10.1016/S2589-7500(21)00022-4.

Performance of intensive care unit severity scoring systems across different ethnicities in the USA: a retrospective observational study

Rahuldeb Sarkar¹, Christopher Martin², Heather Mattie³, Judy Wawira Gichoya⁴, David J Stone⁵, Leo Anthony Celi⁶

Affiliations

¹ Department of Respiratory Medicine, Medway NHS Foundation Trust, Gillingham, Kent, UK; Department of Critical Care, Medway NHS Foundation Trust, Gillingham, Kent, UK; Faculty of Life Sciences, King's College London, London, UK.
² UCL Institute for Health Informatics, London, UK; Crystallise, Essex, UK.
³ Department of Biostatistics, Harvard T H Chan School of Public Health, Boston, MA, USA.
⁴ Interventional Radiology and Informatics, Department of Radiology and Imaging Sciences, Emory University, Atlanta, GA, USA.
⁵ Department of Anesthesiology, University of Virginia School of Medicine, Charlottesville, VA, USA; Department of Neurosurgery, University of Virginia School of Medicine, Charlottesville, VA, USA; Center for Advanced Medical Analytics, University of Virginia School of Medicine, Charlottesville, VA, USA.
⁶ Department of Biostatistics, Harvard T H Chan School of Public Health, Boston, MA, USA; Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA; Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA. Electronic address: lceli@mit.edu.

PMID: 33766288
PMCID: PMC8063502
DOI: 10.1016/S2589-7500(21)00022-4

Rahuldeb Sarkar et al. Lancet Digit Health. 2021 Apr.

Abstract

Background: Despite wide use of severity scoring systems for case-mix determination and benchmarking in the intensive care unit (ICU), the possibility of scoring bias across ethnicities has not been examined. Guidelines on the use of illness severity scores to inform triage decisions for allocation of scarce resources, such as mechanical ventilation, during the current COVID-19 pandemic warrant examination for possible bias in these models. We investigated the performance of the severity scoring systems Acute Physiology and Chronic Health Evaluation IVa (APACHE IVa), Oxford Acute Severity of Illness Score (OASIS), and Sequential Organ Failure Assessment (SOFA) across four ethnicities in two large ICU databases to identify possible ethnicity-based bias.

Methods: Data from the electronic ICU Collaborative Research Database (eICU-CRD) and the Medical Information Mart for Intensive Care III (MIMIC-III) database, built from patient episodes in the USA from 2014-15 and 2001-12, respectively, were analysed for score performance in Asian, Black, Hispanic, and White people after appropriate exclusions. Hospital mortality was the outcome of interest. Discrimination and calibration were determined for all three scoring systems in all four groups, using area under receiver operating characteristic (AUROC) curve for different ethnicities to assess discrimination, and standardised mortality ratio (SMR) or proxy measures to assess calibration.
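The two performance measures named in the Methods can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names, the record layout, and the toy data are hypothetical, and the confidence intervals and significance tests reported in the study are not reproduced. It assumes each record carries an ethnicity label, a 0/1 hospital death indicator, and a score-predicted probability of death. AUROC is computed through the standard rank-sum (Mann-Whitney) identity, and SMR is taken as observed deaths divided by the sum of predicted probabilities.

```python
from collections import defaultdict


def auroc(died, predicted):
    """AUROC via the rank-sum identity, using average ranks for tied scores."""
    pairs = sorted(zip(predicted, died))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of records that share the same score.
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n_pos = sum(y for _, y in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one death and one survivor")
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def smr(died, predicted):
    """Standardised mortality ratio: observed deaths / expected deaths."""
    return sum(died) / sum(predicted)


def performance_by_group(records):
    """records: iterable of (group, died, predicted_probability) tuples."""
    grouped = defaultdict(lambda: ([], []))
    for group, died, predicted in records:
        grouped[group][0].append(died)
        grouped[group][1].append(predicted)
    return {
        group: {"auroc": auroc(d, p), "smr": smr(d, p)}
        for group, (d, p) in grouped.items()
    }


# Made-up toy records, for illustration only (not study data).
toy_records = [
    ("group_a", 1, 0.9), ("group_a", 0, 0.2),
    ("group_a", 0, 0.4), ("group_a", 1, 0.5),
    ("group_b", 0, 0.6), ("group_b", 1, 0.7),
    ("group_b", 0, 0.3), ("group_b", 0, 0.4),
]
results = performance_by_group(toy_records)
```

Under this definition, an SMR below 1 means fewer deaths were observed than the score predicted, that is, the score overpredicts mortality for that group. A score can rank patients equally well in two groups (similar AUROC) while still being miscalibrated in one of them (different SMR), which is why the two measures are reported separately.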

Findings: We analysed 166 751 participants (122 919 eICU-CRD and 43 832 MIMIC-III). Although measurements of discrimination were significantly different among the groups (AUROC ranging from 0·86 to 0·89 [p=0·016] with APACHE IVa and from 0·75 to 0·77 [p=0·85] with OASIS), they did not display any discernible systematic patterns of bias. However, measurements of calibration indicated persistent, and in some cases statistically significant, patterns of difference between Hispanic people (SMR 0·73 with APACHE IVa and 0·64 with OASIS) and Black people (0·67 and 0·68) versus Asian people (0·77 and 0·95) and White people (0·76 and 0·81). Although calibrations were imperfect for all groups, the scores consistently showed a pattern of overpredicting mortality for Black people and Hispanic people. Similar results were seen using SOFA scores across the two databases.

Interpretation: The systematic differences in calibration across ethnicities suggest that illness severity scores reflect statistical bias in their predictions of mortality.

Funding: There was no specific funding for this study.

Conflict of interest statement

Declaration of interests

RS received writing fees for health-care reports from Crystallise UK.

CM is a director for Crystallise UK. HM, JWG, DJS, and LAC declare no competing interests.

Figures

**Figure 1. Study flow**
Excluded patients in both databases; the exclusions have been made in the sequence specified in the diagram. APACHE IVa=Acute Physiology and Chronic Health Evaluation IVa. eICU-CRD=electronic intensive care unit Collaborative Research Database. MIMIC-III=Medical Information Mart for Intensive Care III. SOFA=Sequential Organ Failure Assessment.

**Figure 2. ROC for predicted hospital mortality by ethnicity in the eICU-CRD**
(A) ROC for the APACHE IVa-predicted hospital mortality in the eICU-CRD by ethnicity. The AUROC for all was 0·86, Hispanic 0·89, Black 0·87, White 0·86, and Asian 0·86. (B) ROC for the OASIS-predicted hospital mortality in the MIMIC-III database by ethnicity. The AUROC for all was 0·76, Hispanic 0·76, Black 0·75, White 0·76, and Asian 0·77. ROC=receiver operating characteristic. APACHE IVa=Acute Physiology and Chronic Health Evaluation scoring system IVa. eICU-CRD=electronic intensive care unit Collaborative Research Database. AUROC=area under receiver operating characteristic. OASIS=Oxford Acute Severity of Illness Score. MIMIC-III=Medical Information Mart for Intensive Care III.

**Figure 3. SMR for APACHE score in the eICU-CRD and OASIS score in MIMIC-III across ethnicities**
(A) Forest plot for SMRs from the eICU-CRD for mortality predicted by APACHE IVa. (B) Forest plot for SMRs for different ethnicities from the MIMIC-III database for predicted mortality determined by OASIS. SMR=standardised mortality ratio. eICU-CRD=electronic intensive care unit Collaborative Research Database. APACHE IVa=Acute Physiology and Chronic Health Evaluation scoring system IVa. OASIS=Oxford Acute Severity of Illness Score.

**Figure 4. ROC curves and forest plots for predicted mortality by SOFA score in eICU-CRD and MIMIC-III**
(A) ROC plots for all ethnicities in the eICU-CRD for SOFA score performance in hospital mortality prediction. The AUROC for all was 0·77, Hispanic 0·78, Black 0·79, White 0·77, and Asian 0·72. (B) ROC plots for all ethnicities in MIMIC-III for SOFA score performance in hospital mortality prediction. The AUROC for all was 0·73, Hispanic 0·74, Black 0·76, White 0·73, and Asian 0·73. (C) Forest plot for AUROCs in different ethnicities in the eICU-CRD for performance of SOFA score with 95% CIs. (D) Forest plot for AUROCs in different ethnicities in MIMIC-III for performance of SOFA score with 95% CIs. AUC=area under the curve. AUROC=area under receiver operating characteristic. eICU-CRD=electronic intensive care unit Collaborative Research Database. MIMIC-III=Medical Information Mart for Intensive Care III. SOFA=Sequential Organ Failure Assessment.

Update of

Performance of intensive care unit severity scoring systems across different ethnicities.
Sarkar R, Martin C, Mattie H, Gichoya JW, Stone DJ, Celi LA. medRxiv [Preprint]. 2021 Jan 20:2021.01.19.21249222. doi: 10.1101/2021.01.19.21249222. Update in: Lancet Digit Health. 2021 Apr;3(4):e241-e249. doi: 10.1016/S2589-7500(21)00022-4. PMID: 33501459. Free PMC article. Preprint.

Comment in

Ethnicity-based bias in clinical severity scores.
Gumbsch T, Borgwardt K. Lancet Digit Health. 2021 Apr;3(4):e209-e210. doi: 10.1016/S2589-7500(21)00044-3. PMID: 33766286. No abstract available.
