Developing well-calibrated illness severity scores for decision support in the critically ill

Christopher V Cosgriff^{1,2}, Leo Anthony Celi^{1,3}, Stephanie Ko⁴, Tejas Sundaresan⁵, Miguel Ángel Armengol de la Hoz^{1,6,7,8}, Aaron Russell Kaufman⁹, David J Stone^{1,10}, Omar Badawi¹¹, Rodrigo Octavio Deliberato^{1,12,13}

Affiliations

¹ MIT Critical Data, Laboratory for Computational Physiology, Harvard-MIT Health Sciences & Technology, Massachusetts Institute of Technology, Cambridge, MA 02139 USA.
² Department of Medicine, Hospital of the University of Pennsylvania, Philadelphia, PA 19104 USA.
³ Division of Pulmonary Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA 02215 USA.
⁴ Department of Medicine, National University Health Systems, Singapore, Singapore.
⁵ Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139 USA.
⁶ Division of Clinical Informatics, Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA 02215 USA.
⁷ Harvard Medical School, Boston, MA 02115 USA.
⁸ Biomedical Engineering and Telemedicine Group, Biomedical Technology Centre CTB, ETSI Telecomunicación, Universidad Politécnica de Madrid, Madrid, 28040 Spain.
⁹ Department of Government, Harvard University, Cambridge, MA 02138 USA.
¹⁰ Departments of Anesthesiology and Neurosurgery, University of Virginia School of Medicine, Charlottesville, VA 22908 USA.
¹¹ Department of eICU Research and Development, Philips Healthcare, Baltimore, MD 21202 USA.
¹² Big Data Department, Hospital Israelita Albert Einstein, São Paulo, Brazil.
¹³ Critical Care Department, Hospital Israelita Albert Einstein, São Paulo, Brazil.

PMID: 31428687
PMCID: PMC6695410
DOI: 10.1038/s41746-019-0153-6

Christopher V Cosgriff et al. NPJ Digit Med. 2019 Aug 15;2:76. doi: 10.1038/s41746-019-0153-6. eCollection 2019.

Abstract

Illness severity scores are regularly employed for quality improvement and benchmarking in the intensive care unit, but poor generalization performance, particularly with respect to probability calibration, has limited their use for decision support. These models tend to perform worse in patients at high risk of mortality. We hypothesized that a sequential modeling approach, in which an initial regression model assigns risk and every patient deemed high risk then has that risk re-estimated by a second, high-risk-specific regression model, would yield superior calibration across the risk spectrum. We compared this approach to a logistic regression model and to a more sophisticated machine learning approach, the gradient boosting machine. The sequential approach did not affect the receiver operating characteristic curve or the precision-recall curve but resulted in improved reliability curves. The gradient boosting machine achieved a small improvement in discrimination and was similarly calibrated to the sequential models.
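The sequential approach described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The class name `SequentialRiskModel`, the helper functions, the 0.10 high-risk cutoff, the gradient-descent settings, and the synthetic single-feature data are all assumptions made for illustration. A first logistic regression scores every patient, and patients whose first-stage risk meets the cutoff are rescored by a second logistic regression trained only on the high-risk subset.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_one(w, x):
    """Predicted probability for one feature vector x; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit an unregularized logistic regression by batch gradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_one(w, xi) - yi
            grad[0] += err
            for j in range(d):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


class SequentialRiskModel:
    """Two-stage model: stage 1 scores everyone, stage 2 rescores high-risk cases."""

    def __init__(self, threshold=0.10):
        # Hypothetical cutoff defining "high risk"; chosen for illustration only.
        self.threshold = threshold
        self.stage1 = None
        self.stage2 = None

    def fit(self, X, y):
        self.stage1 = fit_logistic(X, y)
        # Keep only patients the first model flags as high risk.
        high = [(xi, yi) for xi, yi in zip(X, y)
                if predict_one(self.stage1, xi) >= self.threshold]
        # Train the second model only if both outcomes occur in that subset.
        if len({yi for _, yi in high}) == 2:
            self.stage2 = fit_logistic([xi for xi, _ in high],
                                       [yi for _, yi in high])
        return self

    def predict_proba(self, x):
        p = predict_one(self.stage1, x)
        if self.stage2 is not None and p >= self.threshold:
            return predict_one(self.stage2, x)
        return p


# Synthetic data: one severity feature, mortality risk rising with severity.
random.seed(0)
X = [[random.uniform(-3.0, 3.0)] for _ in range(400)]
y = [1 if random.random() < sigmoid(-2.0 + 1.5 * x[0]) else 0 for x in X]

model = SequentialRiskModel(threshold=0.10).fit(X, y)
for x in ([-2.0], [0.5], [2.5]):
    print(x, round(model.predict_proba(x), 3))
```

Because the second model only rescores patients who are already above the cutoff, the ranking of patients changes little, while the predicted probabilities in the high-risk range can be refit to the outcomes observed there. This fits the abstract's report of unchanged discrimination alongside improved reliability curves.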

Keywords: Health care; Medical research; Prognosis.

Conflict of interest statement

Competing interests: O.B. is employed by Philips Healthcare. The other authors declare no competing interests.

Figures

**Fig. 1**
Flowchart of the cohort selection process

**Fig. 2**
Reliability curves for the APACHE IVa and Logit models

**Fig. 3**
Receiver operating characteristic and precision-recall curves for all the models. The area under the receiver operating characteristic curve (AUC) and the average precision (AP) are provided for each model, along with 95% confidence intervals obtained from bootstrapping

**Fig. 4**
Reliability curves for sequential models

**Fig. 5**
Reliability curves for the extreme gradient boosting model
