Multicenter Study
J Digit Imaging. 2022 Dec;35(6):1514-1529.
doi: 10.1007/s10278-022-00674-z. Epub 2022 Jul 5.

Developing and Validating Multi-Modal Models for Mortality Prediction in COVID-19 Patients: a Multi-center Retrospective Study

Joy Tzung-Yu Wu et al. J Digit Imaging. 2022 Dec.

Abstract

The unprecedented global crisis brought about by the COVID-19 pandemic has sparked numerous efforts to create predictive models for the detection and prognostication of SARS-CoV-2 infections with the goal of helping health systems allocate resources. Machine learning models, in particular, hold promise for their ability to leverage patient clinical information and medical images for prediction. However, most of the published COVID-19 prediction models thus far have little clinical utility due to methodological flaws and lack of appropriate validation. In this paper, we describe our methodology to develop and validate multi-modal models for COVID-19 mortality prediction using multi-center patient data. The models for COVID-19 mortality prediction were developed using retrospective data from Madrid, Spain (N = 2547) and were externally validated in patient cohorts from a community hospital in New Jersey, USA (N = 242) and an academic center in Seoul, Republic of Korea (N = 336). The models we developed performed differently across various clinical settings, underscoring the need for a guided strategy when employing machine learning for clinical decision-making. We demonstrated that using features from both the structured electronic health records and chest X-ray imaging data resulted in better 30-day mortality prediction performance across all three datasets (areas under the receiver operating characteristic curves: 0.85 (95% confidence interval: 0.83-0.87), 0.76 (0.70-0.82), and 0.95 (0.92-0.98)). We discuss the rationale for the decisions made at every step in developing the models and have made our code available to the research community. We employed the best machine learning practices for clinical model development. Our goal is to create a toolkit that would assist investigators and organizations in building multi-modal models for prediction, classification, and/or optimization.

Keywords: COVID-19; Mortality prediction; Multi-center; Multi-modal.
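
The AUROC values and 95% confidence intervals quoted in the abstract can be illustrated with a minimal sketch. The abstract does not say how the intervals were computed, so the percentile bootstrap used here is an assumption, as are the function names `auroc` and `bootstrap_ci`. The labels and scores are toy values, not study data.

```python
import random


def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Count positive/negative pairs in which the positive case scores higher.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (an assumed method)."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to bootstrap the AUROC")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class, so AUROC is undefined
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy example: 1 = expired within 30 days, 0 = survived (made-up values).
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
toy_scores = [0.05, 0.10, 0.20, 0.30, 0.45, 0.60, 0.40, 0.55, 0.80, 0.90]

point = auroc(toy_labels, toy_scores)
low, high = bootstrap_ci(toy_labels, toy_scores)
print(f"AUROC {point:.2f} (95% CI {low:.2f}-{high:.2f})")
```

With only ten toy cases the interval is very wide. Interval width shrinks as the cohort grows, which may be one reason the intervals reported for the smaller external cohorts are wider than the one for the Madrid cohort.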

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
The proposed multi-modal models for mortality prediction. The extracted EHR data were first preprocessed and then used to train the EHR-based model. For the CXR-based model, an anatomical bounding box extraction pipeline automatically extracted the coordinates of the left lung, right lung, mediastinum, and trachea from each CXR image. The augmented CXR images were then used to train the CXR-based model. The probability computed by the CXR-based model, together with the EHR data, was used to train the proposed EHR-CXR fusion model, which generated the final prediction. Predictions from the EHR- and CXR-based models were also generated for comparison
Fig. 2
A random sample of the images used to train the model; each batch includes at least 1–2 positive-mortality (expired) cases
Fig. 3
Performance of the EHR-based model, the CXR-based model, and the fusion model (EHR + CXR). (A) Internal validation on the Madrid dataset; (B) external testing on the Hoboken dataset; and (C) external testing on the Seoul dataset
Fig. 4
Feature importance of the EHR-based model revealed by a SHAP plot. Features on the y-axis are ranked by their mean absolute SHAP values and each point represents a patient
Fig. 5
Feature importance of the fusion model revealed by a SHAP plot. Features on the y-axis are ranked by their mean absolute SHAP values and each point represents a patient
Fig. 6
Explainability: heatmaps produced with the Grad-CAM algorithm show that the model primarily uses imaging features from the lungs and mediastinum for mortality prediction. The image was produced by averaging the heatmaps of expired patients with a predicted probability greater than 0.6 and overlaying the average on an actual CXR so that the physiologic area is easier to see
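
The fusion step described in the Fig. 1 caption, where the CXR-based model's probability is combined with the EHR data to train a fusion model, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the page does not name the fusion model type, so a small logistic regression stands in for it, and the helper names, the two EHR features, and the synthetic data are all hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fuse_features(ehr_features, cxr_probability):
    """Append the CXR model's predicted probability to the EHR feature vector."""
    return list(ehr_features) + [cxr_probability]


def train_logistic(X, y, lr=1.0, epochs=2000):
    """Batch gradient descent for logistic regression (stand-in fusion model)."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict_proba(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Synthetic stand-in data (NOT study data): two hypothetical scaled EHR
# features and a hypothetical CXR-model probability per patient.
rng = random.Random(0)
ehr, cxr_prob, labels = [], [], []
for _ in range(200):
    y = 1 if rng.random() < 0.3 else 0
    age_scaled = rng.gauss(0.6 if y else 0.4, 0.1)
    spo2_scaled = rng.gauss(0.4 if y else 0.6, 0.1)
    p_cxr = min(max(rng.gauss(0.65 if y else 0.35, 0.15), 0.0), 1.0)
    ehr.append([age_scaled, spo2_scaled])
    cxr_prob.append(p_cxr)
    labels.append(y)

# Late fusion: the image model contributes a single probability feature.
X_fused = [fuse_features(e, p) for e, p in zip(ehr, cxr_prob)]
w, b = train_logistic(X_fused, labels)
fused_pred = [predict_proba(w, b, x) for x in X_fused]
print("fusion weights:", [round(v, 2) for v in w], "bias:", round(b, 2))
```

Passing only the CXR probability, rather than raw image features, keeps the fusion model's input small, and the image model's contribution appears as a single feature alongside the EHR variables.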

