[Preprint]. 2020 Aug 14:2020.08.11.20172809.
doi: 10.1101/2020.08.11.20172809.

Federated Learning of Electronic Health Records Improves Mortality Prediction in Patients Hospitalized with COVID-19

Akhil Vaid et al. medRxiv. 2020.

Abstract

Machine learning (ML) models require large datasets, which may be siloed across different healthcare institutions. Using federated learning, an ML technique that avoids aggregating raw clinical data across multiple institutions, we predict mortality within seven days in hospitalized COVID-19 patients. Patient data were collected from Electronic Health Records (EHRs) at five hospitals within the Mount Sinai Health System (MSHS). Logistic regression with L1 regularization (LASSO) and multilayer perceptron (MLP) models were trained in three ways: as local models using only the data at each site, as a pooled model using combined data from all five sites, and as a federated model that shared only parameters with a central aggregator. Both the federated LASSO and federated MLP models performed better than their local model counterparts at four hospitals. The federated MLP model also outperformed the federated LASSO model at all hospitals. Federated learning shows promise for developing robust predictive models from COVID-19 EHR data without compromising patient privacy.
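The federated setup described in the abstract, where sites train locally and share only parameters with a central aggregator, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the size-weighted parameter averaging, the function names (`local_update`, `federated_round`, `make_site`), the hyperparameters, and the synthetic two-feature data. The abstract does not state the aggregation rule or training settings the authors used.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def local_update(weights, X, y, lr=0.1, l1=0.0, epochs=5):
    """Full-batch gradient descent on one site's data (hypothetical helper).
    weights[0] is the bias; the L1 subgradient penalty skips the bias."""
    w = list(weights)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        for j in range(len(w)):
            g = grad[j] / n
            if j > 0 and l1 > 0:
                g += l1 * ((w[j] > 0) - (w[j] < 0))  # sign(w[j])
            w[j] -= lr * g
    return w

def federated_round(global_w, sites, **kw):
    """One round: every site starts from the global parameters, trains
    locally, and the aggregator averages the results weighted by site size.
    Only parameters leave a site; the raw (X, y) data never does."""
    total = sum(len(X) for X, _ in sites)
    new_w = [0.0] * len(global_w)
    for X, y in sites:
        w = local_update(global_w, X, y, **kw)
        for j, wj in enumerate(w):
            new_w[j] += wj * len(X) / total
    return new_w

def make_site(n, seed):
    # Synthetic stand-in for one hospital's data; not real patient data.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]
        p = sigmoid(2.0 * x[0] - 1.0 * x[1])
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y

# Five synthetic sites of unequal size, mirroring the five-hospital setting.
sites = [make_site(n, seed) for seed, n in enumerate([60, 50, 75, 160, 55])]
w = [0.0, 0.0, 0.0]
for _ in range(50):
    w = federated_round(w, sites, lr=0.5, epochs=2)
```

Weighting by site size keeps a small hospital from pulling the shared parameters as strongly as a large one; other aggregation rules are possible, and this sketch does not claim to match the one used in the study.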

Conflict of interest statement

COMPETING INTERESTS

None.

Figures

Figure 1. Study Design and Model Workflow.
(A) Criteria for patient inclusion in study. (B) Overview of local and pooled models. Local models only utilize data from the site itself while pooled models incorporate data from all sites. Both local and pooled MLP and LASSO models were utilized. (C) Overview of federated model. Parameters from a central aggregator are shared with each site, and sites do not have direct access to clinical data from others. After models are trained locally at a site, parameters with and without added noise are sent back to the central aggregator to update federated model parameters. A federated LASSO and federated MLP model were utilized.
Figure 2. Federated Model Training.
Performance of (A) federated MLP and (B) federated LASSO models, as measured by area under the receiver operating characteristic curve (AUC-ROC), versus the number of training epochs. Binary cross-entropy loss of (C) federated MLP and (D) federated LASSO models versus the number of training epochs.
Figure 3. Model Performance by Site.
Performance of all models (LASSO local, LASSO pooled, LASSO federated, MLP local, MLP pooled, and MLP federated without noise), measured by area under the receiver operating characteristic curve (AUC-ROC), at (A) Mount Sinai Brooklyn (MSB) (n=611), (B) Mount Sinai West (MSW) (n=485), (C) Mount Sinai Morningside (MSM) (n=749), (D) Mount Sinai Hospital (MSH) (n=1644), and (E) Mount Sinai Queens (MSQ) (n=540). Averages of the receiver operating characteristic curves after 10-fold cross-validation are shown. Average performance of each model across all five sites is presented in (F).
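The AUC-ROC metric used in the figures can be computed directly from labels and predicted scores. The sketch below uses the pairwise definition (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half) and averages it over folds. The function names `auc_roc` and `mean_auc` and the toy scores are hypothetical, and the sketch assumes each fold's scores are already held-out predictions; it does not reproduce the authors' evaluation code.

```python
def auc_roc(y_true, scores):
    """AUC-ROC via pairwise comparison of positive and negative scores.
    Ties between a positive and a negative score count as half a win."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mean_auc(folds):
    """Average AUC over a list of (y_true, scores) pairs, one per fold."""
    return sum(auc_roc(y, s) for y, s in folds) / len(folds)

# Toy held-out predictions for two folds (illustrative values only).
folds = [
    ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    ([0, 1, 1, 0], [0.2, 0.9, 0.6, 0.7]),
]
print(mean_auc(folds))  # 0.75
```

The pairwise form is quadratic in the number of cases, which is fine for a sketch; a rank-based computation gives the same value more efficiently on larger cohorts.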
