Sci Rep. 2022 May 19;12(1):8380. doi: 10.1038/s41598-022-12497-7.

Leveraging clinical data across healthcare institutions for continual learning of predictive risk models


Fatemeh Amrollahi et al. Sci Rep.

Abstract

The inherent flexibility of machine learning-based clinical predictive models to learn from episodes of patient care at a new institution (site-specific training) comes at the cost of performance degradation when applied to external patient cohorts. To exploit the full potential of cross-institutional clinical big data, machine learning systems must gain the ability to transfer their knowledge across institutional boundaries and learn from new episodes of patient care without forgetting previously learned patterns. In this work, we developed a privacy-preserving learning algorithm named WUPERR (Weight Uncertainty Propagation and Episodic Representation Replay) and validated the algorithm in the context of early prediction of sepsis using data from over 104,000 patients across four distinct healthcare systems. We tested the hypothesis that the proposed continual learning algorithm can maintain higher predictive performance than competing methods on previous cohorts once it has been trained on a new patient cohort. In the sepsis prediction task, after incremental training of a deep learning model across four hospital systems (namely hospitals H-A, H-B, H-C, and H-D), WUPERR maintained the highest positive predictive value across the first three hospitals compared to a baseline transfer learning approach (H-A: 39.27% vs. 31.27%, H-B: 25.34% vs. 22.34%, H-C: 30.33% vs. 28.33%). The proposed approach has the potential to construct more generalizable models that can learn from cross-institutional clinical big data in a privacy-preserving manner.
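The "weight uncertainty propagation" idea can be illustrated with a minimal sketch: when fine-tuning on a new hospital's data, changes to parameters that were learned with high confidence (low variance) on earlier tasks are penalized more strongly than changes to uncertain parameters. The penalty below is an EWC-style quadratic anchor; the paper's exact formulation may differ, and all values here are made-up toy numbers.

```python
import numpy as np

def uncertainty_penalty(theta, theta_prev, sigma2, lam=1.0):
    """Quadratic anchor to the previous task's weights, scaled by inverse
    per-weight variance: confident (low-variance) weights are held in place,
    uncertain weights remain free to adapt to the new hospital's data."""
    return lam * np.sum((theta - theta_prev) ** 2 / sigma2)

theta_prev = np.array([0.5, -1.2, 2.0])     # weights after training on Hospital-A
sigma2     = np.array([0.01, 1.0, 0.1])     # per-weight uncertainty from Hospital-A
theta      = np.array([0.6, -0.2, 2.0])     # candidate weights during Hospital-B training

# Moving the low-variance weight by 0.1 costs as much as moving the
# high-variance weight by a full 1.0.
cost = uncertainty_penalty(theta, theta_prev, sigma2)
```

In training, this penalty would simply be added to the task loss before back-propagation, so the gradient pulls each weight toward its previous value in proportion to how certain the model was about it.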


Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Schematic diagram of the WUPERR algorithm. Training starts from a randomly initialized set of weights, which are trained on the first task (e.g., prediction on Hospital-A data). In all subsequent learning tasks, the input-layer weights (W1A) are kept frozen. The optimal network parameters, the parameter uncertainties under task A, and the set of representations from the training cohort of Hospital-A ({h1A}) are then transferred to Hospital-B. The deeper layers of the model are fine-tuned on the second task (e.g., prediction on Hospital-B data) by replaying the representations of the Hospital-A and Hospital-B data. Similarly, the optimal parameters and their uncertainty levels, along with the Hospital-A and Hospital-B representations, are transferred to Hospital-C to fine-tune the model on the third task. Note that at no time does protected health information (PHI) leave the institutional boundaries of a given hospital. Finally, at evaluation time (on testing data) for a given task, the model is evaluated on all the hospital cohorts.
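The representation-replay mechanism described above can be sketched in a few lines: a frozen first layer maps each site's raw records to hidden representations, only those representations (not raw PHI) are cached and shared, and the deeper, trainable part of the model is fine-tuned on a mix of cached and new-site representations. This is a toy illustration with a single logistic head standing in for the deeper layers; the network sizes, data, and training details are assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Frozen input-layer weights (a stand-in for W1A learned on the Hospital-A task)
W1 = rng.normal(size=(5, 16))

def representations(X):
    """Shared frozen encoder: only these de-identified activations leave a site."""
    return relu(X @ W1)

def train_head(H, y, steps=500, lr=0.1):
    """Trainable deeper layer: a single logistic head fit by gradient descent."""
    w = np.zeros(H.shape[1])
    for _ in range(steps):
        p = sigmoid(H @ w)
        w -= lr * H.T @ (p - y) / len(y)
    return w

# Hospital-A: compute and cache representations; raw records never leave the site
XA = rng.normal(size=(300, 5))
yA = (XA[:, 0] > 0).astype(float)
HA = representations(XA)

# Hospital-B: fine-tune the head on a replay mix of cached A- and new B-representations
XB = rng.normal(size=(300, 5))
yB = (XB[:, 0] > 0).astype(float)
HB = representations(XB)
w_head = train_head(np.vstack([HA, HB]), np.concatenate([yA, yB]))

# Because Hospital-A representations were replayed, performance on A is retained
acc_A = np.mean((sigmoid(HA @ w_head) > 0.5) == yA)
```

The key design point is that freezing the encoder makes the cached representations reusable: they remain valid inputs to the deeper layers at every subsequent site, which is what allows replay without re-transmitting patient-level data.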
Figure 2
Evaluation of continual learning models for early prediction of sepsis onset, measured using the area under the curve (AUC). (a) AUC of a model (median [IQR]) trained using transfer learning. Model performance is reported (using different markers; see legend) across all cohorts after sequential training on data from the hospital given on the x-axis. (b) AUC of the proposed WUPERR model under the same experimental set-up as (a). At evaluation time (on testing data) at a given site, the model is evaluated on all the hospital cohorts. A solid line style indicates that, at the time of evaluation (on testing data) at a given site, the model had already seen the training data from that site. For instance, since the model is first trained on Hospital-A data, its performance on this dataset after continual learning on all subsequent hospitals is shown in a solid line style to signify that the model had already seen this patient cohort. (c) Model performance (median [IQR]) on Hospitals A–C after continual learning on all four hospitals with transfer learning (red) and WUPERR (blue).
Figure 3
Evaluation of continual learning models for early prediction of sepsis onset, measured using positive predictive value (PPV) and sensitivity. (a) PPV of a model (median [IQR]) trained using transfer learning, measured at a fixed threshold of 0.41 corresponding to 80% sensitivity at Hospital-A after Task 1, for all folds and across all tasks. Model performance is reported (using different markers; see legend) across all cohorts after sequential training on data from the hospital given on the x-axis. (b) PPV of the proposed WUPERR model under the same experimental set-up as (a). (c) Model performance (median [IQR]) on Hospitals A–C after continual learning on all four hospitals with transfer learning (red) and WUPERR (blue). (d–f) Model sensitivity results under the same experimental protocol.
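The fixed-threshold protocol in the caption above can be sketched as follows: calibrate a decision threshold once, at the target sensitivity on the first task's data, then hold it fixed and report PPV at that same threshold across all later tasks. The synthetic scores, prevalence, and helper names below are illustrative assumptions, not the paper's data or code.

```python
import numpy as np

def threshold_for_sensitivity(scores, labels, target=0.80):
    """Highest threshold keeping sensitivity at or above the target:
    the (1 - target) quantile of the positive-class scores."""
    return np.quantile(scores[labels == 1], 1.0 - target)

def sensitivity(scores, labels, thr):
    return np.mean(scores[labels == 1] >= thr)

def ppv(scores, labels, thr):
    pred = scores >= thr
    return np.sum(pred & (labels == 1)) / max(np.sum(pred), 1)

# Synthetic risk scores with ~20% prevalence (toy stand-in for Hospital-A, Task 1)
rng = np.random.default_rng(1)
labels = (rng.random(2000) < 0.2).astype(int)
scores = np.clip(rng.normal(0.3 + 0.4 * labels, 0.15), 0.0, 1.0)

thr = threshold_for_sensitivity(scores, labels)  # calibrated once, reused for later tasks
ppv_at_thr = ppv(scores, labels, thr)
```

Fixing the threshold this way is what makes PPV comparisons across tasks meaningful: any drop reflects drift in the model's score distribution on old cohorts rather than a re-tuned operating point.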

