Cloud-Based Federated Learning Implementation Across Medical Centers

Suraj Rajendran et al. JCO Clin Cancer Inform. 2021 Jan;5:1-11. doi: 10.1200/CCI.20.00060.

Abstract

Purpose: Building well-performing machine learning (ML) models in health care has long been challenging because of data-sharing concerns, yet ML approaches often require larger training samples than any single institution can provide. This paper explores several federated learning implementations, applying them both in a simulated environment and in an actual deployment using electronic health record data from two academic medical centers on a Microsoft Azure Cloud Databricks platform.

Materials and methods: Using two separate cloud tenants, ML models were created, trained, and exchanged between institutions via a GitHub repository. Federated learning processes were applied to both artificial neural networks (ANNs) and logistic regression (LR) models on horizontally partitioned data sets that varied in size and availability. Incremental and cyclic federated learning models were tested in both simulated and real environments.
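The cyclic process described above can be illustrated with a minimal, hypothetical sketch: a single model is trained on one institution's local data, only its weights are handed to the next institution, and the cycle repeats. Everything below — the synthetic site data, the NumPy logistic-regression model, and the hyperparameters — is an illustrative placeholder, not the paper's actual Azure Databricks implementation or EHR data.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_site_data(n, shift):
    # Synthetic horizontally partitioned data standing in for one institution.
    X = rng.normal(shift, 1.0, size=(n, 4))
    w_true = np.array([1.0, -2.0, 0.5, 1.5])
    y = (X @ w_true + rng.normal(0, 0.5, n) > 0).astype(float)
    return X, y

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_update(w, X, y, lr=0.1, epochs=50):
    # Gradient-descent training on one institution's local data only.
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w = w - lr * grad
    return w

def accuracy(w, X, y):
    return float(((sigmoid(X @ w) > 0.5) == y).mean())

# Two simulated institutions; the raw data never leaves its "site".
sites = [make_site_data(200, 0.0), make_site_data(200, 0.3)]

w = np.zeros(4)
for _ in range(3):            # three full cycles across the institutions
    for X, y in sites:        # only the weight vector travels site-to-site
        w = local_update(w, X, y)

for i, (X, y) in enumerate(sites, 1):
    print(f"site {i} accuracy: {accuracy(w, X, y):.2f}")
```

In this toy setup, as in the paper's cyclic scheme, the training order matters: the model leaves the loop most recently fitted to the last institution visited.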

Results: The cyclically trained ANN showed a 3% increase in performance, a significant improvement across most attempts (P < .05). Single-weight neural network models showed improvement in some cases, whereas LR models showed little improvement after federated learning. Which process improved performance depended on the ML model and on how federated learning was implemented. Moreover, the order of the institutions during training influenced the overall performance gain.

Conclusion: Unlike previous studies, our work demonstrates the implementation and effectiveness of federated learning processes beyond simulation. Additionally, we identify federated learning models that achieve statistically significant performance improvements. More work is needed to achieve effective federated learning in biomedicine while preserving the security and privacy of the data.


Figures

FIG 1. ROC curves corresponding to performance metrics in tables. (A) ROC curve based on ANN models' performances against institution 1 test data. (B) ROC curve based on ANN models' performances against institution 2 test data. (C) ROC curve based on LR models' performances against institution 1 test data. (D) ROC curve based on LR models' performances against institution 2 test data. ANN, artificial neural network; LR, logistic regression; ROC, receiver operating characteristic.

FIG 2. ROC curves corresponding to performance metrics in tables. (A) ROC curve based on ANN models' performances against WF's test data. (B) ROC curve based on ANN models' performances against MUSC's test data. (C) ROC curve based on LR models' performances against WF's test data. (D) ROC curve based on LR models' performances against MUSC's test data. ANN, artificial neural network; LR, logistic regression; MUSC, Medical University of South Carolina; ROC, receiver operating characteristic; WF, Wake Forest.

FIG A1. Single weight training mechanism.

FIG A2. Cyclical weight training mechanism.

FIG A3. Distributions of each data set across classes.

FIG A4. The workflow of the federated learning environment in Databricks.

