Cloud-Based Federated Learning Implementation Across Medical Centers

Suraj Rajendran et al. JCO Clin Cancer Inform. 2021 Jan;5:1-11. doi: 10.1200/CCI.20.00060.

Abstract

Purpose: Building well-performing machine learning (ML) models in health care has long been challenging because of data-sharing concerns, yet ML approaches often require larger training samples than any single institution can provide. This paper explores several federated learning implementations, applying them both in a simulated environment and in an actual deployment using electronic health record data from two academic medical centers on a Microsoft Azure Cloud Databricks platform.

Materials and methods: Using two separate cloud tenants, ML models were created, trained, and exchanged between institutions via a GitHub repository. Federated learning processes were applied to both artificial neural networks (ANNs) and logistic regression (LR) models on horizontally partitioned data sets that varied in size and availability. Incremental and cyclic federated learning models were tested in both simulated and real environments.
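The cyclic process described above can be illustrated with a minimal, hypothetical sketch: a single model is trained on one institution's local data, only its weights are handed to the next institution, and the cycle repeats. Everything below — the synthetic site data, the NumPy logistic-regression model, and the hyperparameters — is an illustrative placeholder, not the paper's actual Azure Databricks implementation or EHR data.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_site_data(n, shift):
    # Synthetic horizontally partitioned data standing in for one institution.
    X = rng.normal(shift, 1.0, size=(n, 4))
    w_true = np.array([1.0, -2.0, 0.5, 1.5])
    y = (X @ w_true + rng.normal(0, 0.5, n) > 0).astype(float)
    return X, y

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_update(w, X, y, lr=0.1, epochs=50):
    # Gradient-descent training on one institution's local data only.
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w = w - lr * grad
    return w

def accuracy(w, X, y):
    return float(((sigmoid(X @ w) > 0.5) == y).mean())

# Two simulated institutions; the raw data never leaves its "site".
sites = [make_site_data(200, 0.0), make_site_data(200, 0.3)]

w = np.zeros(4)
for _ in range(3):            # three full cycles across the institutions
    for X, y in sites:        # only the weight vector travels site-to-site
        w = local_update(w, X, y)

for i, (X, y) in enumerate(sites, 1):
    print(f"site {i} accuracy: {accuracy(w, X, y):.2f}")
```

In this toy setup, as in the paper's cyclic scheme, the training order matters: the model leaves the loop most recently fitted to the last institution visited.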

Results: The cyclically trained ANN showed a 3% increase in performance, a significant improvement across most attempts (P < .05). Single-weight neural network models showed improvement in some cases, whereas LR models showed little improvement after federated learning. Which process improved performance depended on the ML model and on how federated learning was implemented. Moreover, the order of the institutions during training influenced the overall performance gain.

Conclusion: Unlike previous studies, our work demonstrates the implementation and effectiveness of federated learning processes beyond simulation. Additionally, we identify federated learning models that achieve statistically significant performance improvements. More work is needed to achieve effective federated learning in biomedicine while preserving the security and privacy of the data.


Figures

FIG 1. ROC curves corresponding to performance metrics in tables. (A) ROC curve based on ANN models' performances against institution 1 test data. (B) ROC curve based on ANN models' performances against institution 2 test data. (C) ROC curve based on LR models' performances against institution 1 test data. (D) ROC curve based on LR models' performances against institution 2 test data. ANN, artificial neural network; LR, logistic regression; ROC, receiver operating characteristic.

FIG 2. ROC curves corresponding to performance metrics in tables. (A) ROC curve based on ANN models' performances against WF's test data. (B) ROC curve based on ANN models' performances against MUSC's test data. (C) ROC curve based on LR models' performances against WF's test data. (D) ROC curve based on LR models' performances against MUSC's test data. ANN, artificial neural network; LR, logistic regression; MUSC, Medical University of South Carolina; ROC, receiver operating characteristic; WF, Wake Forest.

FIG A1. Single weight training mechanism.

FIG A2. Cyclical weight training mechanism.

FIG A3. Distributions of each data set across classes.

FIG A4. The workflow of the federated learning environment in Databricks.

