Sci Rep. 2020 Jul 28;10(1):12598.
doi: 10.1038/s41598-020-69250-1.

Federated learning in medicine: facilitating multi-institutional collaborations without sharing patient data

Micah J Sheller et al. Sci Rep.

Abstract

Several studies underscore the potential of deep learning in identifying complex patterns, leading to diagnostic and prognostic biomarkers. Identifying the sufficiently large and diverse datasets required for training is a significant challenge in medicine, and such datasets can rarely be found in individual institutions. Multi-institutional collaborations based on centrally-shared patient data face privacy and ownership challenges. Federated learning is a novel paradigm for data-private multi-institutional collaborations, where model-learning leverages all available data without sharing data between institutions, by distributing the model-training to the data-owners and aggregating their results. We show that federated learning among 10 institutions results in models reaching 99% of the model quality achieved with centralized data, and evaluate generalizability on data from institutions outside the federation. We further investigate the effects of data distribution across collaborating institutions on model quality and learning patterns, indicating that increased access to data through data-private multi-institutional collaborations can benefit model quality more than the errors introduced by the collaborative method harm it. Finally, we compare with other collaborative-learning approaches, demonstrating the superiority of federated learning, and discuss practical implementation considerations. Clinical adoption of federated learning is expected to lead to models trained on datasets of unprecedented size, and hence to have a catalytic impact towards precision/personalized medicine.
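
The idea of "distributing the model-training to the data-owners and aggregating their results" can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy linear model, the function names (`local_update`, `aggregate`, `run_federation`), the learning rate, and in particular the choice to weight each institution's result by its number of samples when aggregating. The point is only that raw `(x, y)` records never leave an institution; only model weights are exchanged.

```python
from typing import List, Sequence, Tuple

# (x, y) pairs held privately by one institution (toy stand-in for patient data)
Dataset = List[Tuple[float, float]]


def local_update(weights: Sequence[float], data: Dataset,
                 lr: float = 0.05, steps: int = 20) -> List[float]:
    """Train a copy of the global model y = w0 + w1 * x on one institution's data."""
    w0, w1 = weights
    n = len(data)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w0 and w1
        g0 = 2.0 * sum((w0 + w1 * x - y) for x, y in data) / n
        g1 = 2.0 * sum((w0 + w1 * x - y) * x for x, y in data) / n
        w0 -= lr * g0
        w1 -= lr * g1
    return [w0, w1]


def aggregate(updates: List[List[float]], sizes: List[int]) -> List[float]:
    """Average the institutions' weights, weighted by sample count (an assumption)."""
    total = sum(sizes)
    return [sum(u[i] * s for u, s in zip(updates, sizes)) / total
            for i in range(len(updates[0]))]


def run_federation(institutions: List[Dataset], rounds: int = 50) -> List[float]:
    """Repeat: send global weights out, train locally, aggregate the results."""
    global_w = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in institutions]
        global_w = aggregate(updates, [len(data) for data in institutions])
    return global_w


# Two hypothetical institutions whose data follow y = 1 + 2x over different x ranges
inst_a = [(x, 1.0 + 2.0 * x) for x in (0.0, 0.5, 1.0)]
inst_b = [(x, 1.0 + 2.0 * x) for x in (1.0, 1.5, 2.0, 2.5)]
final_w = run_federation([inst_a, inst_b])
```

In this toy setting the aggregated weights approach the intercept and slope shared by both institutions, even though neither institution sees the other's records.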

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
System architectures of collaborative learning approaches for multi-institutional collaborations. The current paradigm for multi-institutional collaborations, based on Centralized Data Sharing, is shown in (a), whereas in (b) we note the proposed paradigm, based on Federated Learning. Panels (c) and (d) offer schematics for alternative data-private collaborative learning approaches evaluated in this study, namely Institutional Incremental Learning, and Cyclic Institutional Incremental Learning, respectively.
Figure 2
Single Original Institution Validation Results. Single institution mean final model qualities (based on the Dice Similarity Coefficient) for the Original Institution group (y-axis) measured against all single institution held-out validation sets (x-axis) using multiple runs of five-fold collaborative cross validation. The Y axis represents models trained on a single institutional dataset, and the X axis represents the validation dataset of each independent institution (Local Validation Dataset). “AVG” indicates the average of each institution’s mean model performance over all institutions in the group other than itself; “W-AVG” denotes the same, but as a weighted average according to each institution’s contribution to the validation set size. The diagonal entries indicate how well each institution’s final models scored against their own validation set, and they are represented as the Single Institutional Model (SIM) results reported in Fig. 3.
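
The Dice Similarity Coefficient and the “AVG” / “W-AVG” summaries described in the Figure 2 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: masks are flat binary lists, two empty masks are scored as 1.0 by convention, and the function names and the example scores and sizes are hypothetical.

```python
from typing import List, Sequence


def dice(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2 * |pred AND truth| / (|pred| + |truth|) for binary masks."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    denom = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as a perfect match
    return 1.0 if denom == 0 else 2.0 * intersection / denom


def avg_excluding_self(scores: List[float], own: int) -> float:
    """Plain mean of a model's scores on every institution except its own."""
    others = [s for i, s in enumerate(scores) if i != own]
    return sum(others) / len(others)


def weighted_avg_excluding_self(scores: List[float], val_sizes: List[int],
                                own: int) -> float:
    """Same as above, but each score is weighted by that validation set's size."""
    pairs = [(s, n) for i, (s, n) in enumerate(zip(scores, val_sizes)) if i != own]
    total = sum(n for _, n in pairs)
    return sum(s * n for s, n in pairs) / total


# Hypothetical row of the heat map: institution 0's model scored on institutions 0..2
row_scores = [0.9, 0.6, 0.8]
row_sizes = [10, 30, 10]  # hypothetical validation set sizes
avg = avg_excluding_self(row_scores, own=0)
w_avg = weighted_avg_excluding_self(row_scores, row_sizes, own=0)
```

With these made-up numbers the weighted average is pulled toward the score on the largest validation set, which is why the two summaries can differ.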
Figure 3
Model quality results from single institution training, CDS, FL, IIL, and CIIL. Mean model Dice for CDS, FL, and CIIL against the Original Institution group single institution held-out validation data over multiple runs of collaborative cross validation, as well as the average of single institutional results under the same scheme (AVG SIM). The AVG 1–10 column provides the average performance of each collaboration method across single institution validation sets. For CIIL, ‘best local’ and ‘random local’ are two methods we introduce for final model selection during CIIL (more details are given in the “Methods: Final Model Selection” section). Note that the color scale here differs from that used in Fig. 2.
Figure 4
Learning curves of collaborative learning methods on Original Institution data. Mean global validation Dice at every epoch, by collaborative learning method, on the Original Institution group over multiple runs of collaborative cross validation. Confidence intervals are min, max. An epoch for CDS is defined as a single training pass over all of the centralized data. An epoch for FL is defined as a parallel training pass of every institution over its training data, and an epoch during CIIL and IIL is defined as a single institution's training pass over its data.
