Federated Learning for Multicenter Collaboration in Ophthalmology: Improving Classification Performance in Retinopathy of Prematurity
- PMID: 35296449
- PMCID: PMC12316477
- DOI: 10.1016/j.oret.2022.02.015
Abstract
Objective: To compare the performance of deep learning classifiers for the diagnosis of plus disease in retinopathy of prematurity (ROP) trained using 2 methods for developing models on multi-institutional data sets: centralizing data versus federated learning (FL) in which no data leave each institution.
Design: Evaluation of a diagnostic test or technology.
Subjects: Deep learning models were trained, validated, and tested on 5255 wide-angle retinal images collected in the neonatal intensive care units of 7 institutions as part of the Imaging and Informatics in ROP study. All images were labeled for the presence of plus, preplus, or no plus disease with a clinical label and a reference standard diagnosis (RSD) determined by 3 image-based ROP graders and the clinical diagnosis.
Methods: We compared the area under the receiver operating characteristic curve (AUROC) for models developed on multi-institutional data, using a central approach initially, followed by FL, and compared locally trained models with both approaches. We compared the model performance (κ) with the label agreement (between clinical and RSD), data set size, and number of plus disease cases in each training cohort using the Spearman correlation coefficient (CC).
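The abstract does not say how the FL model combined the contributions of the 7 institutions. As an illustration only, the sketch below assumes federated averaging (FedAvg), in which each site trains locally and shares only its model parameters, and a coordinator averages them weighted by local training set size. The function name, the dictionary layout of the weights, and the toy values are all hypothetical and are not taken from the study.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def federated_average(site_weights: List[Weights], site_sizes: List[int]) -> Weights:
    """Average per-site parameters, weighted by each site's training set size.

    Assumption: FedAvg-style aggregation. Only parameters are exchanged;
    the images themselves never leave the contributing institution.
    """
    if not site_weights or len(site_weights) != len(site_sizes):
        raise ValueError("need at least one site and one size per site")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total training set size must be positive")

    averaged: Weights = {}
    for name in site_weights[0]:
        length = len(site_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(site_weights, site_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Toy example with two hypothetical sites of 100 and 300 images.
site_a = {"layer": [1.0, 2.0]}
site_b = {"layer": [3.0, 6.0]}
global_weights = federated_average([site_a, site_b], [100, 300])
```

In this toy case the larger site pulls the averaged parameters toward its own values, which is the intended effect of size weighting.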
Main outcome measures: Model performance using AUROC and linearly weighted κ.
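For readers unfamiliar with linearly weighted κ, the following is a minimal sketch of the standard formula for ordered categories, here assumed to be coded 0 = no plus, 1 = preplus, 2 = plus. The coding, the function name, and the example labels are assumptions made for illustration; the abstract does not describe the authors' implementation.

```python
from typing import Sequence


def linear_weighted_kappa(a: Sequence[int], b: Sequence[int], n_classes: int) -> float:
    """Cohen's kappa with linear disagreement weights |i - j| / (n_classes - 1)."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("label sequences must be non-empty and equal in length")
    if n_classes < 2:
        raise ValueError("need at least two classes")

    # Observed confusion counts between the two sets of labels.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for x, y in zip(a, b):
        observed[x][y] += 1

    row = [sum(observed[i]) for i in range(n_classes)]
    col = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = abs(i - j) / (n_classes - 1)
            weighted_observed += weight * observed[i][j]
            # Expected count under chance agreement given the marginals.
            weighted_expected += weight * row[i] * col[j] / n

    if weighted_expected == 0:
        raise ValueError("kappa is undefined when all labels fall in one class")
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical labels: model predictions versus a reference standard.
perfect = linear_weighted_kappa([0, 1, 2, 2], [0, 1, 2, 2], n_classes=3)
chance = linear_weighted_kappa([0, 0, 2, 2], [0, 2, 0, 2], n_classes=3)
```

With linear weights, confusing plus with preplus is penalized half as much as confusing plus with no plus, which suits an ordinal severity scale.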
Results: Four experimental settings were compared: FL trained on RSD versus central trained on RSD, FL trained on clinical labels versus central trained on clinical labels, FL trained on RSD versus central trained on clinical labels, and FL trained on clinical labels versus central trained on RSD (P = 0.046, P = 0.126, P = 0.224, and P = 0.0173, respectively). Four of the 7 (57%) models trained on local institutional data performed worse than the FL models. The model performance for local models was positively correlated with the label agreement (between clinical and RSD labels, CC = 0.389, P = 0.387), total number of plus cases (CC = 0.759, P = 0.047), and overall training set size (CC = 0.924, P = 0.002).
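The correlations above are Spearman coefficients, which are Pearson correlations computed on ranks. The sketch below shows that calculation in plain Python, with ties given their average rank. The helper names are hypothetical and the example values are toy numbers, not the per-institution data from the study.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    return pearson(average_ranks(x), average_ranks(y))


# Toy values only: hypothetical training set sizes and model kappas.
toy_sizes = [120, 340, 560, 900]
toy_kappas = [0.41, 0.55, 0.62, 0.80]
toy_cc = spearman(toy_sizes, toy_kappas)
```

Because the coefficient depends only on rank order, it captures whether larger cohorts tend to yield better models without assuming the relationship is linear.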
Conclusions: We found that a trained FL model performs comparably to a centralized model, confirming that FL may provide an effective, more feasible solution for interinstitutional learning. Smaller institutions benefit more from collaboration than larger institutions, showing the potential of FL for addressing disparities in resource access.
Keywords: Deep learning; Epidemiology; Federated learning; Retinopathy of prematurity.
Copyright © 2022 American Academy of Ophthalmology. All rights reserved.
Conflict of interest statement
Drs. Campbell, Chan, and Kalpathy-Cramer receive research support from Genentech (San Francisco, CA). Dr. Chiang previously received research support from Genentech.
The i-ROP DL system has been licensed to Boston AI Lab (Boston, MA) by Oregon Health & Science University, Massachusetts General Hospital, Northeastern University, and the University of Illinois, Chicago, which may result in royalties to Drs. Chan, Campbell, Brown, and Kalpathy-Cramer in the future.
Dr. Campbell was a consultant to Boston AI Lab (Boston, MA).
Dr. Chan is on the Scientific Advisory Board for Phoenix Technology Group (Pleasanton, CA) and is a consultant for Alcon (Fort Worth, TX).
Dr. Chiang was previously a consultant for Novartis (Basel, Switzerland), and was previously an equity owner of InTeleretina, LLC (Honolulu, HI).
Drs. Chan and Campbell are equity owners of Siloam Vision.
Comment in
- Federated Learning in Ophthalmology: Retinopathy of Prematurity. Ophthalmol Retina. 2022 Aug;6(8):647-649. doi: 10.1016/j.oret.2022.03.019. PMID: 35933119. No abstract available.