Multicenter Study

Federated Learning for Multicenter Collaboration in Ophthalmology: Improving Classification Performance in Retinopathy of Prematurity

Charles Lu et al. Ophthalmol Retina. 2022 Aug;6(8):657-663. doi: 10.1016/j.oret.2022.02.015. Epub 2022 Mar 14.

Abstract

Objective: To compare the performance of deep learning classifiers for the diagnosis of plus disease in retinopathy of prematurity (ROP) trained using 2 methods for developing models on multi-institutional data sets: centralizing data versus federated learning (FL), in which no data leave any institution.

Design: Evaluation of a diagnostic test or technology.

Subjects: Deep learning models were trained, validated, and tested on 5255 wide-angle retinal images in the neonatal intensive care units of 7 institutions as part of the Imaging and Informatics in ROP study. All images were labeled for the presence of plus, preplus, or no plus disease with a clinical label and a reference standard diagnosis (RSD) determined by 3 image-based ROP graders and the clinical diagnosis.

Methods: We compared the area under the receiver operating characteristic curve (AUROC) for models developed on multi-institutional data, first using a centralized approach and then using FL, and compared locally trained models with both approaches. We compared the model performance (κ) with the label agreement (between clinical and RSD labels), data set size, and number of plus disease cases in each training cohort using the Spearman correlation coefficient (CC).

Main outcome measures: Model performance using AUROC and linearly weighted κ.
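
For concreteness, both outcome measures, along with the Spearman correlation used in the Methods, can be computed with standard Python libraries. The following is a minimal sketch with hypothetical labels, probabilities, and per-site statistics; the class coding and all numbers are illustrative assumptions, not values from the study.

    # Minimal sketch of the outcome measures with hypothetical data
    # (not values from the study).
    import numpy as np
    from scipy.stats import spearmanr
    from sklearn.metrics import roc_auc_score, cohen_kappa_score

    # Hypothetical 3-class coding: 0 = no plus, 1 = preplus, 2 = plus.
    y_true = np.array([0, 0, 1, 2, 2, 1, 0, 2])
    y_pred = np.array([0, 1, 1, 2, 1, 1, 0, 2])

    # Linearly weighted kappa: preplus-vs-plus confusions are penalized
    # less than no-plus-vs-plus confusions.
    kappa = cohen_kappa_score(y_true, y_pred, weights="linear")

    # AUROC for the binary plus vs. not-plus decision, from hypothetical
    # predicted probabilities of plus disease.
    p_plus = np.array([0.05, 0.40, 0.30, 0.90, 0.55, 0.35, 0.10, 0.85])
    auroc = roc_auc_score((y_true == 2).astype(int), p_plus)

    # Spearman correlation, e.g. between per-site training set size and
    # that site's kappa (hypothetical values).
    sizes = [120, 300, 450, 800]
    kappas = [0.55, 0.62, 0.70, 0.78]
    rho, p = spearmanr(sizes, kappas)

    print(f"kappa={kappa:.3f}  AUROC={auroc:.3f}  rho={rho:.3f} (p={p:.3f})")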

Results: Four experimental settings were compared: FL trained on RSD versus centrally trained on RSD, FL trained on clinical labels versus centrally trained on clinical labels, FL trained on RSD versus centrally trained on clinical labels, and FL trained on clinical labels versus centrally trained on RSD (P = 0.046, P = 0.126, P = 0.224, and P = 0.0173, respectively). Four of the 7 (57%) models trained on local institutional data performed worse than the FL models. The performance of local models was positively correlated with the label agreement (between clinical and RSD labels; CC = 0.389, P = 0.387), the total number of plus cases (CC = 0.759, P = 0.047), and the overall training set size (CC = 0.924, P = 0.002).

Conclusions: We found that a trained FL model performs comparably to a centralized model, suggesting that FL may provide an effective and more feasible solution for interinstitutional learning. Smaller institutions benefited more from collaboration than larger ones, showing the potential of FL to address disparities in resource access.

Keywords: Deep learning; Epidemiology; Federated learning; Retinopathy of prematurity.


Conflict of interest statement

  1. Drs. Campbell, Chan, and Kalpathy-Cramer receive research support from Genentech (San Francisco, CA). Dr. Chiang previously received research support from Genentech.

  2. The i-ROP DL system has been licensed to Boston AI Lab (Boston, MA) by Oregon Health & Science University, Massachusetts General Hospital, Northeastern University, and the University of Illinois, Chicago, which may result in royalties to Drs. Chan, Campbell, Brown, and Kalpathy-Cramer in the future.

  3. Dr. Campbell was a consultant to Boston AI Lab (Boston, MA).

  4. Dr. Chan is on the Scientific Advisory Board for Phoenix Technology Group (Pleasanton, CA) and a consultant for Alcon (Fort Worth, TX).

  5. Dr. Chiang was previously a consultant for Novartis (Basel, Switzerland), and was previously an equity owner of InTeleretina, LLC (Honolulu, HI).

  6. Drs. Chan and Campbell are equity owners of Siloam Vision.

Figures

Figure 1. Federated learning training schema.
During each round of training, the global model is synced to all institutions locally (left). Then, each institution trains for a fixed number of epochs (center) before the local model weights are aggregated and averaged in the central server to update the global model (right).
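
As a rough illustration of this schema, the sketch below implements federated averaging in plain NumPy. The model (a linear regressor), the synthetic per-institution data, and the dataset-size weighting are illustrative assumptions, not the actual i-ROP DL system or its training protocol.

    # Minimal federated averaging (FedAvg) sketch with a hypothetical
    # linear model and synthetic institutional data.
    import numpy as np

    rng = np.random.default_rng(0)

    def local_train(weights, data, epochs=2, lr=0.1):
        # Local update: a few full-batch gradient steps on a least-squares
        # objective, standing in for local epochs of deep-network training.
        w = weights.copy()
        X, y = data
        for _ in range(epochs):
            grad = X.T @ (X @ w - y) / len(y)
            w -= lr * grad
        return w

    # Synthetic "institutions" with differently sized local datasets.
    true_w = np.array([1.0, -2.0])
    institutions = []
    for n in (50, 200, 500):
        X = rng.normal(size=(n, 2))
        y = X @ true_w + 0.1 * rng.normal(size=n)
        institutions.append((X, y))

    global_w = np.zeros(2)
    for round_ in range(10):
        # 1) Sync: send the current global model to every institution.
        # 2) Local training: each site trains for a fixed number of
        #    epochs; raw data never leaves the institution.
        local_ws = [local_train(global_w, data) for data in institutions]
        # 3) Aggregate: average the local weights on the central server
        #    (weighted by local dataset size, as in standard FedAvg).
        sizes = np.array([len(y) for _, y in institutions], dtype=float)
        global_w = np.average(local_ws, axis=0, weights=sizes)

    print("global model after 10 rounds:", global_w)  # approaches true_w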
Figure 2. Performance of federated and centrally trained models by ground truth.
The AUROC was high for all four approaches, but models trained using a central approach slightly outperformed models trained using a federated learning approach, as did models trained using reference standard diagnosis (RSD) labels compared with clinical labels (when evaluated against RSD ground truth).
Figure 3. Comparative performance of single-institution versus multi-institutional models.
We compared the area under the receiver operating characteristic curve (AUROC) performance for locally trained models using clinical labels on the average of RSD test sets. Four of the 7 (57%) locally trained models (in blue) performed worse than both the central and federated learning models on data from their own institution labeled with a reference standard label.
Figure 4. Relationship between label agreement, training dataset size, disease prevalence, and performance.
There was a positive but nonsignificant correlation between clinical vs. reference standard diagnosis (RSD) label agreement and the average kappa performance of the model against RSD (Pearson coefficient 0.389, p = 0.387). Pearson's correlation between the number of plus cases in the training set and kappa performance was 0.759 (p = 0.047), and the correlation between overall training set size and kappa performance was 0.924 (p = 0.002).
