Commun Eng. 2025 Jan 22;4(1):6.
doi: 10.1038/s44172-025-00341-5.

Distributed training of foundation models for ophthalmic diagnosis

Sina Gholami et al. Commun Eng. 2025.

Abstract

Vision impairment affects nearly 2.2 billion people globally, and nearly half of these cases could be prevented with early diagnosis and intervention, underscoring the urgent need for reliable and scalable detection methods for conditions like diabetic retinopathy and age-related macular degeneration. Here we propose a distributed deep learning framework that integrates self-supervised and domain-adaptive federated learning to enhance the detection of eye diseases from optical coherence tomography images. We employed a self-supervised, mask-based pre-training strategy to develop a robust foundation encoder. This encoder was trained on seven optical coherence tomography datasets, and we compared its performance under local, centralized, and federated learning settings. Our results show that self-supervised methods, both centralized and federated, improved the area under the curve by at least 10% compared to local models. Additionally, incorporating domain adaptation into the federated learning framework further boosted performance and generalization across different populations and imaging conditions. This approach supports collaborative model development without data sharing, providing a scalable, privacy-preserving solution for effective retinal disease screening and diagnosis in diverse clinical settings.
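
As a rough, hedged sketch of the mask-based self-supervised pre-training described above: the encoder, decoder, patch size, and 50% masking ratio below are illustrative placeholders rather than the paper's exact configuration, and the helper names (patchify, masked_reconstruction_loss) are invented for this example.

    # Minimal sketch of masked-image-modeling pre-training: hide a random
    # subset of patches and train the encoder/decoder to reconstruct them.
    # Patch size, masking ratio, and module interfaces are assumptions.
    import torch

    PATCH = 16          # patch side length (assumed)
    MASK_RATIO = 0.5    # fraction of patches hidden from the encoder

    def patchify(imgs):
        """Split (B, C, H, W) OCT images into flattened patches (B, N, P*P*C)."""
        b, c, h, w = imgs.shape
        p = PATCH
        x = imgs.reshape(b, c, h // p, p, w // p, p)
        return x.permute(0, 2, 4, 3, 5, 1).reshape(b, (h // p) * (w // p), p * p * c)

    def masked_reconstruction_loss(encoder, decoder, imgs):
        """Mask a random subset of patches and score reconstruction (MSE) on them."""
        patches = patchify(imgs)                                 # (B, N, D)
        b, n, _ = patches.shape
        mask = torch.rand(b, n, device=imgs.device) < MASK_RATIO
        visible = patches.masked_fill(mask.unsqueeze(-1), 0.0)   # zero out masked patches
        recon = decoder(encoder(visible))                        # (B, N, D)
        return ((recon - patches) ** 2)[mask].mean()             # loss on masked patches only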

Conflict of interest statement

Competing interests: The authors declare no competing interests. Ethics: UIC dataset (DS7) was approved by the institutional review board of the University of Illinois at Chicago and complied with the ethical standards stated in the Declaration of Helsinki.

Figures

Fig. 1
Fig. 1. Overview of the four phases of our framework.
a Local learning phase, in which a baseline model is trained on a particular dataset and evaluated over test set(s). b Centralized learning approach, comprising pre-training, fine-tuning, and evaluation. c Federated learning (FDL) approach, where the pre-training phase is conducted via FDL. d Domain adaptation (DAD)-FDL pipeline, where the DAD configuration is distributed before pre-training.
Fig. 2
Fig. 2. DL pipelines of the local learning and pre-training.
a Local learning pipeline, in which a pair of images and their label are input to the model after undergoing four transformations: rotation, color jittering, Gaussian blur, and Sobel filter. b Pre-training phase, during which the input image is masked and given to the reconstruction network to train the encoder over time.
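
As a rough illustration of the four transformations named in this caption, the sketch below chains them with torchvision; the parameter values are assumptions, and the Sobel step is a hand-rolled transform (SobelFilter) because torchvision does not ship one.

    # Sketch of the local-learning input pipeline: rotation, color jitter,
    # Gaussian blur, and a Sobel edge filter. Parameter values are assumed.
    import torch
    import torch.nn.functional as F
    from torchvision import transforms

    class SobelFilter:
        """Return the Sobel gradient magnitude of a (C, H, W) tensor image."""
        def __call__(self, img):
            kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
            kernels = torch.stack([kx, kx.t()]).unsqueeze(1)     # (2, 1, 3, 3)
            gray = img.mean(dim=0, keepdim=True).unsqueeze(0)    # (1, 1, H, W)
            grads = F.conv2d(gray, kernels, padding=1)           # x- and y-gradients
            return grads.pow(2).sum(dim=1, keepdim=True).sqrt().squeeze(0)

    # Expects a PIL image; returns a single-channel edge-magnitude tensor.
    local_transform = transforms.Compose([
        transforms.RandomRotation(degrees=15),
        transforms.ColorJitter(brightness=0.2, contrast=0.2),
        transforms.GaussianBlur(kernel_size=5, sigma=(0.1, 2.0)),
        transforms.ToTensor(),
        SobelFilter(),
    ])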
Fig. 3
Fig. 3. Reconstructed images from the MIM network, where 50% of each image is masked.
a1–f1 Columns of original image samples from DS1 to DS6, respectively. a2–f2 Columns of masked images. a3–f3 Columns of reconstructed images.
Fig. 4
Fig. 4. Main stages of pre-training and fine-tuning via federated learning (FDL) at the University of North Carolina at Charlotte (UNCC) and the University of Illinois Chicago (UIC).
a Server shares the initial model parameters and the configuration with all nodes. b Each FDL node pre-trains its model and sends the model’s weights to the server. c Finally, the server aggregates the weights and returns them to each client, and the clients start the fine-tuning step.
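
Panel c corresponds to a federated-averaging style aggregation step; a minimal sketch is given below, assuming sample-size weighting of client weights, which may differ from the paper's exact aggregation rule (the function name aggregate is illustrative).

    # Server-side aggregation sketch for panel c: average the clients'
    # state_dicts, weighted by how many local samples each client used.
    from collections import OrderedDict

    def aggregate(client_states, client_sizes):
        """Weighted average of client state_dicts (FedAvg-style)."""
        total = float(sum(client_sizes))
        avg = OrderedDict()
        for key in client_states[0]:
            avg[key] = sum(
                state[key].float() * (n / total)
                for state, n in zip(client_states, client_sizes)
            )
        return avg

    # One communication round, following panels a-c:
    #   a. server sends global weights and the configuration to each node
    #   b. each node pre-trains locally and returns its updated weights
    #   c. server calls aggregate(...) and redistributes the result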
Fig. 5
Fig. 5. Samples from DS1 to DS7 and their unsupervised noise-transformed versions, where all the transformed images have similar pixel value intensity, are shown in rows d to i in alphabetical order.
a1–g1 Columns of original images from DS1 to DS7, respectively. a2–g2 Columns of transformed images. h Target image used to transform other images based on its pixel value intensity.
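
One common way to push every image toward a target image's intensity distribution, as this caption describes, is histogram matching; the sketch below uses scikit-image for that purpose, though whether the paper applies histogram matching or a different intensity transform is an assumption.

    # Remap an image so its pixel-intensity histogram resembles the target's.
    # match_to_target is an illustrative helper name.
    import numpy as np
    from skimage.exposure import match_histograms

    def match_to_target(image: np.ndarray, target: np.ndarray) -> np.ndarray:
        """Histogram-match `image` (e.g., a DS1-DS7 sample) to the target image h."""
        return match_histograms(image, target)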
Fig. 6
Fig. 6. Macro AUC-ROC plot of four models over DS7.
The four models are local, centralized, DAD-FDL-1, and DAD-FDL-5; DAD-FDL-5 outperforms the other methods on DS7.
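
The macro AUC-ROC shown in this figure can be computed with scikit-learn as sketched below; y_true and y_prob are toy placeholders standing in for DS7 labels and model-predicted class probabilities.

    # Macro-averaged, one-vs-rest AUC-ROC over all classes.
    import numpy as np
    from sklearn.metrics import roc_auc_score

    y_true = np.array([0, 2, 1, 2, 0, 1])   # true class indices (toy data)
    y_prob = np.array([                      # predicted class probabilities
        [0.8, 0.1, 0.1],
        [0.1, 0.2, 0.7],
        [0.2, 0.6, 0.2],
        [0.1, 0.1, 0.8],
        [0.7, 0.2, 0.1],
        [0.3, 0.5, 0.2],
    ])
    macro_auc = roc_auc_score(y_true, y_prob, multi_class="ovr", average="macro")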
Fig. 7
Fig. 7. Eigen Grad-CAM inference from the norm layer of the final Swin Transformer BlockV2.
Models’ inference samples (choroidal neovascularization (CNV), diabetic macular edema (DME), diabetic retinopathy (DR), drusen, normal, and age-related macular degeneration (AMD)) from the DS1 to DS7 datasets, respectively. Checkmarks and crosses indicate correct and incorrect predictions, respectively.
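
Eigen Grad-CAM heatmaps like these can be produced with the pytorch-grad-cam package, as sketched below; the choice of target layer, the (B, N, C) token layout assumed in reshape_transform, and the 7x7 grid size are illustrative assumptions that depend on the specific Swin Transformer implementation.

    # Sketch of Eigen Grad-CAM on a transformer backbone. The reshape step
    # maps token outputs back to a 2D grid; its assumed layout and grid size
    # must be adapted to the actual model.
    from pytorch_grad_cam import EigenGradCAM
    from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget

    def reshape_transform(tokens, height=7, width=7):
        """Map (B, N, C) transformer tokens to a (B, C, H, W) feature map."""
        b, n, c = tokens.shape
        return tokens.reshape(b, height, width, c).permute(0, 3, 1, 2)

    def eigen_grad_cam(model, target_layer, input_tensor, class_idx):
        """Return a (H, W) heatmap for class_idx on a single input image."""
        cam = EigenGradCAM(model=model,
                           target_layers=[target_layer],
                           reshape_transform=reshape_transform)
        heatmaps = cam(input_tensor=input_tensor,
                       targets=[ClassifierOutputTarget(class_idx)])
        return heatmaps[0]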
