NPJ Digit Med. 2025 Jul 13;8(1):435. doi: 10.1038/s41746-025-01849-y

Diagnosing pathologic myopia by identifying morphologic patterns using ultra widefield images with deep learning

Yang Liu et al.

Abstract

Pathologic myopia is a leading cause of visual impairment and blindness. While deep learning-based approaches aid in recognizing pathologic myopia using color fundus photography, they often rely on implicit patterns that lack clinical interpretability. This study aims to diagnose pathologic myopia by identifying clinically significant morphologic patterns, specifically posterior staphyloma and myopic maculopathy, by leveraging ultra-widefield (UWF) images that provide a broad retinal field of view. We curate a large-scale, multi-source UWF myopia dataset called PSMM and introduce RealMNet, an end-to-end lightweight framework designed to identify these challenging patterns. Benefiting from the fast pretraining distillation backbone, RealMNet comprises only 21 million parameters, which facilitates deployment for medical devices. Extensive experiments conducted across three different protocols demonstrate the robustness and generalizability of RealMNet. RealMNet achieves an F1 Score of 0.7970 (95% CI 0.7612-0.8328), mAP of 0.8497 (95% CI 0.8058-0.8937), and AUROC of 0.9745 (95% CI 0.9690-0.9801), showcasing promise in clinical applications.


Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Fig. 1. General overview of the study.
a Data machining: data are collected from one main center and four auxiliary centers. After double-checking labeling, quality filtering, and essential processing, a stratified partition is implemented to ensure that the distribution of lesions remains similar across sets. Resampling and augmentation techniques are then used to alleviate label imbalance. b Model training and inference: the pretraining-distilled small parametric model is task-specifically fine-tuned with asymmetric focusing and classifier adaptation, which complementarily mitigate label imbalance. c Experimental protocols: three protocols are designed to demonstrate precise inference, robustness, and generalizability of the proposed method. All experiments are implemented by bootstrapping the testing set 1000 times. d Reasoned workflow: model efficiencies of dataset labeling, training parameters, regularization techniques, and focusing regions are extensively examined. Visualizations of gradient-weighted class activation mapping are provided for intuitive interpretations. e Model development and assessment: models are progressively developed through strategy determination, and their performance is assessed on a unified deployment platform using all-around evaluation metrics.
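The caption above mentions fine-tuning with "asymmetric focusing" to mitigate label imbalance. The paper does not spell out the loss in this excerpt, so the following is a minimal numpy sketch of a generic asymmetric focal loss for multi-label classification, in which positives and negatives get separate focusing exponents; the function name and all hyperparameter values (`gamma_pos`, `gamma_neg`, `clip`) are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def asymmetric_focal_loss(logits, targets, gamma_pos=0.0, gamma_neg=4.0, clip=0.05):
    """Sketch of an asymmetric focusing loss for multi-label outputs.

    Negatives are down-weighted with a larger focusing exponent than
    positives, so abundant easy negatives contribute less than rare
    positive labels; `clip` shifts negative probabilities so that very
    easy negatives drop out of the loss entirely.
    """
    p = 1.0 / (1.0 + np.exp(-logits))        # per-label sigmoid probability
    p_neg = np.clip(p - clip, 0.0, 1.0)      # probability shifting for negatives
    loss_pos = targets * (1 - p) ** gamma_pos * np.log(np.clip(p, 1e-8, 1.0))
    loss_neg = (1 - targets) * p_neg ** gamma_neg * np.log(np.clip(1 - p_neg, 1e-8, 1.0))
    return -np.mean(loss_pos + loss_neg)
```

With these defaults, a confidently wrong positive (large negative logit, target 1) incurs a much larger loss than an easy positive, while an already-easy negative contributes almost nothing.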
Fig. 2
Fig. 2. Statistics and complications associated with lesions of posterior staphyloma and myopic maculopathy.
a Statistical analysis of the seven categories in the PSMM dataset and its subsets, with specific values assigned to the minimum two categories of each dataset. b Illustrations of complications arising from posterior staphyloma and myopic maculopathy. Sankey diagrams are plotted to illustrate the distribution of these complications in the PSMM dataset and its subsets.
Fig. 3
Fig. 3. Model performance under the centralized inference protocol.
a The proposed models are compared to four well-known benchmarks: DeiT, ConvNeXt, EfficientNet, and Swin Transformer. b The proposed models are compared to two recent foundation models: DINOv2 and VisionFM. The error bars represent the 95% confidence interval of the estimates, and the bar center represents the mean estimate of the displayed metric. The estimates are computed by generating a bootstrap distribution with 1000 bootstrap samples for corresponding testing sets with n = 1000 samples. All P-values are computed with a two-sided t-test between RealMNet-Max and the most competitive comparison model to determine if there are statistically significant differences.
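The confidence intervals described above come from resampling the test set 1000 times. A minimal percentile-bootstrap sketch of that procedure (function name and `seed` are illustrative; the paper's exact resampling code is not shown in this excerpt):

```python
import numpy as np

def bootstrap_ci(metric_fn, y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap: resample the test set with replacement
    n_boot times, recompute the metric each time, and report the mean
    estimate with a (1 - alpha) confidence interval."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)     # resample indices with replacement
        stats[b] = metric_fn(y_true[idx], y_pred[idx])
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return stats.mean(), (lo, hi)
```

For example, passing accuracy as `metric_fn` yields a mean accuracy with a 95% interval; any scalar metric (F1, mAP, AUROC) can be plugged in the same way.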
Fig. 4
Fig. 4. Model performance under the main-source robustness protocol and cyclic-source generalizability protocol.
a Assessing model robustness by training on the main source subset and testing on four auxiliary source subsets under the main-source robustness protocol. b Assessing model generalizability by training on the main source subset combined with three of the four auxiliary source subsets and testing on the remaining subset under the cyclic-source generalizability protocol. The error bars represent the 95% confidence interval of the estimates, and the bar center represents the mean estimate of the displayed metric. The estimates are computed by generating a bootstrap distribution with 1000 bootstrap samples for corresponding testing sets with n=1000 samples.
Fig. 5
Fig. 5. Efficiency of RealMNet in identifying posterior staphyloma and myopic maculopathy on the PSMM dataset.
a Labeling efficiency: we progressively increase the amount of training data and labels to achieve precise and stable performance. b Parameter efficiency: we freeze training parameters from different stages to observe the contribution of each stage. c Augmentation efficiency: we ablate two types of augmentation techniques, namely simulated augmentation (S-Augmentation) and batch-wise augmentation (B-Augmentation), to observe the performance gains that RealMNet obtains from these techniques. The error bars represent the 95% confidence interval of the estimates, and the bar center represents the mean estimate of the displayed metric. The estimates are computed by generating a bootstrap distribution with 1000 bootstrap samples for corresponding testing sets with n = 1000 samples. All P values are computed with a two-sided t-test between the original model and its variants to determine if there are statistically significant differences. The bars marked 'ns' are not significant. The asterisks indicate statistically significant differences: *P < 0.05; **P < 0.01; ***P < 0.001.
Fig. 6
Fig. 6. Investigating the superiority of UWF modality.
a Comparing the performance of models trained on UWF images with and without boundary segmentation. b Comparing the performance of models trained on data with original UWF and fake CFP images. The error bars represent the 95% confidence interval of the estimates, and the bar center represents the mean estimate of the displayed metric. The estimates are computed by generating a bootstrap distribution with 1000 bootstrap samples for corresponding testing sets with n=1000 samples. All P values are computed with a two-sided t-test between two comparison models to determine if there are statistically significant differences.
Fig. 7
Fig. 7. We generated visualizations using an improved version of gradient-weighted class activation mapping.
These visualizations show the qualitative predictions of RealMNet for the presence of posterior staphyloma (NoPS or PS) and myopic maculopathy with five categories: no myopic retinal lesions (NoMRL), tessellated fundus only (TFO), diffuse chorioretinal atrophy (DCA), patchy chorioretinal atrophy (PCA), and macular atrophy (MA). By merging the heatmaps with the original images, we highlighted irregular attentive regions that correspond to diverse morphologic patterns found in different lesion categories when the model made decisions. These heatmaps provide a qualitative reference for clinicians when making further diagnoses.
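Gradient-weighted class activation mapping, as used for the figure above, weights each convolutional activation map by the spatial average of the target class's gradient, sums the weighted maps, and passes the result through a ReLU. A minimal numpy sketch of that core computation follows; the framework-specific hook mechanics for capturing activations and gradients (and whatever improvements the authors' version adds) are omitted, and the function name is an assumption.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Core Grad-CAM computation.

    activations, gradients: arrays of shape (channels, H, W) holding a
    conv layer's forward activations and the gradients of the target
    class score with respect to them (typically captured via hooks).
    Returns a heatmap of shape (H, W) normalized to [0, 1].
    """
    weights = gradients.mean(axis=(1, 2))    # alpha_k: global-average-pooled gradients
    cam = (weights[:, None, None] * activations).sum(axis=0)
    cam = np.maximum(cam, 0.0)               # ReLU keeps positively contributing regions
    if cam.max() > 0:
        cam /= cam.max()                     # normalize for overlay on the fundus image
    return cam
```

The normalized heatmap can then be color-mapped and alpha-blended onto the original UWF image, which is how the "merged" visualizations in the figure are typically produced.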
Fig. 8
Fig. 8. Identifying complicated peripheral lesions.
a Concurrent distribution of peripheral retinal lesions: no peripheral lesion (NoPL), lattice degeneration or cystic retinal tuft (LDoCRT), holes or tears (HoT), rhegmatogenous retinal detachment (RRD), and postoperative cases (PC). Peripheral lesions may have different concurrent relationships with each other, or they may occur separately. b Model performance on peripheral lesion identification. The blue facecolor represents the mean of the results, and the green outer and red inner boundaries represent the upper and lower bounds of the 95% confidence interval, respectively. All radar plots display class-wise performance on specific metrics, with the last radar plot representing the average performance on all evaluated metrics.
