Ophthalmol Sci. 2022 Sep 13;3(1):100222.
doi: 10.1016/j.xops.2022.100222. eCollection 2023 Mar.

Visual Field Prediction: Evaluating the Clinical Relevance of Deep Learning Models

Mohammad Eslami et al. Ophthalmol Sci. 2022.

Abstract

Purpose: Two novel deep learning methods using a convolutional neural network (CNN) and a recurrent neural network (RNN) have recently been developed to forecast future visual fields (VFs). Although the original evaluations of these models focused on overall accuracy, it was not assessed whether they can accurately identify patients with progressive glaucomatous vision loss to aid clinicians in preventing further decline. We evaluated these 2 prediction models for potential biases in overestimating or underestimating VF changes over time.

Design: Retrospective observational cohort study.

Participants: All available and reliable Swedish Interactive Thresholding Algorithm Standard 24-2 VFs from Massachusetts Eye and Ear Glaucoma Service collected between 1999 and 2020 were extracted. Because of the methods' respective needs, the CNN data set included 54 373 samples from 7472 patients, and the RNN data set included 24 430 samples from 1809 patients.

Methods: The CNN and RNN methods were reimplemented. A fivefold cross-validation procedure was performed on each model, and pointwise mean absolute error (PMAE) was used to measure prediction accuracy. Test data were stratified into categories based on the severity of VF progression to investigate the models' performances on predicting worsening cases. The models were additionally compared with a no-change model that uses the baseline VF (for the CNN) and the last-observed VF (for the RNN) for its prediction.
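To make the accuracy metric and the no-change comparison concrete, the following is a minimal sketch rather than the study's implementation. It assumes a VF is a flat list of pointwise sensitivities in dB; the function names and the 4-point fields are hypothetical and chosen only for illustration (a real 24-2 field has many more test points).

```python
from statistics import mean


def pmae(predicted, actual):
    """Pointwise mean absolute error (dB) between two visual fields."""
    if len(predicted) != len(actual):
        raise ValueError("visual fields must have the same number of test points")
    return mean(abs(p - a) for p, a in zip(predicted, actual))


def no_change_prediction(reference_vf):
    """No-change model: the forecast is a copy of the reference VF
    (the baseline VF for the CNN, the last-observed VF for the RNN)."""
    return list(reference_vf)


# Hypothetical 4-point fields in dB, for illustration only.
baseline = [30.0, 28.0, 26.0, 24.0]
follow_up = [29.0, 25.0, 20.0, 24.0]
model_forecast = [30.0, 27.0, 25.0, 24.0]

model_error = pmae(model_forecast, follow_up)
no_change_error = pmae(no_change_prediction(baseline), follow_up)
print(f"model PMAE = {model_error} dB, no-change PMAE = {no_change_error} dB")
```

A model adds value on a sample only when its PMAE is lower than that of the no-change forecast for the same sample.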

Main outcome measures: PMAE in predictions.

Results: The overall PMAE 95% confidence intervals were 2.21 to 2.24 decibels (dB) for the CNN and 2.56 to 2.61 dB for the RNN, which were close to the original studies' reported values. However, both models exhibited large errors in identifying patients with worsening VFs and often failed to outperform the no-change model. Pointwise mean absolute error values were higher in patients with greater changes in mean sensitivity (for the CNN) and mean total deviation (for the RNN) between baseline and follow-up VFs.
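The stratified comparison can be sketched as follows, under stated assumptions: samples are grouped by the change in mean sensitivity between baseline and follow-up, and PMAE is averaged within each group. The cut-offs (-3 dB and -1 dB), the group labels, and the sample values are hypothetical and are not the progression categories used in the study.

```python
from collections import defaultdict
from statistics import mean


def stratum(delta_ms, edges=(-3.0, -1.0)):
    """Assign a sample to a progression group by change in mean sensitivity.
    The cut-offs are hypothetical placeholders, not the study's categories."""
    if delta_ms <= edges[0]:
        return "marked worsening"
    if delta_ms <= edges[1]:
        return "mild worsening"
    return "stable or improving"


def stratified_pmae(samples):
    """samples: iterable of (baseline, follow_up, forecast) visual fields."""
    groups = defaultdict(list)
    for baseline, follow_up, forecast in samples:
        delta_ms = mean(follow_up) - mean(baseline)
        error = mean(abs(f - t) for f, t in zip(forecast, follow_up))
        groups[stratum(delta_ms)].append(error)
    return {name: mean(errors) for name, errors in groups.items()}


# Hypothetical 2-point fields in dB, for illustration only.
samples = [
    ([30.0, 30.0], [30.0, 29.0], [30.0, 30.0]),  # nearly stable eye
    ([30.0, 30.0], [24.0, 26.0], [29.0, 29.0]),  # worsening eye
]
result = stratified_pmae(samples)
print(result)
```

Reporting PMAE per group, rather than one pooled value, keeps a large stable majority from masking poor performance on the worsening minority.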

Conclusions: Although our evaluation confirms the low overall PMAEs reported in the original studies, our findings also reveal that both models severely underpredict worsening of VF loss. Because the accurate detection and projection of glaucomatous VF decline is crucial in ophthalmic clinical practice, we recommend that this consideration be explicitly taken into account when developing and evaluating future deep learning models.
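Because PMAE is unsigned, it cannot show the direction of an error. One way to express underprediction of worsening is a signed error in mean sensitivity, sketched below. The sign convention (forecast minus measurement), the function name, and the values are assumptions made for illustration and are not taken from the study.

```python
from statistics import mean


def signed_ms_error(forecast, follow_up):
    """Forecast mean sensitivity minus measured mean sensitivity (dB).
    Assumed convention: a positive value means the forecast was more
    optimistic than the measured outcome."""
    return mean(forecast) - mean(follow_up)


# Hypothetical 4-point fields in dB, for illustration only.
forecast = [28.0, 28.0, 27.0, 27.0]
follow_up = [25.0, 24.0, 23.0, 22.0]

bias = signed_ms_error(forecast, follow_up)
underpredicts_worsening = bias > 0
print(f"signed error = {bias} dB, underpredicts worsening: {underpredicts_worsening}")
```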

Keywords: Artificial intelligence; CI, confidence interval; CNN, convolutional neural network; DL, deep learning; Deep learning; Glaucoma; MD, mean deviation; MPark, recurrent neural network method from Park et al; MWen, convolutional neural network method from Wen et al; PMAE, pointwise mean absolute error; Prediction; RNN, recurrent neural network; ROP, rate of progression; TD, total deviation; VF, visual field; Visual fields; dB, decibel.

Figures

Figure 1
Simplified illustrations of the MWen (A) and MPark (B) methods. LSTM = long short-term memory; MPark = recurrent neural network method from Park et al; MWen = convolutional neural network method from Wen et al; VF = visual field.
Figure 2
Boxplots of pointwise mean absolute error (PMAE) achieved by the models. (A) The accuracy of MWen over all test samples and (C) with respect to prediction time points. (B) The accuracy of MPark over all test samples and (D) with respect to prediction time points. ∗0.01 < P ≤ 0.05; ∗∗∗∗P ≤ 0.0001. MPark = recurrent neural network method from Park et al; MWen = convolutional neural network method from Wen et al; ns = not significant; ROP = rate of progression.
Figure 3
Boxplots of pointwise mean absolute error (PMAE) achieved by the models, stratified based on the severity of visual field progression. (A) The accuracy of MWen over the test set with respect to prediction time points and partitioned based on changes in MD (ΔMD = MD_target^truth − MD_baseline). (B) The accuracy of MPark over the test set and partitioned based on progression analyzed by the methods of Rabiolo et al, Nouri et al, Schell et al, and Aptel et al. ∗∗∗0.0001 < P ≤ 0.001; ∗∗∗∗P ≤ 0.0001. MD = mean deviation; MPark = recurrent neural network method from Park et al; MWen = convolutional neural network method from Wen et al; ns = not significant; ROP = rate of progression.
Figure 4
Eight random representative examples of visual field predictions from MWen. (A–D) are more stable samples, and (E–H) are worsening samples. MWen = convolutional neural network method from Wen et al; PMAE = pointwise mean absolute error.
Figure 5
Six random representative examples of visual field predictions from MPark. (A–C) are more stable samples, and (D–F) are worsening samples. MPark = recurrent neural network method from Park et al; PMAE = pointwise mean absolute error.
Figure 6
Distributions of the (A) MWen and (B) MPark training data sets with respect to progression categories (I–VI), and boxplots of pointwise mean absolute error (PMAE) regarding sampling strategies for (C) MWen and (D) MPark. The prediction intervals for MPark are arbitrary and not limited to the 5 prediction time points like MWen. ∗∗∗∗P ≤ 0.0001. MPark = recurrent neural network method from Park et al; MWen = convolutional neural network method from Wen et al.
Figure 7
Scatterplots of prediction results for MWen (left column) and MPark (right column) regarding mean sensitivity (MS) and mean total deviation (MTD) values. (A, B) show the predicted vs. true, targeted values. (C, D) show the error in prediction vs. actual measured change. Although both methods’ predicted values are spread close to y = x (predicted = ground truth) in the top row, the bottom row shows that both methods have significant inaccuracy in forecasting worsening cases; the ideal unbiased prediction model should be near the green dashed line. MPark = recurrent neural network method from Park et al; MWen = convolutional neural network method from Wen et al.
