Nat Biomed Eng. 2022 Dec;6(12):1370-1383.
doi: 10.1038/s41551-022-00867-5. Epub 2022 Mar 29.

Detection of signs of disease in external photographs of the eyes via deep learning


Boris Babenko et al. Nat Biomed Eng. 2022 Dec.

Abstract

Retinal fundus photographs can be used to detect a range of retinal conditions. Here we show that deep-learning models trained instead on external photographs of the eyes can be used to detect diabetic retinopathy (DR), diabetic macular oedema and poor blood glucose control. We developed the models using eye photographs from 145,832 patients with diabetes from 301 DR screening sites and evaluated the models on four tasks and four validation datasets with a total of 48,644 patients from 198 additional screening sites. For all four tasks, the predictive performance of the deep-learning models was significantly higher than the performance of logistic regression models using self-reported demographic and medical history data, and the predictions generalized to patients with dilated pupils, to patients from a different DR screening programme and to a general eye care programme that included diabetics and non-diabetics. We also explored the use of the deep-learning models for the detection of elevated lipid levels. The utility of external eye photographs for the diagnosis and management of diseases should be further validated with images from different cameras and patient populations.
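As a hedged illustration of the headline comparison (a deep-learning system's scores versus a logistic regression on self-reported characteristics, compared by ROC AUC), here is a minimal sketch on synthetic placeholder data; all variable names and data are hypothetical and this is not the authors' pipeline:

```python
# Minimal sketch: compare a deep-learning system's scores against a
# logistic-regression baseline on self-reported characteristics, via ROC AUC.
# All data here are synthetic placeholders, not the study's data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
# Hypothetical stand-ins: binary outcome (e.g. HbA1c >= 9%), DLS scores,
# and baseline features (age, sex, years with diabetes, ...).
y = rng.integers(0, 2, size=n)
dls_scores = y * 0.6 + rng.normal(0, 0.5, size=n)  # fake "model output"
baseline_X = rng.normal(size=(n, 4))               # fake demographics

X_tr, X_te, y_tr, y_te, s_tr, s_te = train_test_split(
    baseline_X, y, dls_scores, test_size=0.5, random_state=0)

baseline = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
auc_baseline = roc_auc_score(y_te, baseline.predict_proba(X_te)[:, 1])
auc_dls = roc_auc_score(y_te, s_te)
print(f"baseline AUC={auc_baseline:.3f}  DLS AUC={auc_dls:.3f}")
```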


Conflict of interest statement

B.B., A.M., N.K., G.S.C., L.P., D.R.W., A.V., N.H. and Y.L. are employees of Google LLC, own Alphabet stock and are co-inventors on patents (in various stages) for machine learning using medical images. I.T. is a consultant for Google LLC. J.C. is the CEO of EyePACS Inc. A.Y.M. declares no competing interests.

Figures

Fig. 1. Extracting insights from external photographs of the front of the eye.
a, Diabetes-related complications can be diagnosed by using specialized cameras to take fundus photographs, which visualize the posterior segment of the eye. By contrast, anterior imaging using a standard consumer-grade camera can reveal conditions affecting the eyelids, conjunctiva, cornea and lens. In this work, we show that external photographs of the eye can offer insights into diabetic retinal disease and detect poor blood sugar control. These images are shown for illustrative purposes only and do not necessarily belong to the same subject. b, Our external eye DLS was developed on data from California (CA, yellow) and evaluated on data from 18 other US states (light blue). c, AUCs for the four validation datasets. The prediction tasks shown are poor sugar control (HbA1c ≥9%), elevated lipids (total cholesterol (Tch) ≥240 mg dl⁻¹, triglycerides (Tg) ≥200 mg dl⁻¹), moderate+ DR, DME, VTDR and a positive control: cataract. Some targets were not available in validation sets C and D. Error bars are 95% CIs computed using the DeLong method. Receiver operating characteristic (ROC) curves are shown in Extended Data Fig. 3, and Supplementary Table 1 contains numerical AUC values along with P values for improvement. d, As in c, but for PPV. The thresholds used for the PPV are based on the 5% of patients with the highest predicted likelihood. Error bars are 95% bootstrap CIs. PPV across multiple thresholds is plotted in Extended Data Fig. 1. ᵃ Baseline characteristics models for validation sets A and B include self-reported age, sex, race/ethnicity and years with diabetes and were trained on the training dataset. ᵇ Baseline characteristics models for validation sets C and D include self-reported age and sex and were trained directly on the respective sets owing to large differences in patient population compared with the development set; they therefore overestimate baseline performance. ᶜ Pre-specified primary prediction tasks.
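The PPV metric in Fig. 1d (PPV among the top 5% of patients by predicted likelihood, with 95% bootstrap CIs) can be sketched as follows; the data are synthetic placeholders and the bootstrap details (2,000 resamples, percentile intervals) are assumptions:

```python
# Sketch: PPV among the top 5% of patients by predicted likelihood,
# with a 95% bootstrap CI. Synthetic data; mirrors the thresholding
# described in Fig. 1d, not the authors' exact code.
import numpy as np

def ppv_at_top_fraction(y_true, scores, frac=0.05):
    """PPV when the top `frac` of scores are called positive."""
    k = max(1, int(len(scores) * frac))
    top = np.argsort(scores)[-k:]          # indices of the highest scores
    return y_true[top].mean()

def bootstrap_ci(y_true, scores, frac=0.05, n_boot=2000, seed=0):
    rng = np.random.default_rng(seed)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))  # resample patients
        stats.append(ppv_at_top_fraction(y_true[idx], scores[idx], frac))
    return np.percentile(stats, [2.5, 97.5])

rng = np.random.default_rng(1)
y = rng.integers(0, 2, 1000)               # fake binary outcomes
s = y + rng.normal(0, 1, 1000)             # fake model scores
print(ppv_at_top_fraction(y, s), bootstrap_ci(y, s))
```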
Fig. 2. Importance of different regions of the image and impact of image resolution.
a, Sample images with different parts of the image visible. Top: only a central circular region is visible. Bottom: only the peripheral rim is visible. b, AUCs as a function of the fraction of input pixels visible when the DLS sees only the central (blue curve) or peripheral (green curve) region. A large gap between the blue and green curves (for example, for cataract) indicates that most of the information lies in the region corresponding to the blue curve. c, Sample images at different resolutions. Images were down-sampled to a given size (for example, '5' indicates 5 × 5 pixels) and then up-sampled back to the original input size of 587 × 587 pixels, ensuring that the network has exactly the same capacity (Methods). d, AUCs for the different input image resolutions. Error bars are 95% CIs computed using the DeLong method. ᵃ Pre-specified primary prediction tasks.
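A minimal sketch of the two ablations described above, assuming bilinear interpolation and a black fill for masked pixels (both choices are assumptions, not taken from the paper):

```python
# Sketch of the resolution ablation in Fig. 2c,d: down-sample an image to
# k x k pixels, then up-sample back to the native 587 x 587 input so the
# network itself is unchanged. Also a centre/periphery mask as in Fig. 2a.
import numpy as np
from PIL import Image

NATIVE = 587  # input resolution reported in the caption

def degrade_resolution(img: Image.Image, k: int) -> Image.Image:
    small = img.resize((k, k), Image.BILINEAR)              # down-sample
    return small.resize((NATIVE, NATIVE), Image.BILINEAR)   # up-sample back

def mask_region(img: Image.Image, radius_frac: float, keep_center: bool) -> Image.Image:
    """Black out everything except a central disc (or except the rim)."""
    arr = np.array(img.resize((NATIVE, NATIVE)))
    yy, xx = np.mgrid[:NATIVE, :NATIVE]
    c = NATIVE / 2
    inside = (yy - c) ** 2 + (xx - c) ** 2 <= (radius_frac * c) ** 2
    keep = inside if keep_center else ~inside
    arr[~keep] = 0                                          # black fill
    return Image.fromarray(arr)
```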
Fig. 3. Qualitative and quantitative saliency analysis illustrating the influence of various regions of the image towards the prediction.
a, Overview of anterior eye anatomy. b–h, Eye images from patients with HbA1c ≥9% (b), Tch ≥240 mg dl⁻¹ (c), Tg ≥200 mg dl⁻¹ (d), moderate+ DR (e), DME (f), VTDR (g) and cataract (h). Illustrations are the result of averaging images or saliency maps from 100 eyes, with laterality manually adjusted via horizontal flips so that the nasal side is on the right (Methods). Left: the original averaged image. Right: a similarly averaged saliency map from GradCAM. Saliency maps from other methods are shown in Supplementary Fig. 4a–g. ᵃ Pre-specified primary prediction tasks.
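A minimal sketch of the averaging procedure described in the caption, assuming a hypothetical `gradcam` function standing in for any GradCAM implementation:

```python
# Sketch of the averaging in Fig. 3: flip left eyes horizontally so the
# nasal side is consistently on the right, then average both the images
# and their saliency maps over a cohort. `gradcam` is a hypothetical
# callable returning an HxW heatmap for an HxWx3 image.
import numpy as np

def averaged_image_and_saliency(images, is_left_eye, gradcam):
    """images: list of HxWx3 arrays; is_left_eye: list of bools."""
    imgs, sals = [], []
    for img, left in zip(images, is_left_eye):
        if left:
            img = img[:, ::-1]             # horizontal flip for laterality
        imgs.append(img.astype(float))
        sals.append(gradcam(img))          # HxW saliency heatmap
    return np.mean(imgs, axis=0), np.mean(sals, axis=0)
```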
Extended Data Fig. 1. Curves of positive predictive value (PPV) as a function of threshold for various predictions using external eye images.
a, Poor sugar control (HbA1c ≥9%). b,c, Elevated lipids (total cholesterol ≥240 mg dl⁻¹ and triglycerides ≥200 mg dl⁻¹). d, Moderate-or-worse diabetic retinopathy (DR). e, Diabetic macular oedema (DME). f, Vision-threatening DR (VTDR). g, A positive control: cataract. In these plots, the x axis indicates the percentage of patients predicted to be positive; for example, 5% means that the top 5% of patients by predicted likelihood were categorized as 'positive', and the curves indicate the PPV at that threshold. The curves are truncated at the extreme end (below 0.5% of patients predicted positive, where confidence intervals are wide) to reduce noise and improve clarity. Shaded areas indicate 95% bootstrap confidence intervals. Empty panels indicate data unavailable in validation sets C and D. (*) Baseline characteristics models for validation sets A and B include self-reported age, sex, race/ethnicity and years with diabetes and were trained on the training dataset. (+) Baseline characteristics models for validation sets C and D use self-reported age and sex and were trained directly on validation sets C and D owing to large differences in patient population compared with the development set. (‡) Pre-specified primary prediction tasks.
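A sketch of how such a PPV-versus-threshold curve can be computed (synthetic data; the 0.5% truncation follows the caption):

```python
# Sketch of the PPV-vs-threshold curves in Extended Data Fig. 1: sweep the
# fraction of patients called positive from 0.5% to 100% and record the
# PPV at each point. Synthetic data, not the study's.
import numpy as np

def ppv_curve(y_true, scores, fracs):
    order = np.argsort(scores)[::-1]       # highest scores first
    sorted_y = y_true[order]
    cum_pos = np.cumsum(sorted_y)          # positives among the top k
    ks = np.maximum(1, (fracs * len(scores)).astype(int))
    return cum_pos[ks - 1] / ks            # PPV at each threshold

fracs = np.linspace(0.005, 1.0, 200)       # truncate below 0.5%
rng = np.random.default_rng(2)
y = rng.integers(0, 2, 5000)               # fake binary outcomes
s = y + rng.normal(0, 1, 5000)             # fake model scores
ppvs = ppv_curve(y, s, fracs)
```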
Extended Data Fig. 2. Curves of negative predictive values (NPV) as a function of threshold for various predictions using external eye images.
a, Poor sugar control (HbA1c ≥9%). b,c, Elevated lipids (total cholesterol ≥240 mg dl⁻¹ and triglycerides ≥200 mg dl⁻¹). d, Moderate-or-worse diabetic retinopathy (DR). e, Diabetic macular oedema (DME). f, Vision-threatening DR (VTDR). g, A positive control: cataract. This is the NPV counterpart of Extended Data Fig. 1. The curves are truncated at the extreme end (below 1% of patients predicted negative, where confidence intervals are wide) to reduce noise and improve clarity. Shaded areas indicate 95% bootstrap confidence intervals. Empty panels indicate data unavailable in validation sets C and D. (*) Baseline characteristics models for validation sets A and B include self-reported age, sex, race/ethnicity and years with diabetes and were trained on the training dataset. (+) Baseline characteristics models for validation sets C and D use self-reported age and sex and were trained directly on validation sets C and D owing to large differences in patient population compared with the development set. (‡) Pre-specified primary prediction tasks.
Extended Data Fig. 3. Receiver operating characteristic curves (ROCs) for various predictions using external eye images.
a, Poor sugar control (HbA1c ≥9%). b,c, Elevated lipids (total cholesterol ≥240 mg dl⁻¹ and triglycerides ≥200 mg dl⁻¹). d, Moderate-or-worse diabetic retinopathy (DR). e, Diabetic macular oedema (DME). f, Vision-threatening DR (VTDR). g, A positive control: cataract. Sample sizes ('N' for the number of visits and 'n' for the number of positive visits), areas under the ROC curve (AUCs) and P values for the differences are provided in Supplementary Table 1. Empty panels indicate data unavailable in validation sets C and D. (*) Baseline characteristics models for validation sets A and B include self-reported age, sex, race/ethnicity and years with diabetes and were trained on the training dataset. (+) Baseline characteristics models for validation sets C and D use self-reported age and sex and were trained directly on validation sets C and D owing to large differences in patient population compared with the development set. (‡) Pre-specified primary prediction tasks.
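The paper tests AUC differences with the DeLong method; as a simpler, hedged stand-in, a paired bootstrap of the AUC difference on the same patients can be sketched as follows (this substitutes a bootstrap for DeLong and uses synthetic data):

```python
# Sketch: paired bootstrap CI for the difference in ROC AUC between a DLS
# and a baseline model evaluated on the same patients. A stand-in for the
# DeLong comparison, not an implementation of it.
import numpy as np
from sklearn.metrics import roc_auc_score

def paired_bootstrap_auc_diff(y, s_dls, s_base, n_boot=2000, seed=0):
    rng = np.random.default_rng(seed)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), len(y))   # resample patients jointly
        if len(np.unique(y[idx])) < 2:          # AUC needs both classes
            continue
        diffs.append(roc_auc_score(y[idx], s_dls[idx])
                     - roc_auc_score(y[idx], s_base[idx]))
    lo, hi = np.percentile(diffs, [2.5, 97.5])
    return np.mean(diffs), (lo, hi)
```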
Extended Data Fig. 4. Saliency analysis illustrating the influence of various regions of the image towards the prediction.
a–g, Figures are generated in the same manner as in Fig. 3, but with different saliency methods: integrated gradients on the left and guided backpropagation on the right. h,i, Quantification of the pixel intensity in the averaged saliency heatmaps. (‡) Pre-specified primary prediction tasks.
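One plausible way to quantify intensity in an averaged saliency heatmap is to average it over radial bands around the image centre; this is purely an illustrative assumption, not the authors' method:

```python
# Illustrative sketch: summarize an averaged saliency heatmap by its mean
# intensity within concentric radial bands around the image centre.
import numpy as np

def radial_profile(heatmap, n_bands=10):
    """heatmap: 2D array; returns mean intensity per radial band."""
    h, w = heatmap.shape
    yy, xx = np.mgrid[:h, :w]
    r = np.hypot(yy - h / 2, xx - w / 2)
    r = r / r.max()                                  # radius in [0, 1]
    bands = np.minimum((r * n_bands).astype(int), n_bands - 1)
    return np.array([heatmap[bands == b].mean() for b in range(n_bands)])
```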

Comment in

  • Eyeing severe diabetes upfront.
Teo ZL, Ting DSW. Nat Biomed Eng. 2022 Dec;6(12):1321-1322. doi: 10.1038/s41551-022-00879-1. PMID: 35411115. No abstract available.
