PNAS Nexus. 2023 Sep 6;2(9):pgad290. doi: 10.1093/pnasnexus/pgad290. eCollection 2023 Sep.

Artificial intelligence, explainability, and the scientific method: A proof-of-concept study on novel retinal biomarker discovery

Parsa Delavari et al. PNAS Nexus. 2023.

Abstract

We present a structured approach to combining the explainability of artificial intelligence (AI) with the scientific method for scientific discovery. We demonstrate the utility of this approach in a proof-of-concept study in which we uncover biomarkers from a convolutional neural network (CNN) model trained to classify patient sex in retinal images. This is a trait that is not currently recognized by diagnosticians in retinal images, yet one that CNNs classify successfully. Our methodology consists of four phases. In Phase 1, CNN development, we train a visual geometry group (VGG) model to recognize patient sex in retinal images. In Phase 2, Inspiration, we review visualizations obtained from post hoc interpretability tools to make observations and articulate exploratory hypotheses; here, we listed 14 hypotheses about retinal sex differences. In Phase 3, Exploration, we test all exploratory hypotheses on an independent dataset. Out of 14 exploratory hypotheses, nine revealed significant differences. In Phase 4, Verification, we re-tested the nine flagged hypotheses on a new dataset. Five were verified, revealing (i) significantly greater length, (ii) more nodes, and (iii) more branches of the retinal vasculature, (iv) greater retinal area covered by the vessels in the superior temporal quadrant, and (v) a darker peripapillary region in male eyes. Finally, we trained a group of ophthalmologists (N=26) to recognize the novel retinal features for sex classification. While their pretraining performance did not differ from chance level or from the performance of a nonexpert group (N=31), after training their performance increased significantly (p<0.001, d=2.63). These findings showcase the potential for retinal biomarker discovery through CNN applications, with the added utility of empowering medical practitioners with new diagnostic capabilities to enhance their clinical toolkit.

Keywords: artificial intelligence; convolutional neural networks; medical image perception; retinal biomarkers; retinal fundus images.
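
As a rough illustration of Phase 1 (CNN development), the sketch below fine-tunes a torchvision VGG-16 backbone as a binary sex classifier on fundus images. The directory layout ("fundus_train" with male/female subfolders), image size, and training hyperparameters are illustrative assumptions rather than details reported in the paper.

    # Minimal Phase 1 sketch: fine-tune VGG-16 for male/female classification of
    # fundus images. Paths and hyperparameters are placeholders, not the authors' settings.
    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader
    from torchvision import datasets, models, transforms

    transform = transforms.Compose([
        transforms.Resize((224, 224)),          # VGG-16 expects 224x224 RGB input
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    # Hypothetical layout: fundus_train/male/*.png and fundus_train/female/*.png
    train_set = datasets.ImageFolder("fundus_train", transform=transform)
    train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

    model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
    model.classifier[6] = nn.Linear(4096, 2)    # replace the 1000-way ImageNet head

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

    model.train()
    for epoch in range(10):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()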


Figures

Fig. 1.
Overview of our methodology.
Fig. 2.
A sample fundus image A) along with its binary optic disc mask B), fovea C), and vessel mask D). An illustration of the vessel graph extracted from the binary vessel mask is depicted in panel E). Lines and dots represent the edges and nodes of the obtained vessel graph, respectively.
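
In principle, a vessel graph of the kind shown in panel E) can be derived from the binary vessel mask by skeletonizing it and treating skeleton endpoints and branch points as nodes. The sketch below follows that generic recipe and is not the authors' exact pipeline; vessel length, node count, branch count, and covered area are approximated by pixel counts.

    # Generic vessel-graph statistics from a binary vessel mask
    # (an illustrative recipe, not the paper's exact method).
    import numpy as np
    from scipy.ndimage import convolve
    from skimage.morphology import skeletonize

    def vessel_graph_stats(vessel_mask: np.ndarray) -> dict:
        """vessel_mask: 2D boolean array, True on vessel pixels."""
        skeleton = skeletonize(vessel_mask)

        # Count the 8-connected neighbors of every skeleton pixel.
        kernel = np.array([[1, 1, 1],
                           [1, 0, 1],
                           [1, 1, 1]])
        neighbors = convolve(skeleton.astype(int), kernel, mode="constant")

        endpoints = skeleton & (neighbors == 1)      # graph nodes of degree 1
        branch_points = skeleton & (neighbors >= 3)  # bifurcations and crossings

        return {
            "vessel_length_px": int(skeleton.sum()),   # skeleton length as a pixel count
            "n_nodes": int(endpoints.sum() + branch_points.sum()),
            "n_branch_points": int(branch_points.sum()),
            "vessel_area_px": int(vessel_mask.sum()),  # retinal area covered by vessels
        }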
Fig. 3.
Saliency map results of sample fundus images from two male and two female patients. In each panel, the original fundus image, the Guided Grad-CAM output (3-channel image), and its color-coded amplitude (single-channel image) are shown from left to right.
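
The Grad-CAM part of such a saliency map can be sketched as follows for the VGG-16 classifier; the paper's Guided Grad-CAM additionally multiplies this class-discriminative map by guided-backpropagation gradients, which is omitted here. Taking gradients at the output of model.features (torchvision's VGG-16 convolutional stack) is a simplifying assumption.

    # Hedged Grad-CAM sketch (the guided-backpropagation factor of Guided Grad-CAM is omitted).
    import torch
    import torch.nn.functional as F

    def grad_cam(model, image, target_class):
        """image: (1, 3, H, W) tensor; returns an (H, W) map scaled to [0, 1]."""
        model.eval()
        feats = model.features(image)          # final convolutional feature maps
        feats.retain_grad()                    # keep the gradient of this non-leaf tensor
        pooled = model.avgpool(feats)
        logits = model.classifier(torch.flatten(pooled, 1))

        model.zero_grad()
        logits[0, target_class].backward()

        # Channel weights = spatially averaged gradients; weighted sum, then ReLU.
        weights = feats.grad.mean(dim=(2, 3), keepdim=True)
        cam = F.relu((weights * feats).sum(dim=1, keepdim=True))
        cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear", align_corners=False)
        cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
        return cam[0, 0].detach()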
Fig. 4.
Feature visualization results for sample fundus images. The top two rows of the middle column show original male images and the bottom two rows show original female images. The left and right columns represent feature visualizations for male and female classes, respectively.
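
Class-conditional feature visualizations of this kind are commonly produced by activation maximization, i.e. gradient ascent on the target class logit with respect to the input image. The sketch below shows that generic procedure; the step size, iteration count, weight-decay regularization, and the assumption of un-normalized pixel values in [0, 1] are illustrative choices rather than the authors' settings.

    # Generic activation-maximization sketch (illustrative hyperparameters).
    import torch

    def visualize_class(model, image, target_class, steps=200, lr=0.05, weight_decay=1e-4):
        """image: (1, 3, H, W) tensor with values in [0, 1]."""
        model.eval()
        for p in model.parameters():           # only the input image is optimized
            p.requires_grad_(False)

        x = image.clone().detach().requires_grad_(True)
        optimizer = torch.optim.Adam([x], lr=lr, weight_decay=weight_decay)

        for _ in range(steps):
            optimizer.zero_grad()
            loss = -model(x)[0, target_class]  # gradient ascent on the class logit
            loss.backward()
            optimizer.step()
            with torch.no_grad():
                x.clamp_(0.0, 1.0)             # keep pixels in a displayable range

        return x.detach()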
Fig. 5.
Accuracy in the 2-AFC sex-recognition task is shown for the pretraining and post-training blocks for the expert ophthalmologist group A) and the nonexpert group B).
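
The pre- versus post-training comparison behind this figure can be illustrated with a paired t-test on per-participant 2-AFC accuracies and a paired-samples Cohen's d (mean difference divided by the standard deviation of the differences); the abstract does not state the exact test or effect-size definition used, and the accuracy values below are placeholders.

    # Illustrative pre- vs. post-training comparison with placeholder data.
    import numpy as np
    from scipy import stats

    pre_acc = np.array([0.52, 0.48, 0.55, 0.50, 0.49, 0.53])   # placeholder per-participant accuracies
    post_acc = np.array([0.71, 0.68, 0.74, 0.66, 0.70, 0.73])

    t_stat, p_value = stats.ttest_rel(post_acc, pre_acc)        # paired t-test
    diff = post_acc - pre_acc
    cohens_d = diff.mean() / diff.std(ddof=1)                   # paired-samples effect size

    # One-sample check of the pretraining block against chance (0.5):
    t_chance, p_chance = stats.ttest_1samp(pre_acc, 0.5)

    print(f"t = {t_stat:.2f}, p = {p_value:.4f}, d = {cohens_d:.2f}")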
Fig. 6.
Accuracy in the training block is shown as a function of NOMT performance for the expert ophthalmologist group A) and the nonexpert group B).
