Ophthalmology. 2022 Jul;129(7):e69-e76.

doi: 10.1016/j.ophtha.2022.02.008. Epub 2022 Feb 12.

Artificial Intelligence for Retinopathy of Prematurity: Validation of a Vascular Severity Scale against International Expert Diagnosis

J Peter Campbell¹, Michael F Chiang², Jimmy S Chen³, Darius M Moshfeghi⁴, Eric Nudleman⁵, Paisan Ruamviboonsuk⁶, Hunter Cherwek⁷, Carol Y Cheung⁸, Praveer Singh⁹, Jayashree Kalpathy-Cramer⁹, Susan Ostmo³, Malvina Eydelman¹⁰, R V Paul Chan¹¹, Antonio Capone Jr¹²; Collaborative Community in Ophthalmic Imaging Executive Committee and the Collaborative Community in Ophthalmic Imaging Retinopathy of Prematurity Workgroup

Collaborators

Collaborative Community in Ophthalmic Imaging Executive Committee and the Collaborative Community in Ophthalmic Imaging Retinopathy of Prematurity Workgroup:
Audina Berrocal, Gil Binenbaum, Michael Blair, J Peter Campbell, Antonio Capone Jr, R V Paul Chan, Yi Chen, Michael F Chiang, Shuan Dai, Anna Ells, Alistair Fielder, Brian Fleck, William Good, Mary Elizabeth Hartnett, Gerd Holmstrom, Shunji Kusaka, Andres Kychenthal, Domenico Lepore, Birgit Lorenz, Maria Ana Martinez-Castellanos, Sengul Ozdek, Dupe Popoola, Graham Quinn, James Reynolds, Parag Shah, Michael Shapiro, Andreas Stahl, Cynthia Toth, Anand Vinekar, Linda Visser, David Wallace, Wei-Chi Wu, Peiquan Zhao, Andrea Zin, Michael Abramoff, Mark Blumenkranz, Malvina Eydelman, David Myung, Joel S Schuman, Carol Shields, Aaron Lee, Michael Repka, Michael F Chiang, J Peter Campbell, Darius M Moshfeghi, Eric Nudleman, Paisan Ruamviboonsuk, D Hunter Cherwek, Carol Y Cheung, R V Paul Chan, Antonio Capone Jr

Affiliations

¹ Casey Eye Institute, Department of Ophthalmology, Oregon Health & Science University, Portland, Oregon. Electronic address: campbelp@ohsu.edu.
² National Eye Institute, National Institutes of Health, Bethesda, Maryland.
³ Casey Eye Institute, Department of Ophthalmology, Oregon Health & Science University, Portland, Oregon.
⁴ Byers Eye Institute, Horngren Family Vitreoretinal Center, Department of Ophthalmology, Stanford University, Palo Alto, California.
⁵ Department of Ophthalmology, University of California, San Diego, California.
⁶ Department of Ophthalmology, Rajavithi Hospital, Bangkok, Thailand.
⁷ Orbis International, New York, New York.
⁸ Department of Ophthalmology and Visual Sciences, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China.
⁹ Department of Radiology, MGH/Harvard Medical School, Charlestown, Massachusetts; Massachusetts General Hospital & Brigham and Women's Hospital Center for Clinical Data Science, Boston, Massachusetts.
¹⁰ Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, Maryland.
¹¹ Department of Ophthalmology and Visual Sciences, University of Illinois at Chicago, Chicago, Illinois.
¹² Associated Retinal Consultants, Oakland University William Beaumont School of Medicine, Royal Oak, Michigan.

PMID: 35157950
PMCID: PMC9232863
DOI: 10.1016/j.ophtha.2022.02.008

J Peter Campbell et al. Ophthalmology. 2022 Jul.

Abstract

Purpose: To validate a vascular severity score as an appropriate output for artificial intelligence (AI) Software as a Medical Device (SaMD) for retinopathy of prematurity (ROP) through comparison with ordinal disease severity labels for stage and plus disease assigned by the International Classification of Retinopathy of Prematurity, Third Edition (ICROP3), committee.

Design: Validation study of an AI-based ROP vascular severity score.

Participants: A total of 34 ROP experts from the ICROP3 committee.

Methods: Two separate datasets of 30 fundus photographs each for stage (0-5) and plus disease (plus, preplus, neither) were labeled by members of the ICROP3 committee using an open-source platform. Averaging these results produced a continuous label for plus (1-9) and stage (1-3) for each image. Experts were also asked to compare the images pairwise for relative severity of plus disease. Each image was also labeled with a vascular severity score from the Imaging and Informatics in ROP deep learning system, which was tested for correlation with each grader's diagnostic labels and with the ophthalmoscopic diagnosis of stage.
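
The averaging step described in the Methods can be illustrated with a minimal sketch. It assumes each expert label has already been coded as an integer on the relevant ordinal scale; the image names, the example ratings, and the helper names `average_labels` and `rank_by_severity` are hypothetical and are not taken from the study.

```python
from statistics import mean


def average_labels(labels_by_image):
    """Return the mean expert label for each image (a continuous severity value)."""
    return {image: mean(labels) for image, labels in labels_by_image.items()}


def rank_by_severity(averages):
    """Order image names from least to most severe by their average label."""
    return sorted(averages, key=averages.get)


# Hypothetical stage labels from four graders for three images (assumed coding).
stage_labels = {
    "img_a": [1, 1, 2, 1],
    "img_b": [2, 3, 3, 2],
    "img_c": [2, 2, 1, 2],
}

averages = average_labels(stage_labels)
ranking = rank_by_severity(averages)
```

Two images that share the same most common label can still receive different averages, which is how averaging turns an ordinal label into a continuous one.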

Main outcome measures: Weighted kappa and Pearson correlation coefficients (CCs) were calculated between each pair of grader classification labels for stage and plus disease. The Elo algorithm was also used to convert pairwise comparisons for each expert into an ordered set of images from least to most severe.
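
As a rough illustration of how pairwise comparisons can be turned into an ordering, the sketch below applies the standard Elo update to one expert's comparisons. The starting rating of 1500, the K-factor of 32, the image names, and the function name `elo_rank` are assumptions chosen for illustration; the abstract does not state the parameters used in the study.

```python
def elo_rank(images, comparisons, k=32.0, start=1500.0):
    """Rank images from least to most severe using Elo updates.

    comparisons is a sequence of (more_severe, less_severe) pairs, where the
    image judged more severe is treated as the "winner" of that comparison.
    k and start are assumed values, not parameters reported by the study.
    """
    ratings = {img: start for img in images}
    for more_severe, less_severe in comparisons:
        ra, rb = ratings[more_severe], ratings[less_severe]
        # Expected probability that the first image is judged more severe.
        expected = 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))
        delta = k * (1.0 - expected)
        ratings[more_severe] = ra + delta
        ratings[less_severe] = rb - delta
    order = sorted(images, key=lambda img: ratings[img])
    return order, ratings


# Hypothetical comparisons from a single expert.
images = ["A", "B", "C"]
comparisons = [("C", "B"), ("B", "A"), ("C", "A")]
order, ratings = elo_rank(images, comparisons)
```

Elo ratings depend on the order in which comparisons are processed, so a practical implementation might shuffle or repeat the comparisons; that choice is left open here.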

Results: The mean weighted kappa and CC for all interobserver pairs for plus disease image comparison were 0.67 and 0.88, respectively. The vascular severity score was found to be highly correlated with both the average plus disease classification (CC = 0.90, P < 0.001) and the ophthalmoscopic diagnosis of stage (P < 0.001 by analysis of variance) among all experts.
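
The mean agreement over all interobserver pairs can be sketched as follows. This is a minimal pure-Python illustration that assumes linearly weighted kappa (the abstract does not say which weighting scheme was used) and labels coded as integers starting at 0; the grader data and the function names are hypothetical.

```python
from itertools import combinations
from statistics import mean


def weighted_kappa(a, b, n_categories):
    """Linearly weighted Cohen's kappa for two graders (assumed weighting).

    Undefined (division by zero) if both graders use one identical category
    for every image.
    """
    n = len(a)
    observed = [[0.0] * n_categories for _ in range(n_categories)]
    for x, y in zip(a, b):
        observed[x][y] += 1.0 / n
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(n_categories))
           for j in range(n_categories)]
    num = den = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            w = abs(i - j) / (n_categories - 1)  # disagreement weight
            num += w * observed[i][j]            # observed disagreement
            den += w * row[i] * col[j]           # disagreement expected by chance
    return 1.0 - num / den


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sx = sum((p - mx) ** 2 for p in x) ** 0.5
    sy = sum((q - my) ** 2 for q in y) ** 0.5
    return cov / (sx * sy)


def mean_pairwise(graders, metric):
    """Average a two-grader metric over every unordered pair of graders."""
    return mean(metric(graders[p], graders[q])
                for p, q in combinations(graders, 2))


# Hypothetical labels (0 = neither, 1 = preplus, 2 = plus) for five images.
graders = {
    "g1": [0, 0, 1, 2, 2],
    "g2": [0, 1, 1, 2, 2],
    "g3": [0, 0, 1, 1, 2],
}
mean_kappa = mean_pairwise(graders, lambda a, b: weighted_kappa(a, b, 3))
mean_cc = mean_pairwise(graders, pearson)
```

With 34 graders the same loop would cover 561 pairs, and a heat map of the per-pair values gives the kind of agreement matrix described in Figure 3.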

Conclusions: The ROP vascular severity score correlates well with the International Classification of Retinopathy of Prematurity committee members' labels for plus disease and stage, which had significant intergrader variability. Generation of a consensus for a validated scoring system for ROP SaMD can facilitate global innovation and regulatory authorization of these technologies.

Keywords: Artificial intelligence; Deep learning; Disease classification; Interobserver agreement; Retinopathy of prematurity; Severity score.

Figures

**Figure 1. Example images for determination of plus disease and stage by members of the International Classification of Retinopathy of Prematurity (ICROP) committee.**
The two images at the top are from the plus disease dataset, and the bottom two images are from the stage dataset. The white numbers represent the number of committee experts who assigned each label for plus or stage to each image, respectively. The colored numbers in the upper right of each image represent the averaged expert classification on a scale of 1–3 for stage and 1–9 for plus. Both plus and stage appear to present on a continuum, which can be measured by comparing expert labels. Note that the lower right image is one of the standard images of “Stage 1” disease published in the 2005 ICROP revisited paper, which may be an example of temporal diagnostic drift.

**Figure 2. Spectrum of disease severity for plus and stage in retinopathy of prematurity.**
In each case, the middle portion of the figure represents the individual expert labels for each image in the dataset for plus (N=30) and stage (N=28). Each row represents one image, and the columns in the “Expert” section depict individual expert classifications. Experts were ranked from least aggressive to most aggressive diagnosis, left to right. Images were ranked from least severe to most severe by average expert classification. The color code represents the underlying class label, from green to red in order of increasing severity (no plus, pre-plus, or plus; stage 0, 1, 2, 3, or 4). The Ordinal column represents the mode classification, reflecting the current ICROP classification schema, and the Average column represents the average disease severity classification across the individual ICROP experts. Average disease severity reflects expert diagnosis better than an ordinal classification system does.

**Figure 3. Classification versus comparison agreement.**
A) Interexpert agreement on plus disease label for 34 experts. The inset legend reports the weighted kappa color scale for pairwise agreement between each pair of experts. The mean weighted kappa for all interobserver pairs was 0.67. B) Interexpert agreement on overall rankings of relative disease severity for 34 experts, as measured by correlation coefficient (CC). The mean CC for all interobserver pairs was 0.88. C) Correlation between average disease severity according to the ordinal labels of 34 experts and rank-ordered severity using relative rankings (CC 0.96).

**Figure 4. Relationship between deep learning derived vascular severity score (VSS) and the mode plus classification, average plus classification, and associated ophthalmoscopic diagnosis of stage in plus disease dataset.**
A) Box plot of VSS vs mode plus disease classification (P<0.001). B) Scatter plot of VSS vs average disease severity classification (correlation coefficient 0.90). C) Box plot of VSS from plus disease images compared with ophthalmoscopic diagnosis of stage in the same eyes (P<0.001). The VSS corresponds to the current mode classification of plus disease, a continuous spectrum of plus disease as determined by expert classifications, and with the ophthalmoscopic diagnosis of stage in the same eyes (not shown on images).
