Nat Commun. 2025 Jan 2;16(1):60.
doi: 10.1038/s41467-024-55198-7.

Machine learning derived retinal pigment score from ophthalmic imaging shows ethnicity is not biology

Anand E Rajesh et al. Nat Commun. 2025.

Abstract

Few metrics exist to describe phenotypic diversity within ophthalmic imaging datasets, with researchers often using ethnicity as a surrogate marker for biological variability. We derived a continuous, measured metric, the retinal pigment score (RPS), that quantifies the degree of pigmentation from a colour fundus photograph of the eye. RPS was validated using two large epidemiological studies with demographic and genetic data (UK Biobank and EPIC-Norfolk Study) and reproduced in a Tanzanian, an Australian, and a Chinese dataset. A genome-wide association study (GWAS) of RPS from UK Biobank identified 20 loci with known associations with skin, iris and hair pigmentation, of which eight were replicated in the EPIC-Norfolk cohort. There was a strong association between RPS and ethnicity; however, the RPS distributions of the different ethnicities overlapped substantially. RPS decouples traditional demographic variables from clinical imaging characteristics. RPS may serve as a useful metric to quantify the diversity of the training, validation, and testing datasets used in the development of AI algorithms, helping to ensure adequate inclusion and explainability of model performance, which is critical in evaluating all currently deployed AI models. The code to derive RPS is publicly available at: https://github.com/uw-biomedical-ml/retinal-pigmentation-score.

Conflict of interest statement

Competing interests: A.P.K. has acted as a paid consultant or lecturer to Abbvie, Aerie, Allergan, Google Health, Heidelberg Engineering, Novartis, Reichert, Santen, Thea and Topcon. A.Y.L. reports support from the US Food and Drug Administration, grants from Santen, Carl Zeiss Meditec, and Novartis, personal fees from Genentech, Topcon, and Verana Health, outside of the submitted work. This article does not reflect the opinions of the Food and Drug Administration. A.T. reports grants from Bayer and Novartis and personal fees from Abbvie, Allegro, Annexon, Apellis, Bayer, Heidelberg Engineering, Iveric Bio, Kanghong, Novartis, Oxurion, Roche/Genentech, Thea. C.E. reports personal fees from Heidelberg Engineering, Boehringer Ingelheim, and Inozyme pharmaceuticals outside of the submitted work. P.A.K. has acted as a consultant for Retina Consultants of America, Topcon, Roche, Boehringer Ingelheim, and Bitfount and is an equity owner in Big Picture Medical. He has received speaker fees from Zeiss, Novartis, Gyroscope, Boehringer Ingelheim, Apellis, Roche, Abbvie, Topcon, and Hakim Group. He has received travel support from Bayer, Topcon, and Roche. He has attended advisory boards for Topcon, Bayer, Boehringer Ingelheim, RetinAI, and Novartis. P.J.F. has acted as a consultant for Alphasights, GLG, Google Health, Guidepoint, PwC, Santen. A.B. is Founder and CEO of not-for-profit Peek Vision and receives a salary. The remaining authors declare no competing interests.

Figures

Fig. 1. Schematic showing the method to generate the retinal pigmentation score (RPS) from a colour fundus image.
Input images are fed into the deep learning algorithm to generate segmentation masks. These are added together to make a retinal background mask, which is then transformed into L,a,b colorspace. The chromaticity vectors are then extracted and transformed by the principal component analysis model to create the RPS. Created with Biorender.com.
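The steps in this caption can be illustrated with a minimal sketch. It is not the authors' implementation (that is in the repository linked in the abstract), and it rests on several assumptions that the caption does not state: the segmentation masks are treated as boolean flags for pixels to exclude from the retinal background, the chromaticity of an image is summarised as the median a and b over the remaining pixels, and the score is the projection onto the first principal component of a cohort's (a, b) values. All function names are hypothetical.

```python
import math
from statistics import median

def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIELAB (D65 white point)."""
    def linearise(c):
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    rl, gl, bl = linearise(r), linearise(g), linearise(b)
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    def f(t):
        return t ** (1 / 3) if t > 216 / 24389 else (24389 / 27 * t + 16) / 116
    fx, fy, fz = f(x / 0.95047), f(y), f(z / 1.08883)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

def background_ab(pixels, exclude_masks):
    """Median (a, b) over pixels not flagged by any mask.

    Assumption: pixels is a flat list of (r, g, b) tuples and each mask is a
    flat list of booleans of the same length marking pixels to exclude.
    """
    kept = [p for i, p in enumerate(pixels)
            if not any(mask[i] for mask in exclude_masks)]
    labs = [srgb_to_lab(*p) for p in kept]
    return median(lab[1] for lab in labs), median(lab[2] for lab in labs)

def fit_first_pc(ab_points):
    """Mean and first principal axis of a set of (a, b) points (2-D PCA)."""
    n = len(ab_points)
    mean_a = sum(a for a, _ in ab_points) / n
    mean_b = sum(b for _, b in ab_points) / n
    s_aa = sum((a - mean_a) ** 2 for a, _ in ab_points)
    s_bb = sum((b - mean_b) ** 2 for _, b in ab_points)
    s_ab = sum((a - mean_a) * (b - mean_b) for a, b in ab_points)
    # Closed-form orientation of the major axis of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * s_ab, s_aa - s_bb)
    return (mean_a, mean_b), (math.cos(theta), math.sin(theta))

def pigment_score(ab, mean, axis):
    """Project a centred (a, b) pair onto the principal axis."""
    return (ab[0] - mean[0]) * axis[0] + (ab[1] - mean[1]) * axis[1]
```

In this sketch the axis would be fitted once on a reference cohort with `fit_first_pc` and then reused to score new images, so that scores from different datasets share one scale. The sign of a principal axis is arbitrary, so the direction that corresponds to more pigment has to be fixed by convention.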
Fig. 2. Representative fundus photos with associated RPS.
a Randomly sampled colour fundus photographs from each UK Biobank self-reported ethnicity and from the Tanzanian, Australian, and Chinese (ODIR) datasets, sorted by quintiles of retinal pigment score (RPS) across the entire distribution of RPS for the UK Biobank cohort. The RGB colour of the pixel value that is converted into RPS, together with the RPS itself, is shown at the bottom of each fundus photograph. Black spaces indicate that there are no suitable images within the respective ethnicity subgroup and quintile. b Normalised kernel density estimation plot of the distribution of RPS for all participants grouped by self-reported ethnicity as reported in the UK Biobank as well as the Tanzanian, Australian, and Chinese (ODIR) datasets. Relative frequencies are normalised so the area under each curve is equal for each ethnicity subgroup. The subpanel shows examples in which, for a given RPS, the a and b values in the CIELAB colour space are constant but the L value changes. The x-axis is shared in both subpanels. Source data are provided as a Source Data file.
Fig. 3. Manhattan plot of GWAS results from the discovery cohort (UK Biobank, n = 37067).
The Y-axis represents the two-sided p-values from the linear mixed effects model, calculated from the z-statistic. Lead variants identified by GCTA-COJO are annotated with the nearest gene. Points are truncated at −log10(p) = 70 for clarity. The dashed red line indicates genome-wide significance (p = 5 × 10⁻⁸), a threshold that is adjusted for multiple comparisons. Source data are provided as a Source Data file.
Fig. 4. Comparison of betas for lead variants identified from the discovery and replication cohort.
Comparison of betas expressed as change in standard deviation of mean RPS for lead variants identified from the discovery (UK Biobank, n = 37067) genome-wide association study (GWAS) with their corresponding betas in the replication (EPIC-Norfolk, n = 4273) analysis, with 95% confidence intervals. Betas in the cohort were calculated using a generalised linear mixed model, adjusting for age, sex and the first ten principal components. P-values are two-sided, calculated from the z-statistic and corrected for multiple comparisons. Variants meeting the Bonferroni-adjusted replication significance threshold (p = 0.05/ variants) in the EPIC-Norfolk GWAS are shaded black. The nearest gene is annotated for variants achieving genome-wide significance. Source data are provided as a Source Data file.
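The replication test described in this caption can be sketched as follows. This is an illustrative calculation only, not the study's analysis code: it derives a two-sided p-value from a z-statistic (beta divided by its standard error) and compares it with a Bonferroni-adjusted threshold of 0.05 divided by the number of variants tested. The function names are hypothetical, and the number of variants is left as a parameter because the caption does not state it.

```python
import math

def two_sided_p_from_z(z):
    """Two-sided p-value of a z-statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests

def replicates(beta, standard_error, n_variants, alpha=0.05):
    """True if the variant passes the Bonferroni-adjusted threshold."""
    z = beta / standard_error
    return two_sided_p_from_z(z) < bonferroni_threshold(alpha, n_variants)
```

For example, with a hypothetical count of 10 variants the threshold is 0.005, so `replicates(0.3, 0.05, 10)` is true (z = 6) while `replicates(0.05, 0.05, 10)` is false (z = 1).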
