Nat Commun. 2025 Oct 31;16(1):9636.
doi: 10.1038/s41467-025-64634-1.

Proteome atlas for mechanistic discovery and risk prediction of diabetic retinopathy

Shaopeng Yang et al. Nat Commun. 2025.

Abstract

Proteomics offers an unprecedented opportunity to characterize and predict diabetic retinopathy (DR) with minimal invasiveness. Here we examine this in 10,873 individuals with (pre)diabetes from two ethnically distinct cohorts. By simultaneous profiling of ~3000 proteins, we identify 668 associations with mechanistically plausible directionality that constitute a comprehensive DR proteomic landscape with linkages to retinal tomographic structure and genetic predisposition, pointing to established and novel biological pathways conferring DR risk. Integrating the DR proteomic profile markedly improves predictive performance beyond clinical and genetic predictors, with plexin B2, growth differentiation factor 15, and renin emerging as top proteins validated across cohorts and linked to retinal microvascular degeneration in the Guangzhou Diabetic Eye Study (GDES) based on swept-source optical coherence tomography angiography (SS-OCTA). A parsimonious panel of these three proteins alone achieves comparable performance in predicting DR development and progression, while renin is confirmed as a causal promoter through genetic analyses. Our findings highlight the potential of large-scale proteomics in elucidating DR pathogenesis and advancing biomarker discovery, with broad implications for early detection and intervention.

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1. Overview of the study design.
(1, 2) A total of 10,521 UK Biobank (UKB) participants without DR at baseline were included to identify DR-associated proteins, characterize biological pathways, and develop machine learning-based proteomic state models for predicting incident DR. An additional 352 participants from the Guangzhou Diabetic Eye Study (GDES) were included for independent replication. (3) Dysregulated protein profiles across 2923 blood proteins were assessed, with subgroup analyses exploring differences by genetic predisposition and sex. (4) Associations between DR-associated proteins, retinal tomographic structure, and genetic predisposition were examined. (5) The identified proteins revealed both established and novel biological pathways underlying DR risk. (6) Machine learning models were trained to predict incident DR, while (7) protein importance ranking identified the top contributors to prediction. (8) The utility of integrating proteomic profiles and top-ranked proteins into clinical and polygenic risk models was evaluated. (9) External validation of the proteomic profiles, their associations with retinal structure, and predictive improvements was conducted in GDES. (10) Leveraging annual retinal microvascular phenotyping data from GDES participants, we investigated the impact of DR-associated proteins on the dynamics of microvascular degeneration. (11) To infer causality, we conducted two-sample Mendelian randomization (MR) for the top-ranked proteins and their association with incident DR. Created in BioRender. Wang, W. (https://BioRender.com/30gbrhl). Parts of panel (1) were created from Flaticon (https://flaticon.com). DR diabetic retinopathy; SCP superficial capillary plexus; DCP deep capillary plexus; VD vessel density; GWAS genome-wide association study; pQTL protein quantitative trait loci; SNP single nucleotide polymorphism; MR Mendelian randomization.
Fig. 2. Blood proteins associated with DR risk and retinal structure.
a–d Associations of 2923 proteins with incident DR. Hazard ratios for incident DR per 1-SD change in protein levels were estimated using Cox proportional hazards models (n = 10,521). Model 1 (a) was adjusted for age, sex, and ethnicity, while model 2 (b) included additional adjustments for HbA1c, duration of diabetes, BMI, SBP, and status of diabetes (prediabetes/diabetes). Sensitivity analyses were conducted by additionally adjusting for medication use (c) and by excluding DR cases within the initial two years of follow-up (d). P values were calculated using two-sided Wald tests after controlling false discovery rate (FDR) for multiple testing. e Subgroup analyses stratified by polygenic risk and sex using model 2. Participants who completed proteomic profiling and genotyping were dichotomized at the PRS median into high-risk (n = 5261) and low-risk (n = 5260) groups. Sex-specific analyses included 5267 male and 5254 female participants. P values were calculated using two-sided Wald tests, and asterisks denote significant associations after controlling FDR for multiple testing independently within each stratum (***P < 0.001, **P < 0.01, *P < 0.05). To enhance visual contrast and facilitate cross-protein comparison, hazard ratios were log-transformed and linearly rescaled to span the full colour scale. Original (unscaled) results are reported in Supplementary Tables S7–S8 and S10–S11. f Associations of DR-associated proteins with retinal structure. Coefficients for retinal tomographic measurements per 1-SD change in protein levels were estimated using linear models adjusted for age, sex, HbA1c, duration of diabetes, BMI, and SBP (n = 3397). P values were calculated using two-sided Student’s t-tests after controlling FDR for multiple testing. Asterisks denote significance as above. g Subgroup analyses by polygenic risk and sex for the associations of DR-associated proteins with PL thickness. Coefficients were linearly rescaled as above, and original (unscaled) results are reported in Supplementary Tables S17–S20. Source data are provided as a Source Data file. DR diabetic retinopathy, RPE retinal pigment epithelium, PL photoreceptor layer, GC-IPL ganglion cell-inner plexiform layer, RNFL retinal nerve fibre layer.
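The Fig. 2 caption refers to two computational steps: FDR control across many protein tests, and log-transformation of hazard ratios followed by linear rescaling for display. The sketch below illustrates both under stated assumptions. The caption does not name the FDR procedure, so Benjamini-Hochberg is assumed here. The target range of the rescaling is also not stated, so a min-max mapping onto [-1, 1] is assumed. The function names `bh_adjust` and `rescale_log_hr` and all input values are hypothetical.

```python
import math


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values (assumed FDR method, not confirmed by the source)."""
    m = len(pvals)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def rescale_log_hr(hazard_ratios, lo=-1.0, hi=1.0):
    """Log-transform hazard ratios, then map their min..max linearly onto lo..hi.

    The [-1, 1] default is an assumption chosen for illustration.
    """
    logs = [math.log(hr) for hr in hazard_ratios]
    lmin, lmax = min(logs), max(logs)
    if lmax == lmin:
        # All values identical: place them at the midpoint of the scale.
        return [(lo + hi) / 2 for _ in logs]
    return [lo + (x - lmin) * (hi - lo) / (lmax - lmin) for x in logs]


if __name__ == "__main__":
    # Hypothetical p-values and hazard ratios, not taken from the study.
    print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
    print(rescale_log_hr([0.5, 1.0, 2.0]))
```

Because the rescaling is purely for visual contrast, effect sizes should be read from the unscaled values, which the caption places in the supplementary tables.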
Fig. 3. Genetic risk and functional analyses of DR-associated proteins.
a Patterns of blood protein alignment and misalignment with polygenic risk and DR incidence. Coefficients for protein levels per 1-SD change in polygenic risk scores were estimated using linear models adjusted for age, sex, HbA1c, duration of diabetes, BMI, and SBP (n = 10,521). Data are presented as estimated coefficients (squares) with 95% confidence intervals (CIs) indicated by error bars. Solid blocks and asterisks denote significant associations based on two-sided Student’s t-tests with false discovery rate (FDR) correction for multiple testing. b Enrichment analysis for Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Reactome pathways. Proteins significantly associated with DR after multiple testing correction in model 2 were analysed using the full panel of Olink proteins as the reference. P values were calculated using two-sided Fisher’s exact tests after controlling FDR for multiple testing. The numbers above each bar represent the count of proteins observed in each pathway. c Protein–protein interaction network of DR-associated proteins. Nodes represent individual proteins, with darker shades indicating higher degrees. The width and colour of edges connecting the nodes represent the combined interaction scores, with thicker and darker lines signifying stronger evidence of interaction. To improve readability and highlight the most robust connections, only edges with a standardized interaction score > 0.85 are displayed. Source data are provided as a Source Data file. DR diabetic retinopathy.
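The pathway enrichment in panel b rests on a two-sided Fisher's exact test against a reference protein panel. A minimal sketch of that test for one pathway follows, written with the standard library only. The 2x2 layout (DR-associated versus other reference proteins, in the pathway versus not) is an assumed reading of the caption, and the function name `fisher_two_sided` and the counts in the usage lines are hypothetical.

```python
from math import comb


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout for pathway enrichment:
      a = DR-associated proteins in the pathway
      b = DR-associated proteins not in the pathway
      c = other reference proteins in the pathway
      d = other reference proteins not in the pathway
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small relative tolerance guards against floating-point ties.
    total = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical counts for one pathway, not taken from the study.
    print(fisher_two_sided(10, 90, 40, 860))
```

In an analysis like the one described, one such p-value would be computed per pathway and the resulting set then passed through an FDR correction.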
Fig. 4. Performance of DR-associated proteins for stratifying and predicting future DR risk.
a, d Cumulative DR hazard over the follow-up period (n = 10,521), stratified by proteomic state tertiles constructed using the full protein panels (a) and the cardiometabolic panel only (d). Data are presented as observed event frequencies with 95% confidence intervals (CIs) shown as shading derived from survival proportions. High risk corresponds to the top tertile, middle risk to the middle tertile, and low risk to the bottom tertile. b, e Receiver operating characteristic curves comparing the clinical risk factor model, polygenic risk model, clinical risk factors + polygenic risk model, clinical risk factors + proteomic states, polygenic risk + proteomic states, and clinical risk factors + polygenic risk + proteomic states for predicting future DR (n = 10,521). Proteomic states were constructed using the full protein panels (b) and the cardiometabolic panel only (e). The dashed grey line indicates a random classifier. Stratified performance across polygenic risk and sex is also presented. c, f Standardized net benefit curves for the same models predicting future DR (n = 10,521). Proteomic states were constructed using the full protein panels (c) and the cardiometabolic panel only (f). The horizontal dashed grey line indicates “treat none,” while the vertical grey line indicates “treat all.” Stratified performance across polygenic risk and sex is also shown. Source data are provided as a Source Data file. DR diabetic retinopathy.
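Panels c and f report standardized net benefit curves. As a rough illustration of the quantity plotted, the sketch below uses the usual decision-curve definition, net benefit = TP/n - (FP/n) x pt/(1 - pt), divided by the outcome prevalence to standardize it. Whether the authors computed it exactly this way is not stated in the caption, so treat this as an assumption. The function name and the data are hypothetical.

```python
def standardized_net_benefit(y_true, risk, threshold):
    """Standardized net benefit of treating everyone with predicted risk >= threshold.

    y_true: list of 0/1 outcomes; risk: predicted probabilities in [0, 1];
    threshold: risk threshold strictly between 0 and 1.
    """
    n = len(y_true)
    prevalence = sum(y_true) / n
    # True and false positives under the "treat if risk >= threshold" rule.
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    # The threshold odds weight false positives against true positives.
    odds = threshold / (1.0 - threshold)
    net_benefit = tp / n - (fp / n) * odds
    # Dividing by prevalence puts the maximum attainable value at 1.
    return net_benefit / prevalence


if __name__ == "__main__":
    # Hypothetical outcomes and predicted risks, not taken from the study.
    outcomes = [1, 0, 0, 1, 0]
    predicted = [0.9, 0.2, 0.6, 0.7, 0.1]
    print(standardized_net_benefit(outcomes, predicted, 0.5))
    # "Treat all" reference: every participant is above any threshold.
    print(standardized_net_benefit(outcomes, [1.0] * 5, 0.5))
```

A "treat none" strategy has zero net benefit at every threshold, which is why it appears as a flat reference line in plots of this kind.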
Fig. 5. Protein importance ranking and performance of PLXNB2, GDF15, and REN for stratifying and predicting future DR risk.
a, b Importance ranking and SHapley Additive exPlanations (SHAP) visualization of key proteins. The bar plot (a) indicates the ranked importance of proteins based on their contributions to DR prediction as determined by the SHAP values, and the line with 95% confidence intervals (CIs) shown as shading represents each protein when assessed individually (n = 10,521). The three top-ranked proteins are highlighted in red. Gradient colours in the SHAP summary plot (b) indicate the magnitude of individual protein contributions to prediction. c Cumulative DR hazard over the follow-up period (n = 10,521), stratified by tertiles of the three-protein states. Data are presented as observed event frequencies with 95% CIs shown as shading derived from survival proportions. High risk corresponds to the top tertile, middle risk to the middle tertile, and low risk to the bottom tertile. d Distribution of PLXNB2, GDF15, and REN levels in baseline blood samples from individuals who developed DR (n = 425) versus those who did not (n = 10,096) over the follow-up. Data are presented as density clouds showing the full distribution, with overlaid box plots indicating the median (centre line), interquartile range (bounds of the box), and whiskers extending to the minimum and maximum within 1.5× the interquartile range. Individual data points, including minima and maxima, are shown as dots. P values were calculated using two-sided Student’s t-tests without multiple comparison adjustments. e Receiver operating characteristic curves comparing the clinical risk factor model, polygenic risk model, clinical risk factors + polygenic risk model, clinical risk factors + 3-protein states, polygenic risk + 3-protein states, and clinical risk factors + polygenic risk + 3-protein states for predicting future DR (n = 10,521). The dashed grey line represents a random classifier. f Standardized net benefit curves of the same models for predicting future DR (n = 10,521). The horizontal dashed grey line indicates “treat none,” and the vertical grey line indicates “treat all.” Source data are provided as a Source Data file. DR diabetic retinopathy.
Fig. 6. Replication and extrapolation in an ethnically distinct cohort.
a Associations of 367 proteins with incident DR, estimated using Cox proportional hazards (CPH) models adjusted for age, sex, ethnicity, HbA1c, duration of diabetes, BMI, and SBP (n = 352). P values were calculated using two-sided Wald tests without multiple comparison adjustments. b Associations of DR-associated proteins with retinal structure (n = 352), estimated using linear models adjusted for the same covariates. P values were calculated using two-sided Student’s t-tests after controlling false discovery rate (FDR) for multiple testing. Significant associations are marked with asterisks (***P < 0.001, **P < 0.01, *P < 0.05). c, e Receiver operating characteristic curves comparing the clinical risk factor model with models incorporating proteomic or 3-protein states for predicting future DR (n = 352). Shaded areas indicate performance gaps. d, h Distribution of PLXNB2, GDF15, and REN levels in baseline blood samples among individuals who developed DR (n = 36) and those who did not (n = 316) (d), and among those who experienced DR progression (n = 15) and those who did not (n = 337) (h). Data are presented as density clouds showing the full distribution, with overlaid box plots indicating the median (centre line), interquartile range (bounds of the box), and whiskers extending to the minimum and maximum within 1.5× the interquartile range. Individual data points, including minima and maxima, are shown as dots. P values were calculated using two-sided Student’s t-tests without multiple comparison adjustments. f Associations of 367 proteins with DR progression, estimated using CPH models adjusted for the same covariates. P values were calculated using two-sided Wald tests without multiple comparison adjustments. g, i Receiver operating characteristic curves comparing the clinical risk factor model with models incorporating proteomic or 3-protein states for predicting DR progression (n = 352). j Causal relationship between REN and incident DR. GWAS summary statistics for DR included 10,413 DR cases and 308,633 controls. Data are presented as estimated SNP effects (dots) with 95% confidence intervals (CIs) indicated by error bars. Source data are provided as a Source Data file. DR diabetic retinopathy, pQTL protein quantitative trait loci.
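Panel j summarizes a two-sample Mendelian randomization analysis built from per-SNP effects on the exposure (pQTL) and on the outcome (GWAS). The caption does not say which estimator was used, so the sketch below assumes the common fixed-effect inverse-variance weighted (IVW) estimator purely for illustration. The function name `ivw_estimate` and all effect sizes are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate and its standard error.

    beta_exposure: per-SNP effects on the exposure (e.g. protein level)
    beta_outcome:  per-SNP effects on the outcome (e.g. log-odds of disease)
    se_outcome:    standard errors of the outcome effects
    IVW is an assumed estimator here; the source does not specify one.
    """
    # Each SNP's weight is its squared exposure effect over the outcome variance.
    denom = sum(bx * bx / (se * se) for bx, se in zip(beta_exposure, se_outcome))
    num = sum(
        bx * by / (se * se)
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    estimate = num / denom
    std_error = math.sqrt(1.0 / denom)
    return estimate, std_error


if __name__ == "__main__":
    # Hypothetical SNP effects, not taken from the study.
    est, se = ivw_estimate([0.10, 0.20], [0.05, 0.10], [0.01, 0.01])
    print(est, se)
    # 95% confidence interval under a normal approximation.
    print(est - 1.96 * se, est + 1.96 * se)
```

With a single SNP this reduces to the Wald ratio, the outcome effect divided by the exposure effect.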

