Nat Commun. 2024 May 15;15(1):4010.
doi: 10.1038/s41467-024-48017-6.

Identifying proteomic risk factors for cancer using prospective and exome analyses of 1463 circulating proteins and risk of 19 cancers in the UK Biobank

Keren Papier et al. Nat Commun.

Abstract

The availability of protein measurements and whole exome sequence data in the UK Biobank enables investigation of potential observational and genetic protein-cancer risk associations. We investigated associations of 1463 plasma proteins with incidence of 19 cancers and 9 cancer subsites in UK Biobank participants (average 12 years follow-up). Emerging protein-cancer associations were further explored using two genetic approaches, cis-pQTL and exome-wide protein genetic scores (exGS). We identify 618 protein-cancer associations, of which 107 persist for cases diagnosed more than seven years after blood draw, 29 of 618 were associated in genetic analyses, and four had support from long time-to-diagnosis (>7 years) and both cis-pQTL and exGS analyses: CD74 and TNFRSF1B with NHL, ADAM8 with leukemia, and SFTPA2 with lung cancer. We present multiple blood protein-cancer risk associations, including many detectable more than seven years before cancer diagnosis and that had concordant evidence from genetic analyses, suggesting a possible role in cancer development.

Conflict of interest statement

The authors declare no competing interest.

Figures

Fig. 1. Summary of study design, observational and genetic protein-cancer associations, and pathway analyses and drug target mapping.
cis-pQTL – cis protein quantitative trait loci, PRS – polygenic risk score, SNP – single nucleotide polymorphism, ENT – effective number of tests. Source data are provided as a Source Data file.
Fig. 2. Volcano plot for the prospective associations of circulating proteins with risk of cancer.
Volcano plot displaying the results from the prospective observational analyses of 1463 proteins with cancer risk. Top protein-cancer associations plotted with point size indicating the number of ENT significant protein-cancer associations. The point colour represents the cancer site. Hazard ratios per SD for cancer risk are plotted on the x-axis while –log10 p-values are plotted on the y-axis. Protein names and hazard ratios are labelled to highlight a selection of associations significant after correction for multiple testing (p < 0.05/639). Hazard ratios and 95% confidence intervals for each cancer site were separately estimated using two-sided Cox proportional hazards regression models. N – number, ENT – effective number of tests. Source data are provided as a Source Data file.
Fig. 3. Volcano plots for the prospective association of circulating proteins with risk of cancer by time to diagnosis.
Two volcano plots displaying the results from prospective observational analyses of 1463 proteins with cancer risk stratified by time from blood draw to diagnosis, with analyses among cases diagnosed within three years of blood draw (left) and after seven years of blood draw (right). Top protein-cancer associations plotted with point size indicating the number of ENT significant protein-cancer associations. The point colour represents the cancer site. Hazard ratios for cancer risk per SD are plotted on the x-axis while –log10 p-values are plotted on the y-axis. Protein names and hazard ratios are labelled to highlight a selection of associations significant after correction for multiple testing (p < 0.05/639). Hazard ratios and 95% confidence intervals for each cancer site were separately estimated using two-sided Cox proportional hazards regression models. N – number, ENT – effective number of tests. Source data are provided as a Source Data file.
Fig. 4. Mirror Manhattan plot for the association of genetically predicted protein concentrations and cancer risk using cis-pQTL and exome scores.
This mirror Manhattan plot displays the results of each cis-pQTL (top) in the full exome-sequencing cohort within the UK Biobank across European samples for proteins passing correction for multiple testing in the observational results on cancer risk. The y-axis represents the -log10 p-values. The bottom of this plot contains the exome-wide score results for genetically predicted proteins. Markers coloured in grey represent results that did not reach the conventional p < 0.05 significance threshold, while markers in blue represent conventionally significant results. If a cis-variant or an exome-wide score passes Bonferroni significance, those markers are coloured by the cancer site of association. Odds ratios were estimated using logistic regression models to investigate the association of each genetically predicted protein with cancer risk per standard deviation increase. Cis-variants were adjusted to be on the same scale. cis-pQTL - cis protein quantitative trait loci, NHL – Non-Hodgkin lymphoma. Source data are provided as a Source Data file.
Fig. 5. The prospective and genetic associations of SFTPA2 with lung cancer risk, CD74 and TNFRSF1B with risk of non-Hodgkin lymphoma, and ADAM8 with risk of leukaemia.
Plots show the associations for the four proteins that were associated with the risk of cancer in the main analyses and that had directionally concordant, conventionally significant support from all three additional analyses, i.e., long (>7 years) time-to-diagnosis, cis-pQTL, and exGS analyses. For each protein–cancer association, evidence for the association of concentrations with cancer risk is presented from minimally and fully adjusted models per SD, as well as models stratified by time-to-diagnosis, and from exome protein score and cis-pQTL analyses. The observational analyses (minimally adjusted models, fully adjusted models, and time-to-diagnosis analyses) were conducted in a maximal sample of 44,645 participants, and the genetic analyses were conducted in a maximal sample of 336,823 UK participants. Data are presented as relative risk and 95% confidence intervals. The reference value is 1.0. cis-pQTL – cis protein quantitative trait loci. Source data are provided as a Source Data file.
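
The captions for Fig. 2 and Fig. 3 quote a multiple-testing threshold of p < 0.05/639 and plot –log10 p-values on the y-axis. The following is a minimal sketch of that arithmetic only, assuming a plain Bonferroni-style division of 0.05 by the effective number of tests (639). The function names are hypothetical and are not taken from the paper or its analysis code.

```python
import math

ALPHA = 0.05           # conventional significance level
EFFECTIVE_TESTS = 639  # effective number of tests (ENT) quoted in the captions


def corrected_threshold(alpha: float = ALPHA, n_tests: int = EFFECTIVE_TESTS) -> float:
    """Return the per-test p-value threshold alpha / n_tests."""
    return alpha / n_tests


def neg_log10(p: float) -> float:
    """Return -log10(p), the y-axis transform used in a volcano plot."""
    return -math.log10(p)


def passes_correction(p: float, alpha: float = ALPHA, n_tests: int = EFFECTIVE_TESTS) -> bool:
    """Return True if p is below the corrected threshold."""
    return p < corrected_threshold(alpha, n_tests)


if __name__ == "__main__":
    threshold = corrected_threshold()
    print(f"corrected threshold: {threshold:.2e}")
    print(f"y-axis height of the threshold line: {neg_log10(threshold):.2f}")
    # Illustrative p-values only; these are not results from the study.
    for p in (1e-6, 1e-3):
        print(p, passes_correction(p))
```

Under these assumptions an association passes the correction only if its p-value is below roughly 7.8 × 10⁻⁵, which corresponds to a height of about 4.1 on the –log10 scale.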