Cancer Commun (Lond). 2021 Dec;41(12):1387-1397.
doi: 10.1002/cac2.12205. Epub 2021 Sep 14.

Novel strategy for disease risk prediction incorporating predicted gene expression and DNA methylation data: a multi-phased study of prostate cancer

Chong Wu et al. Cancer Commun (Lond). 2021 Dec.

Abstract

Background: DNA methylation and gene expression are known to play important roles in the etiology of human diseases such as prostate cancer (PCa). However, it has not yet been possible to incorporate information on DNA methylation and gene expression into polygenic risk scores (PRSs). Here, we aimed to develop and validate an improved PRS for PCa risk by incorporating genetically predicted gene expression and DNA methylation, along with other genomic information, using an integrative method.

Methods: Using data from the PRACTICAL consortium, we derived multiple sets of genetic scores, including those based on available single-nucleotide polymorphisms through widely used methods of pruning and thresholding, LDpred, LDpred-funct, AnnoPred, and EBPRS, as well as PRSs constructed using the genetically predicted gene expression and DNA methylation through a revised pruning and thresholding strategy. In the tuning step, using the UK Biobank data (1458 prevalent cases and 1467 controls), we selected PRSs with the best performance. Using an independent set of data from the UK Biobank, we developed an integrative PRS combining information from individual scores. Furthermore, in the testing step, we tested the performance of the integrative PRS in another independent set of UK Biobank data of incident cases and controls.
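As an illustration of the standard pruning and thresholding idea named in the Methods, the following is a minimal sketch and not the authors' revised strategy. The function names, the toy SNP identifiers, effect sizes, p-values, and the pairwise LD lookup are all hypothetical, and the greedy pruning rule is a simplified stand-in for LD clumping software.

```python
from typing import Dict, FrozenSet, List, Tuple


def prune_and_threshold(
    snps: List[Tuple[str, float, float]],  # (snp_id, effect size beta, GWAS p-value)
    ld_r2: Dict[FrozenSet[str], float],    # hypothetical pairwise LD r^2 lookup
    p_threshold: float,
    r2_threshold: float,
) -> Dict[str, float]:
    """Return {snp_id: beta} for SNPs kept after thresholding and greedy pruning."""
    # Thresholding: keep only SNPs at or below the p-value cutoff.
    candidates = [s for s in snps if s[2] <= p_threshold]
    # Pruning: walk from most to least significant and drop any SNP that is
    # in LD (r^2 above the cutoff) with a SNP that has already been kept.
    candidates.sort(key=lambda s: s[2])
    kept: Dict[str, float] = {}
    for snp_id, beta, _ in candidates:
        in_ld = any(
            ld_r2.get(frozenset((snp_id, other)), 0.0) > r2_threshold
            for other in kept
        )
        if not in_ld:
            kept[snp_id] = beta
    return kept


def polygenic_score(weights: Dict[str, float], dosages: Dict[str, float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1, or 2) over the kept SNPs."""
    return sum(beta * dosages.get(snp_id, 0.0) for snp_id, beta in weights.items())


# Toy data (invented for illustration only).
toy_snps = [
    ("rs1", 0.30, 1e-8),
    ("rs2", 0.25, 1e-6),
    ("rs3", -0.10, 1e-5),
    ("rs4", 0.05, 0.20),
]
toy_ld = {frozenset(("rs1", "rs2")): 0.8}  # rs1 and rs2 are strongly correlated

toy_weights = prune_and_threshold(toy_snps, toy_ld, p_threshold=1e-4, r2_threshold=0.1)
toy_score = polygenic_score(toy_weights, {"rs1": 2, "rs2": 1, "rs3": 1})
```

Each combination of p-value and LD cutoffs yields one candidate score, which is why a P + T analysis typically produces a grid of candidates to compare in a tuning dataset.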

Results: Our constructed PRS had improved performance (C statistics: 76.1%) over PRSs constructed by individual benchmark methods (from 69.6% to 74.7%). Furthermore, our new PRS had much higher risk assessment power than family history. The overall net reclassification improvement was 69.0% by adding PRS to the baseline model compared with 12.5% by adding family history.
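The two metrics reported above can be sketched as follows. The C statistic is computed here as the fraction of case-control pairs in which the case has the higher score, with ties counted as one half. The net reclassification improvement is shown in its category-free form; which NRI variant the study used is not stated in the abstract, so that choice is an assumption. All function names and toy values are hypothetical.

```python
from typing import List, Sequence


def c_statistic(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Probability that a random case scores higher than a random control."""
    concordant = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                concordant += 1.0
            elif c == k:
                concordant += 0.5  # ties contribute half a concordant pair
    return concordant / (len(case_scores) * len(control_scores))


def continuous_nri(
    old_risk: List[float], new_risk: List[float], events: List[int]
) -> float:
    """Category-free NRI: net upward movement in events plus net downward in non-events."""
    up_e = down_e = up_n = down_n = 0
    n_events = sum(events)
    n_nonevents = len(events) - n_events
    for old, new, y in zip(old_risk, new_risk, events):
        if new > old:
            up_e, up_n = up_e + y, up_n + (1 - y)
        elif new < old:
            down_e, down_n = down_e + y, down_n + (1 - y)
    nri_events = (up_e - down_e) / n_events
    nri_nonevents = (down_n - up_n) / n_nonevents
    return nri_events + nri_nonevents


# Toy values (invented for illustration only).
toy_c = c_statistic([0.9, 0.8, 0.4], [0.5, 0.3, 0.4])
toy_nri = continuous_nri(
    old_risk=[0.2, 0.3, 0.3, 0.4],
    new_risk=[0.4, 0.2, 0.1, 0.2],
    events=[1, 1, 0, 0],
)
```

The pairwise loop is quadratic in sample size, so it is suitable only for small illustrations; a rank-based computation would be preferable at biobank scale.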

Conclusions: We developed and validated a new PRS which may improve the utility in predicting the risk of developing PCa. Our innovative method can also be applied to other human diseases to improve risk prediction across multiple outcomes.

Keywords: integrative models; polygenic risk scores; predicted DNA methylation; predicted gene expression; prostate cancer; risk prediction.

Conflict of interest statement

No potential conflicts of interest were disclosed by the authors.

Figures

FIGURE 1
Study design and workflow. Multiple sets of genome‐wide polygenic risk scores (PRSs) were derived by combining summary association statistics from association studies using data of the PRACTICAL consortium and a reference panel of 45,216 males in the UK Biobank Phase I dataset. Candidate PRSs were derived using six strategies: 1) pruning and thresholding (P + T): aggregation of independent polymorphisms that exceed a specified level of significance in the discovery genome‐wide association study (GWAS) (24 candidates); 2) LDpred computational algorithm, a Bayesian approach to calculate a posterior mean effect for all variants based on a prior (effect size in the prior GWAS) and subsequent shrinkage based on linkage disequilibrium (8 candidates); 3) AnnoPred (6 candidates); 4) LDpred‐funct (1 score); 5) EBPRS (1 score); and 6) revised P + T approach incorporating predicted gene expression (55 candidates for blood and 55 candidates for prostate tissue) and DNA methylation (55 candidates for blood). For each of the above categories, the optimal PRS was chosen based on the area under the receiver‐operator curve (AUC) in the UK Biobank tuning dataset (1,458 prevalent cases and 1,467 controls). We then derived the integrative model combining information from constructed scores (1,467 prevalent cases and 1,458 controls). We subsequently tested the model performance in an independent UK Biobank testing dataset (4,832 incident cases and 142,869 controls).
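The tuning step described in this caption, choosing the candidate with the highest AUC within a category, can be sketched as below. This is a minimal illustration under stated assumptions: the candidate names, scores, and labels are invented, and `auc` and `select_best_candidate` are hypothetical helpers rather than functions from the study's software.

```python
from typing import Dict, List


def auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the share of case-control pairs ranked correctly (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def select_best_candidate(candidates: Dict[str, List[float]], labels: List[int]) -> str:
    """Return the name of the candidate score with the highest tuning-set AUC."""
    return max(candidates, key=lambda name: auc(candidates[name], labels))


# Toy tuning set: three cases followed by three controls (invented values).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_candidates = {
    "p_1e-8": [0.9, 0.2, 0.6, 0.5, 0.3, 0.4],
    "p_1e-4": [0.9, 0.7, 0.6, 0.5, 0.3, 0.4],
}
best_name = select_best_candidate(toy_candidates, toy_labels)
```

Repeating this selection once per category would give one optimal score per strategy, which is the input the caption describes for the integrative model.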
FIGURE 2
Polygenic risk score assessment with incident cases. (A) Receiver operator characteristic curves and C statistics for different models in the independent testing dataset of 147,701 participants with 4,832 incident prostate cancer events. (B) The cumulative absolute risk of developing prostate cancer by quantiles of the overall polygenic score. The absolute risk was calculated based on UK incidence and mortality data and using the PRS relative risks estimated as described in the Material and Methods. The shaded area shows the 95% confidence interval. (C) The absolute risk of prostate cancer according to 100 groups of the testing cohort binned according to the percentile of the integrative polygenic risk score.
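The binning in panel C can be illustrated with a short sketch that sorts participants by score, splits them into equal-sized rank groups, and reports the observed fraction of events in each group. This is a simplification: the figure's absolute risks were calculated from UK incidence and mortality data, which this sketch does not model. The function name and toy values are hypothetical.

```python
from typing import List


def event_rate_by_score_bin(
    scores: List[float], events: List[int], n_bins: int
) -> List[float]:
    """Observed event fraction in each of n_bins equal-sized rank groups."""
    # Indices of participants ordered from lowest to highest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    counts = [0] * n_bins
    cases = [0] * n_bins
    for rank, i in enumerate(order):
        b = rank * n_bins // n  # bin index from 0 to n_bins - 1
        counts[b] += 1
        cases[b] += events[i]
    # Empty bins (possible when n_bins exceeds the sample size) yield NaN.
    return [
        cases[b] / counts[b] if counts[b] else float("nan")
        for b in range(n_bins)
    ]


# Toy cohort of eight participants split into four bins (invented values).
toy_rates = event_rate_by_score_bin(
    scores=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
    events=[0, 0, 0, 1, 0, 1, 1, 1],
    n_bins=4,
)
```

Setting `n_bins=100` on a large cohort would give percentile groups analogous to those in panel C.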
