Deep neural network improves the estimation of polygenic risk scores for breast cancer
- PMID: 33009504
- DOI: 10.1038/s10038-020-00832-7
Abstract
Polygenic risk scores (PRS) estimate the genetic risk of an individual for a complex disease based on many genetic variants across the whole genome. In this study, we compared a series of computational models for estimation of breast cancer PRS. A deep neural network (DNN) was found to outperform alternative machine learning techniques and established statistical algorithms, including BLUP, BayesA, and LDpred. In the test cohort with 50% prevalence, the area under the receiver operating characteristic curve (AUC) was 67.4% for DNN, 64.2% for BLUP, 64.5% for BayesA, and 62.4% for LDpred. BLUP, BayesA, and LDpred all generated PRS that followed a normal distribution in the case population. However, the PRS generated by DNN in the case population followed a bimodal distribution composed of two normal distributions with distinctly different means. This suggests that DNN was able to separate the case population into a high-genetic-risk case subpopulation with an average PRS significantly higher than the control population and a normal-genetic-risk case subpopulation with an average PRS similar to the control population. This allowed DNN to achieve 18.8% recall at 90% precision in the test cohort with 50% prevalence, which can be extrapolated to 65.4% recall at 20% precision in a general population with 12% prevalence. Interpretation of the DNN model identified salient variants that were assigned insignificant p values by association studies, but were important for DNN prediction. These variants may be associated with the phenotype through nonlinear relationships.
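The abstract reports precision and recall at two prevalences. Precision depends on prevalence even when the classifier and its score threshold do not change. The sketch below is a minimal illustration of that dependence, not the authors' extrapolation procedure: the function name `precision_at_prevalence` is hypothetical, and it assumes the threshold is held fixed so that recall and false-positive rate carry over unchanged to the new population.

```python
def precision_at_prevalence(recall, precision, prevalence_from, prevalence_to):
    """Re-express precision at a new prevalence for a fixed score threshold.

    Assumption (not from the source): recall (true-positive rate) and
    false-positive rate are properties of the threshold and stay the same
    when only the case/control mix of the population changes.
    """
    # True positives and false positives per unit of the source population.
    tp = recall * prevalence_from
    fp = tp * (1.0 - precision) / precision
    # False-positive rate implied by the reported operating point.
    fpr = fp / (1.0 - prevalence_from)
    # Apply the same recall and false-positive rate to the target population.
    tp_new = recall * prevalence_to
    fp_new = fpr * (1.0 - prevalence_to)
    return tp_new / (tp_new + fp_new)


# Operating point reported in the abstract: 18.8% recall at 90% precision
# in a cohort with 50% prevalence, moved to a 12% prevalence population.
p = precision_at_prevalence(0.188, 0.90, 0.50, 0.12)
print(round(p, 3))
```

Under these assumptions the same threshold would give roughly 55% precision at 12% prevalence. The 65.4% recall at 20% precision quoted in the abstract corresponds to a different, lower threshold; locating it requires the full PRS distributions, which the abstract does not provide.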