J Am Stat Assoc. 2021;116(533):133-143.
doi: 10.1080/01621459.2020.1764849. Epub 2020 Oct 12.

A penalized regression framework for building polygenic risk models based on summary statistics from genome-wide association studies and incorporating external information

Ting-Huei Chen et al. J Am Stat Assoc. 2021.

Abstract

Large-scale genome-wide association studies (GWAS) provide opportunities for developing genetic risk prediction models that have the potential to improve disease prevention, intervention or treatment. The key step is to develop polygenic risk score (PRS) models with high predictive performance for a given disease, which typically requires a large training data set for selecting truly associated single nucleotide polymorphisms (SNPs) and estimating effect sizes accurately. Here, we develop a comprehensive penalized regression framework for fitting ℓ1-regularized regression models to GWAS summary statistics. We propose incorporating Pleiotropy and ANnotation information into PRS (PANPRS) development through suitable formulation of penalty functions and associated tuning parameters. Extensive simulations show that PANPRS performs as well as or better than existing PRS methods when no functional annotation or pleiotropy is incorporated. When functional annotation data and pleiotropy are informative, PANPRS substantially outperforms existing PRS methods in simulations. Finally, we applied our methods to build PRS for type 2 diabetes and melanoma and found that incorporating relevant functional annotations and GWAS of genetically related traits improved prediction of these two complex diseases.
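
To make the idea of fitting an ℓ1-regularized regression to summary statistics concrete, here is a minimal sketch of a generic summary-statistic Lasso solved by coordinate descent. It is not the PANPRS algorithm itself: it has no annotation or pleiotropy penalties. It assumes standardized genotypes and phenotype, so that the marginal GWAS estimates can be treated as correlations `r` and the linkage disequilibrium (LD) matrix `R` comes from a reference panel. The function names and the single tuning parameter `lam` are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def soft_threshold(z, lam):
    """Soft-thresholding operator used by Lasso coordinate descent."""
    return np.sign(z) * max(abs(z) - lam, 0.0)


def summary_stat_lasso(r, R, lam, n_iter=200, tol=1e-8):
    """Minimize 0.5 * b'Rb - b'r + lam * ||b||_1 by coordinate descent.

    r   : marginal SNP-trait correlations (hypothetical summary statistics)
    R   : SNP-SNP correlation (LD) matrix, e.g. from a reference panel
    lam : l1 penalty weight (assumed single tuning parameter)
    """
    r = np.asarray(r, dtype=float)
    R = np.asarray(R, dtype=float)
    beta = np.zeros(len(r))
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(len(r)):
            # Correlation of SNP j with the residual, leaving SNP j itself out
            z = r[j] - R[j] @ beta + R[j, j] * beta[j]
            new_beta = soft_threshold(z, lam) / R[j, j]
            max_change = max(max_change, abs(new_beta - beta[j]))
            beta[j] = new_beta
        if max_change < tol:
            break
    return beta


# Toy usage: three SNPs with mild LD between the first two
r_demo = np.array([0.5, -0.2, 0.05])
R_demo = np.array([[1.0, 0.3, 0.0],
                   [0.3, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
beta_demo = summary_stat_lasso(r_demo, R_demo, lam=0.1)
```

Only `r` and `R` enter the objective, which is why no individual-level genotype data are needed. When the SNPs are uncorrelated (`R` is the identity), each estimate reduces to the soft-thresholded marginal correlation.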

Keywords: Genome wide association study; Lasso; genetic pleiotropy; genetic risk prediction; polygenic risk score; summary statistics.

Figures

Figure 1.
Concordance between Lasso and PANPRS. Figures A, B and C are for a quantitative trait (linear regression). Figures D, E and F are for a binary trait (logistic regression). For Figures A, B, D and E, the x-axis shows the β values estimated by Lasso with individual-level data; the y-axis shows the β values estimated by PANPRS. For Figures C and F, the x-axis shows the number of nonzero regression coefficient estimates in the Lasso model; the y-axis shows the correlation between the β values estimated by Lasso and by PANPRS. Figures A and D: Numerical experiment for M = 200 SNPs on chromosome 1. Df denotes the number of nonzero coefficients in the Lasso estimates. Figures B and E: Numerical experiment for M = 213,240 SNPs.
