PLoS Genet. 2017 Jun 9;13(6):e1006836.
doi: 10.1371/journal.pgen.1006836. eCollection 2017 Jun.

Joint modeling of genetically correlated diseases and functional annotations increases accuracy of polygenic risk prediction

Yiming Hu et al. PLoS Genet. 2017.

Abstract

Accurate prediction of disease risk based on genetic factors is an important goal in human genetics research and precision medicine. Advanced prediction models will lead to more effective disease prevention and treatment strategies. Despite the identification of thousands of disease-associated genetic variants through genome-wide association studies (GWAS) in the past decade, the accuracy of genetic risk prediction remains moderate for most diseases, largely because of the challenges in both identifying all the functionally relevant variants and accurately estimating their effect sizes. In this work, we introduce PleioPred, a principled framework that leverages pleiotropy and functional annotations in genetic risk prediction for complex diseases. PleioPred uses GWAS summary statistics as its input, and jointly models multiple genetically correlated diseases and a variety of external information, including linkage disequilibrium and diverse functional annotations, to increase the accuracy of risk prediction. Through comprehensive simulations and real data analyses on Crohn's disease, celiac disease and type-II diabetes, we demonstrate that our approach can substantially increase the accuracy of polygenic risk prediction and risk population stratification. For example, PleioPred separates type-II diabetes patients with early and late ages of onset significantly better, illustrating its potential clinical application. Furthermore, we show that the increment in prediction accuracy is significantly correlated with the genetic correlation between the predicted and jointly modeled diseases.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Prediction accuracy of non-infinitesimal models in simulated data.
We trained the models with equal training sample sizes (N1 = N2 = 28068, right panel) and unequal training sizes (N1 = 5000, N2 = 10000, left panel). Prediction accuracy was measured by correlation between simulated traits and predicted PRS.
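The accuracy measure named in this caption (correlation between simulated traits and predicted PRS) can be illustrated with a minimal sketch. Everything below is hypothetical and is not the authors' simulation design: the helper names, the toy genotype coding (0/1/2 drawn uniformly), the effect-size distribution and the sample sizes are assumptions chosen only to keep the example small.

```python
import math
import random


def polygenic_score(genotypes, effects):
    """Weighted sum of allele counts per individual (a plain PRS)."""
    return [sum(g * b for g, b in zip(person, effects)) for person in genotypes]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n_people=500, n_snps=50, noise_sd=1.0, seed=0):
    """Toy simulation (assumed design): additive trait = PRS + Gaussian noise."""
    rng = random.Random(seed)
    true_effects = [rng.gauss(0.0, 0.3) for _ in range(n_snps)]
    genotypes = [[rng.choice((0, 1, 2)) for _ in range(n_snps)]
                 for _ in range(n_people)]
    genetic = polygenic_score(genotypes, true_effects)
    traits = [g + rng.gauss(0.0, noise_sd) for g in genetic]
    return genotypes, true_effects, traits


genotypes, true_effects, traits = simulate()
# Accuracy of a PRS built from the true effects; a PRS built from
# estimated effects would be scored the same way.
accuracy = pearson(polygenic_score(genotypes, true_effects), traits)
```

In practice the effect sizes would come from a trained model rather than from the simulation's ground truth, and the correlation would be computed on held-out samples.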
Fig 2. Evaluating effectiveness of annotations and per-SNP heritability.
(A, B) Comparing signal strengths of SNPs with high and low heritability of related diseases in independent validation cohorts. SNPs with higher heritability for the testing disease and SNPs with higher heritability for the related disease both have significantly stronger associations in two independent, well-powered testing datasets (N > 3,000): (A) Crohn’s disease; (B) celiac disease. P-values were calculated using a one-sided Kolmogorov-Smirnov test. (C, D) Comparing consistency of SNPs’ effect direction between training and testing datasets. Each bar quantifies the proportion of SNPs with consistent effect directions. P-values were calculated using a one-sided two-sample binomial test. (C) Crohn’s disease; (D) celiac disease.
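The direction-consistency comparison in panels C and D can be sketched as follows. The caption does not specify how the one-sided two-sample binomial test was computed, so the pooled two-proportion z-test below is an assumption standing in for it, and the function names are hypothetical.

```python
import math


def sign_consistency(train_effects, test_effects):
    """Count SNPs whose effect has the same sign in training and testing data."""
    if len(train_effects) != len(test_effects):
        raise ValueError("effect vectors must have the same length")
    agree = sum(1 for a, b in zip(train_effects, test_effects) if a * b > 0)
    return agree, len(train_effects)


def one_sided_two_proportion_z(x1, n1, x2, n2):
    """Pooled z-test of H1: p1 > p2 (normal approximation; an assumed
    stand-in for the caption's 'one-sided two-sample binomial test')."""
    pooled = (x1 + x2) / (n1 + n2)
    if pooled in (0.0, 1.0):
        raise ValueError("test is undefined when all outcomes are identical")
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (x1 / n1 - x2 / n2) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal probability
    return z, p_value


# Hypothetical counts: 60 of 100 high-heritability SNPs are consistent,
# compared with 50 of 100 low-heritability SNPs.
z, p = one_sided_two_proportion_z(60, 100, 50, 100)
```

The normal approximation is only reasonable for large SNP counts; an exact test would be preferable for small groups.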
Fig 3. Prediction accuracy of PleioPred-anno on T2D when jointly modeled with additional traits.
Genetic correlations were estimated using LDSC [28], and significant correlations are labeled in purple. The P-value and confidence region indicate a significant correlation between prediction accuracy and genetic correlation. A similar pattern was observed in infinitesimal and non-infinitesimal models without annotations (S1 Fig). AAM: age at menarche, AUT: autism spectrum, BIP: bipolar disorder, BMI: body mass index, BIL: birth length, BIW: birth weight, CHO: childhood obesity, CAD: coronary artery disease, FG: fasting glucose, HDL: HDL cholesterol, MDD: major depressive disorder, RA: rheumatoid arthritis, and SCZ: schizophrenia.
Fig 4. Comparing non-infinitesimal methods in different standards.
(A) Enrichment of the proportion of cases in testing samples with high PRS (top 1%, 5%, 10%, 20% and 30% risk groups stratified by PRS) in CD and CEL. (B) Distribution of age of onset of T2D in testing samples with high PRS (top 5%, 10%, 20% and 30% risk groups stratified by PRS). P-values were calculated using Wilcoxon rank test comparing the two-trait models with the one-trait models. The last column represents the overall age of onset in testing samples.
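The stratification in panel A can be illustrated with a short sketch that ranks samples by PRS and compares the case proportion in the top risk group with the overall proportion. The caption does not define the enrichment statistic, so the ratio used here is an assumption, and `case_enrichment` is a hypothetical helper name.

```python
def case_enrichment(scores, is_case, top_fraction):
    """Case proportion among the highest-PRS samples versus all samples.

    Returns (top_rate, overall_rate, ratio). The ratio is an assumed
    definition of enrichment, used for illustration only.
    """
    n = len(scores)
    if n == 0 or len(is_case) != n:
        raise ValueError("scores and is_case must be non-empty and equal length")
    k = max(1, int(round(n * top_fraction)))  # size of the top risk group
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    top_rate = sum(is_case[i] for i in ranked[:k]) / k
    overall_rate = sum(is_case) / n
    if overall_rate == 0:
        raise ValueError("no cases in the sample")
    return top_rate, overall_rate, top_rate / overall_rate


# Toy data: the two highest-scoring samples are the only cases.
scores = [0.9, 0.8, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.0]
is_case = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
top_rate, overall_rate, ratio = case_enrichment(scores, is_case, 0.20)
```

Calling the function at each cutoff from the caption (1%, 5%, 10%, 20%, 30%) would give one value per risk group, which is the kind of comparison the panel reports across methods.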

References

    1. Chatterjee N, Shi J, Garcia-Closas M. Developing and evaluating polygenic risk prediction models for stratified disease prevention. Nat Rev Genet. 2016;advance online publication. http://www.nature.com/nrg/journal/vaop/ncurrent/abs/nrg.2016.27.html-sup....
    2. Li C, Yang C, Gelernter J, Zhao H. Improving genetic risk prediction by leveraging pleiotropy. Human Genetics. 2014;133(5):639–50. doi: 10.1007/s00439-013-1401-5
    3. Maier R, Moser G, Chen G-B, Ripke S, Coryell W, Potash JB, et al. Joint analysis of psychiatric disorders increases accuracy of risk prediction for schizophrenia, bipolar disorder, and major depressive disorder. The American Journal of Human Genetics. 2015;96(2):283–94. doi: 10.1016/j.ajhg.2014.12.006
    4. Minnier J, Yuan M, Liu JS, Cai T. Risk classification with an adaptive naive bayes kernel machine model. Journal of the American Statistical Association. 2015;110(509):393–404. doi: 10.1080/01621459.2014.908778
    5. Speed D, Balding DJ. MultiBLUP: improved SNP-based prediction for complex traits. Genome Research. 2014;24(9):1550–7. doi: 10.1101/gr.169375.113