Am J Med Genet B Neuropsychiatr Genet. 2019 Jan;180(1):80-85.
doi: 10.1002/ajmg.b.32705. Epub 2018 Dec 4.

Predictive modeling of schizophrenia from genomic data: Comparison of polygenic risk score with kernel support vector machines approach

Timothy Vivian-Griffiths et al. Am J Med Genet B Neuropsychiatr Genet. 2019 Jan.

Abstract

A major controversy in psychiatric genetics is whether nonadditive genetic interaction effects contribute to the risk of highly polygenic disorders. We applied a support vector machines (SVMs) approach, which can build linear and nonlinear models using kernel methods, to distinguish cases from controls in a large schizophrenia case-control sample of 11,853 subjects (5,554 cases and 6,299 controls), and compared its prediction accuracy with that of the polygenic risk score (PRS) approach. We also investigated whether SVMs are a suitable approach for detecting nonlinear genetic effects, that is, interactions. We found that PRS provided more accurate case/control classification than either linear or nonlinear SVMs, and we give a tentative explanation of why PRS outperforms both multivariate regression and linear kernel SVMs. In addition, we observed that nonlinear kernel SVMs showed higher classification accuracy than linear SVMs when a large number of SNPs were entered into the model. We conclude that SVMs are a potential tool for assessing the presence of interactions, prior to searching for them explicitly.
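As a rough illustration of the two quantities the abstract compares, the sketch below computes a polygenic risk score as a weighted sum of risk-allele counts and evaluates case/control separation with an AUC-ROC. It is not the authors' pipeline: the function names, the simulated genotypes, the single shared odds ratio, and the allele frequency are all hypothetical, and the SVM models are left out.

```python
import math
import random


def polygenic_risk_score(genotype, weights):
    """Weighted sum of risk-allele counts (0, 1 or 2 per SNP)."""
    return sum(w * g for w, g in zip(weights, genotype))


def auc_roc(case_scores, control_scores):
    """Fraction of case/control pairs where the case scores higher (ties count half)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def simulate_genotype(rng, allele_freqs):
    """Draw one genotype: two independent alleles per SNP (an assumption)."""
    return [(rng.random() < p) + (rng.random() < p) for p in allele_freqs]


def simulate_auc(n_per_group=200, n_snps=20, odds_ratio=1.1, maf=0.2, seed=0):
    """Simulate cases and controls, score them with a PRS, return the AUC."""
    rng = random.Random(seed)
    control_freqs = [maf] * n_snps
    # Hypothetical allelic model: the risk-allele odds in cases are the
    # control odds multiplied by the odds ratio.
    odds = odds_ratio * maf / (1.0 - maf)
    case_freqs = [odds / (1.0 + odds)] * n_snps
    # Each SNP is weighted by its log odds ratio.
    weights = [math.log(odds_ratio)] * n_snps
    cases = [polygenic_risk_score(simulate_genotype(rng, case_freqs), weights)
             for _ in range(n_per_group)]
    controls = [polygenic_risk_score(simulate_genotype(rng, control_freqs), weights)
                for _ in range(n_per_group)]
    return auc_roc(cases, controls)


if __name__ == "__main__":
    print(f"Simulated PRS AUC: {simulate_auc():.3f}")
```

Because every simulated SNP shares one weight here, the score reduces to a scaled risk-allele count; with per-SNP effect sizes the same functions apply unchanged.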

Keywords: polygenic risk score; schizophrenia; support vector machines.

Conflict of interest statement

The authors have no conflict of interest to declare.

Figures

Figure 1
Box plots of the distribution of prediction accuracy (AUC‐ROC score, y‐axis) of PRS and SVM algorithms in Batch 1 data and in the combined data (Batch 1 + 2) using 125 GWAS‐significant SNPs. In each box plot, the horizontal line is the median, the boundaries of the box are the first and third quartiles, and the extremes are the minimum and maximum values in the sample [Color figure can be viewed at wileyonlinelibrary.com]
Figure 2
The distribution of prediction accuracy of PRS and SVM models in Batch 1 data and in the combined data (Batch 1 + 2) using 4,998 SNPs [Color figure can be viewed at wileyonlinelibrary.com]
Figure 3
Correlation coefficient r for two independent SNPs in a case/control sample, for varying ORs and MAF. Values are shown for all data (black) and for cases (red) and controls (green) separately, with MAF = 0.2 (solid) and MAF = 0.3 (dashed) [Color figure can be viewed at wileyonlinelibrary.com]
Figure 4
Comparison of case/control association p‐values when two SNPs are included as one predictor variable (PRS) in logistic regression (x‐axis) and separately (y‐axis)
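The box plots in Figures 1 and 2 are described as showing the median, the first and third quartiles, and the minimum and maximum of the AUC values. A minimal sketch of that five-number summary follows; the function name and the sample AUC values are made up for illustration, and the quartile convention (the `inclusive` method of Python's `statistics.quantiles`) is an assumption, since the source does not say which one was used.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


if __name__ == "__main__":
    # Hypothetical AUC values, not taken from the paper.
    aucs = [0.58, 0.60, 0.61, 0.59, 0.62, 0.60, 0.57]
    print(five_number_summary(aucs))
```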
