Review
Biol Psychiatry. 2019 Jul 15;86(2):97-109.
doi: 10.1016/j.biopsych.2018.12.015. Epub 2018 Dec 28.

Predicting Polygenic Risk of Psychiatric Disorders

Alicia R Martin et al. Biol Psychiatry.

Abstract

Genetics provides two major opportunities for understanding human disease: as a transformative line of etiological inquiry and as a biomarker for heritable diseases. In psychiatry, biomarkers are very much needed for both research and treatment, given the heterogeneous populations identified by current phenomenologically based diagnostic systems. To date, however, useful and valid biomarkers have been scant owing to the inaccessibility and complexity of human brain tissue and consequent lack of insight into disease mechanisms. Genetic biomarkers are therefore especially promising for psychiatric disorders. Genome-wide association studies of common diseases have matured over the last decade, generating the knowledge base for increasingly informative individual-level genetic risk prediction. In this review, we discuss fundamental concepts involved in computing genetic risk with current methods, strengths and weaknesses of various approaches, assessments of utility, and applications to various psychiatric disorders and related traits. Although genetic risk prediction has become increasingly straightforward to apply and common in published studies, there are important pitfalls to avoid. At present, the clinical utility of genetic risk prediction is still low; however, there is significant promise for future clinical applications as the ancestral diversity and sample sizes of genome-wide association studies increase. We discuss emerging data and methods aimed at improving the value of genetic risk prediction for disentangling disease mechanisms and stratifying subjects for epidemiological and clinical studies. For all applications, it is absolutely critical that polygenic risk prediction is applied with appropriate methodology and control for confounding to avoid repeating some mistakes of the candidate gene era.

Keywords: Complex traits; Heritability; Liability threshold model; Polygenic risk scores; Population genetics; Precision medicine; Psychiatric disorders; Psychiatric genetics; Statistical genetics.


Figures

Figure 1. Proportion of DNA shared influences risk of heritable disease.
Relationship with schizophrenia patients predicts lifetime risk of schizophrenia in family members (adapted from Gottesman, 1991).
Figure 2. Normal genetic risk in a population with an additive genetic architecture.
A) Definition and illustration of polygenic risk score calculation. Using a set of existing GWAS summary statistics, the polygenic risk score is computed in a target cohort as $Y = \sum_{j=1}^{m} g_j \beta_j$, where j is a SNP in m independent SNPs associated with the phenotype of interest, g is the number of trait-increasing alleles for a particular SNP, and β is the corresponding GWAS effect size estimate. An LD clump is an associated locus with one or few causal loci but a linkage peak of associated variants due to LD correlation in the region. The signal-to-noise ratio can be tuned to maximize prediction accuracy in a target cohort by modifying the maximum p-value threshold for SNP inclusion.
B) Large numbers of SNPs contributing to complex traits can be modeled accurately with genetic liability as a normal distribution. Here, we demonstrate this by showing the genetic risk distribution for increasing numbers of SNPs with an allele frequency of 0.5 (although normality is expected regardless of allele frequency when larger numbers of SNPs are causal). The best-powered GWAS of complex traits such as height, schizophrenia, and educational attainment have identified hundreds to thousands of independent, genome-wide significant loci. This phenomenon can be explained by the central limit theorem as demonstrated previously (107).
C) Additive GWAS regression models tend to work well for genetic associations across a range of allele frequencies, even in the presence of dominance.
D) Previous work in the UK Biobank has demonstrated that across 25 complex traits and diseases, most of the heritable variation in complex traits can be explained by common variants (e.g., ≥ 5% allele frequency). While the exact proportion can vary, the curve illustrates $h^2 = 2p(1-p)^{\alpha}$, where α = −0.38 ± 0.02 across complex traits (108).
E) A previous GWAS of schizophrenia identified 128 independent genome-wide significant loci, shown here (11). These loci illustrate a relationship between frequency and corresponding odds ratios, in which lower frequency variants can have larger effect sizes, reflecting the impact of natural selection on genetic architecture.
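The score defined in panel A can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `polygenic_risk_score`, the four SNPs, and their effect sizes and p-values are all hypothetical, and the SNPs are assumed to be independent already (that is, LD clumping has been done upstream).

```python
def polygenic_risk_score(genotypes, betas, p_values, p_threshold=5e-8):
    """Compute Y = sum_j g_j * beta_j over SNPs passing a p-value threshold.

    genotypes   -- per-SNP counts (0, 1, or 2) of the trait-increasing allele
    betas       -- per-SNP GWAS effect size estimates
    p_values    -- per-SNP GWAS p-values, used only to decide inclusion
    p_threshold -- maximum p-value for a SNP to enter the score
    """
    if not (len(genotypes) == len(betas) == len(p_values)):
        raise ValueError("inputs must have one entry per SNP")
    return sum(
        g * b
        for g, b, p in zip(genotypes, betas, p_values)
        if p <= p_threshold
    )


# Hypothetical summary statistics for four independent SNPs (illustration only).
betas = [0.10, 0.05, 0.20, 0.02]
p_values = [1e-9, 3e-4, 2e-8, 0.2]
genotypes = [2, 1, 0, 1]  # one individual's allele counts

# A strict threshold keeps only genome-wide significant SNPs;
# a lenient one admits more SNPs, trading signal for noise.
strict = polygenic_risk_score(genotypes, betas, p_values, p_threshold=5e-8)
lenient = polygenic_risk_score(genotypes, betas, p_values, p_threshold=1e-3)
```

In practice the threshold would be tuned in a target cohort, as the caption describes; here it is simply a parameter that changes which SNPs contribute to the sum.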
Figure 3. Correlated epidemiological and genetic factors can be causally dissected with Mendelian randomization.
GRS = genetic risk scores from independent genome-wide significant SNPs. BMI = body mass index, LDL-C = LDL cholesterol, HDL-C = HDL cholesterol, CHD = coronary heart disease. A) Epidemiological factors from FINRISK and their associations with risk of CHD after all evaluated factors have been normalized, and age and sex have been regressed out. Test statistics for each of the panel comparisons are written in plot corners and are as follows (t-tests for the top 3 panels, ANOVA for the bottom 3 panels): BMI: p=3.5e-18; LDL: p=0.79; HDL: p=5.3e-62; LDL and BMI: p=2.7e-73, HDL and BMI: p<1e-100, HDL and LDL: p=2.0e-7. B) Genetic factors associated with LDL-C, HDL-C, and BMI enable causal inference for CHD. Whereas genetic risk of increased LDL-C and BMI are causally associated with increased risk of CHD, HDL-C is genetically anti-correlated with CHD but is not causal. Genetic correlations (ρg) are from LD Hub (12).
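One common way to turn independent genome-wide significant SNPs into a causal estimate is the fixed-effect inverse-variance weighted (IVW) Mendelian randomization estimator. The sketch below is an assumption on our part: the caption does not say which estimator was used for this figure, and the function name `ivw_estimate` and the three SNP effects are hypothetical values chosen only for illustration.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted Mendelian randomization estimate.

    beta_exposure -- per-SNP effects on the exposure (e.g., LDL-C)
    beta_outcome  -- per-SNP effects on the outcome (e.g., CHD)
    se_outcome    -- per-SNP standard errors of the outcome effects
    Returns (causal effect estimate, standard error).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("inputs must have one entry per SNP")
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical effects for three independent SNPs (illustration only).
# Each outcome effect is half the exposure effect, so the estimate is 0.5.
beta_exposure = [0.10, 0.20, 0.15]
beta_outcome = [0.05, 0.10, 0.075]
se_outcome = [0.01, 0.01, 0.01]

estimate, std_error = ivw_estimate(beta_exposure, beta_outcome, se_outcome)
```

The estimator assumes the SNPs affect the outcome only through the exposure; when that fails, a genetic correlation can exist without a causal effect, as the caption notes for HDL-C.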
Figure 4. Predictive accuracy of polygenic risk scores for height at intervals along the measured height distribution in the UK Biobank.
Using summary statistics from the GIANT Consortium, we computed polygenic risk scores for height and compared them to the distribution of standardized height in the UK Biobank after adjusting for sex and the first 10 principal components. However, prediction accuracy is not distributed evenly; it performs particularly poorly at the extreme short end of the height distribution, indicating a larger contribution of environmental factors, large-effect rare variants, and/or other factors in these individuals. Numbers in the plot indicate observed versus expected polygenic risk scores within corresponding breakpoints along the adjusted height distribution. Expected polygenic risk scores come from multivariate normal simulations assuming the same correlation between adjusted height and observed polygenic risk scores.
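The expected scores described above can be approximated with a small bivariate normal simulation. The sketch below is a minimal illustration, not the authors' code: the correlation `r = 0.5`, the breakpoints, the sample size, and the function name `expected_prs_by_bin` are hypothetical, since the caption does not report the values used.

```python
import random
from bisect import bisect_right


def expected_prs_by_bin(r, breakpoints, n=200_000, seed=0):
    """Mean simulated PRS within each bin of a standardized trait.

    Draws (trait, prs) pairs from a standard bivariate normal with
    correlation r, assigns each pair to a trait bin defined by the sorted
    breakpoints, and returns the mean PRS per bin (lowest bin first).
    """
    rng = random.Random(seed)
    sums = [0.0] * (len(breakpoints) + 1)
    counts = [0] * (len(breakpoints) + 1)
    noise_sd = (1.0 - r * r) ** 0.5
    for _ in range(n):
        trait = rng.gauss(0.0, 1.0)
        # Conditional on the trait, the PRS has mean r * trait
        # and variance 1 - r^2.
        prs = r * trait + noise_sd * rng.gauss(0.0, 1.0)
        i = bisect_right(breakpoints, trait)
        sums[i] += prs
        counts[i] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


# Hypothetical breakpoints along the standardized (adjusted) height scale.
breakpoints = [-2.0, -1.0, 0.0, 1.0, 2.0]
expected = expected_prs_by_bin(r=0.5, breakpoints=breakpoints)
```

Comparing observed mean scores in each bin against these simulated means would show where prediction departs from the bivariate normal expectation, such as the extreme short end of the distribution described in the caption.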
Figure 5. Genetic correlation between psychiatric disorders and cognitive/behavioral phenotypes.
Measures are from LD Hub (12). ASD = autism spectrum disorders, BIP = bipolar disorder, SCZ = schizophrenia. Legend indicates genetic correlation, ρg.

References

    1. Fisher RA (1918): XV.—The Correlation between Relatives on the Supposition of Mendelian Inheritance. Transactions of the Royal Society of Edinburgh. 52: 399–433.
    2. Pearson K, Lee A (1900): Mathematical Contributions to the Theory of Evolution. VIII. On the Inheritance of Characters not Capable of Exact Quantitative Measurement. Part I. Introductory. Part II. On the Inheritance of Coat-Colour in Horses. Part III. On the Inheritance of Eye-Colour in Man. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences. 195: 79–150.
    3. Wright S (1934): An Analysis of Variability in Number of Digits in an Inbred Strain of Guinea Pigs. Genetics. 19: 506–536.
    4. Falconer DS (1965): The inheritance of liability to certain diseases, estimated from the incidence among relatives. Annals of Human Genetics, 2nd ed. 29: 51–76.
    5. Gottesman II, Shields J (1967): A polygenic theory of schizophrenia. Proc Natl Acad Sci USA. 58: 199–205.
