Sci Rep. 2020 Aug 6;10(1):13190.

doi: 10.1038/s41598-020-69927-7.

Sibling validation of polygenic risk scores and complex trait prediction

Louis Lello¹ ², Timothy G Raben³, Stephen D H Hsu³ ⁴

Affiliations

¹ Department of Physics and Astronomy, Michigan State University, East Lansing, USA. lellolou@msu.edu.
² Genomic Prediction, Inc., North Brunswick, NJ, USA. lellolou@msu.edu.
³ Department of Physics and Astronomy, Michigan State University, East Lansing, USA.
⁴ Genomic Prediction, Inc., North Brunswick, NJ, USA.

PMID: 32764582
PMCID: PMC7411027
DOI: 10.1038/s41598-020-69927-7

Louis Lello et al. Sci Rep. 2020.

Abstract

We test 26 polygenic predictors using tens of thousands of genetic siblings from the UK Biobank (UKB), for whom we have SNP genotypes, health status, and phenotype information in late adulthood. Siblings have typically experienced similar environments during childhood and exhibit negligible population stratification relative to each other. Therefore, the ability to predict differences in disease risk or complex trait values between siblings is a strong test of genomic prediction in humans. We compare validation results obtained using non-sibling subjects to those obtained among siblings and find that most of the predictive power typically persists in between-sibling designs. For disease risk, we test the extent to which a higher polygenic risk score (PRS) identifies the affected sibling, and we also compute Relative Risk Reduction as a function of risk score threshold. For quantitative traits, we examine between-sibling differences in trait values as a function of predicted differences and compare with performance in non-sibling pairs. Example results: given one sibling with a normal-range PRS (< 84th percentile, < +1 SD) and one sibling with a high PRS (top few percentiles, i.e. > +2 SD), the predictors identify the affected sibling about 70-90% of the time across a variety of disease conditions, including Breast Cancer, Heart Attack, and Diabetes. The higher-PRS sibling is the case 55-65% of the time. For quantitative traits such as height, the predictor correctly identifies the taller sibling roughly 80% of the time when the (male) height difference is 2 inches or more.
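The two disease-risk statistics quoted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the `(z_case, z_control)` pair layout, the function names, and the toy z-scores are all hypothetical. It also assumes that the 55-65% figure refers to pairs compared without any threshold, which is one reading of the abstract.

```python
from typing import List, Optional, Tuple

# Each pair is (z_case, z_control): PRS z-scores of the affected and the
# unaffected sibling. Hypothetical layout, not the paper's data format.
Pair = Tuple[float, float]


def higher_prs_is_case_rate(pairs: List[Pair]) -> Optional[float]:
    """Fraction of pairs in which the affected sibling has the higher PRS."""
    informative = [(c, u) for c, u in pairs if c != u]  # drop exact ties
    if not informative:
        return None
    return sum(c > u for c, u in informative) / len(informative)


def high_vs_normal_call_rate(
    pairs: List[Pair], high_z: float = 2.0, normal_z: float = 1.0
) -> Optional[float]:
    """Among pairs with one high-risk sibling (z > high_z) and one
    normal-range sibling (z < normal_z), return the fraction in which
    the high-risk sibling is the affected one."""
    hits = total = 0
    for z_case, z_control in pairs:
        hi, lo = max(z_case, z_control), min(z_case, z_control)
        if hi > high_z and lo < normal_z:
            total += 1
            hits += z_case > z_control  # True counts as 1
    return hits / total if total else None


# Made-up z-scores, for illustration only.
toy_pairs = [(2.3, 0.1), (2.6, -0.4), (0.2, 2.1),
             (0.5, -0.3), (-0.2, 0.4), (1.4, 0.9)]

print(higher_prs_is_case_rate(toy_pairs))
print(high_vs_normal_call_rate(toy_pairs))
```

The thresholded rate uses only the pairs that straddle the two cut-offs, so it is computed on far fewer pairs than the unthresholded rate and its sampling error is correspondingly larger.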

Conflict of interest statement

Stephen Hsu is a shareholder of Genomic Prediction, Inc. (GP), and serves on its Board of Directors. Louis Lello is an employee and shareholder of GP. Tim Raben has no commercial interests relevant to the research.

Figures

**Figure 1**
The left and right panels show case and control distributions in PRS for the entire cohort of sibling pairs and the Affected Sibling Pair (ASP) cohort respectively. Phenotype is Hypertension. This plot was made using pyplot v3.2.1 under license https://matplotlib.org/3.2.1/users/license.html.

**Figure 2**
Predictors tested on random (non-sibling) pairs and affected sibling pairs with a single case. One individual is high risk (with z-score given on the horizontal axis) and the other is normal risk (PRS < + 1 SD). The error estimates are explained in the text. This plot was made using pyplot v3.2.1 under license https://matplotlib.org/3.2.1/users/license.html.

**Figure 3**
Exclusion of individuals above (left panel) and below (right panel) a z-score threshold (horizontal axis) with resulting group prevalence shown on the vertical axis. The left panel shows risk reduction in a low PRS population, the right panel shows risk enhancement in a high PRS population. Top figures are results in the general population, bottom figures are the Affected Sibling Pair (ASP) population (i.e., variation of risk with PRS among individuals with an affected sib). Phenotype is Type 2 Diabetes. This plot was made using pyplot v3.2.1 under license https://matplotlib.org/3.2.1/users/license.html.

**Figure 4**
Exclusion of individuals above (left panel) and below (right panel) a z-score threshold (horizontal axis) with resulting group prevalence shown on the vertical axis. The left panel shows risk reduction in a low PRS population, the right panel shows risk enhancement in a high PRS population. Top figures are results in the general population, bottom figures are the Affected Sibling Pair (ASP) population (i.e., variation of risk with PRS among individuals with an affected sib). Phenotype is Breast Cancer. This plot was made using pyplot v3.2.1 under license https://matplotlib.org/3.2.1/users/license.html.

**Figure 5**
Exclusion of individuals above (left panel) and below (right panel) a z-score threshold (horizontal axis) with resulting group prevalence shown on the vertical axis. The left panel shows risk reduction in a low PRS population, the right panel shows risk enhancement in a high PRS population. Top figures are results in the general population, bottom figures are the Affected Sibling Pair (ASP) population (i.e., variation of risk with PRS among individuals with an affected sib). Phenotype is Hypertension. This plot was made using pyplot v3.2.1 under license https://matplotlib.org/3.2.1/users/license.html.

**Figure 6**
Exclusion of individuals above (left panel) and below (right panel) a z-score threshold (horizontal axis) with resulting group prevalence shown on the vertical axis. The left panel shows risk reduction in a low PRS population, the right panel shows risk enhancement in a high PRS population. Top figures are results in the general population, bottom figures are the Affected Sibling Pair (ASP) population (i.e., variation of risk with PRS among individuals with an affected sib). Phenotype is Heart Attack. This plot was made using pyplot v3.2.1 under license https://matplotlib.org/3.2.1/users/license.html.

**Figure 7**
Difference in phenotype (vertical axis) and difference in polygenic score (horizontal axis) for pairs of individuals. Red dots are sibling pairs and blue dots are random (non-sibling) pairs. This plot was made using pyplot v3.2.1 under license https://matplotlib.org/3.2.1/users/license.html.

**Figure 8**
Probability of PGS correctly identifying the individual with larger phenotype value (vertical axis). Horizontal axis shows absolute difference in phenotypes. The blue line is for sibling pairs, the orange line is for randomized (non-sibling) pairs. This plot was made using pyplot v3.2.1 under license https://matplotlib.org/3.2.1/users/license.html.
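
The quantity plotted in Figure 8 can be approximated with the sketch below, which is not the authors' code. It keeps only pairs whose absolute phenotype difference reaches a minimum value and reports how often the polygenic score (PGS) ranks the two individuals in the same order as the phenotype. The `(score, phenotype)` tuple layout, the function name `ordering_accuracy`, and the toy heights are hypothetical.

```python
from typing import List, Optional, Tuple

Person = Tuple[float, float]  # (polygenic score, phenotype value)


def ordering_accuracy(
    pairs: List[Tuple[Person, Person]], min_abs_diff: float
) -> Optional[float]:
    """Fraction of pairs, among those whose phenotypes differ by at least
    min_abs_diff, where the higher-PGS individual has the larger phenotype."""
    correct = total = 0
    for (pgs_a, ph_a), (pgs_b, ph_b) in pairs:
        d_ph = ph_a - ph_b
        d_pgs = pgs_a - pgs_b
        # Skip pairs below the difference cut-off and uninformative ties.
        if abs(d_ph) < min_abs_diff or d_ph == 0 or d_pgs == 0:
            continue
        total += 1
        correct += (d_ph > 0) == (d_pgs > 0)  # True counts as 1
    return correct / total if total else None


# Made-up (score, height in inches) pairs, for illustration only.
toy = [
    ((0.8, 71.0), (0.1, 68.0)),
    ((-0.3, 66.0), (0.5, 69.5)),
    ((0.2, 72.0), (0.6, 69.0)),
    ((0.1, 68.0), (0.3, 68.5)),
    ((0.7, 69.0), (0.2, 70.0)),
]

print(ordering_accuracy(toy, min_abs_diff=2.0))  # pairs differing by >= 2 inches
print(ordering_accuracy(toy, min_abs_diff=0.0))  # all informative pairs
```

Running the same function separately on sibling pairs and on randomized non-sibling pairs would give the two curves compared in the figure. Note that the figure's horizontal axis may be binned rather than thresholded as it is here.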
