BMJ Med. 2023 Oct 17;2(1):e000554.
doi: 10.1136/bmjmed-2023-000554. eCollection 2023.

Performance of polygenic risk scores in screening, prediction, and risk stratification: secondary analysis of data in the Polygenic Score Catalog

Aroon D Hingorani et al. BMJ Med. 2023.

Abstract

Objective: To clarify the performance of polygenic risk scores in population screening, individual risk prediction, and population risk stratification.

Design: Secondary analysis of data in the Polygenic Score Catalog.

Setting: Polygenic Score Catalog, April 2022. Secondary analysis of 3915 performance metric estimates for 926 polygenic risk scores for 310 diseases to generate estimates of performance in population screening, individual risk, and population risk stratification.

Participants: Individuals contributing to the published studies in the Polygenic Score Catalog.

Main outcome measures: Detection rate for a 5% false positive rate (DR5) and the population odds of becoming affected given a positive result; individual odds of becoming affected for a person with a particular polygenic score; and odds of becoming affected for groups of individuals in different portions of a polygenic risk score distribution. Coronary artery disease and breast cancer were used as illustrative examples.

Results: For performance in population screening, median DR5 for all polygenic risk scores and all diseases studied was 11% (interquartile range 8-18%). Median DR5 was 12% (9-19%) for polygenic risk scores for coronary artery disease and 10% (9-12%) for breast cancer. The population odds of becoming affected given a positive result were 1:8 for coronary artery disease and 1:21 for breast cancer, with background 10 year odds of 1:19 and 1:41, respectively, which are typical for these diseases at age 50. For individual risk prediction, the corresponding 10 year odds of becoming affected for individuals aged 50 with a polygenic risk score at the 2.5th, 25th, 75th, and 97.5th centiles were 1:54, 1:29, 1:15, and 1:8 for coronary artery disease and 1:91, 1:56, 1:34, and 1:21 for breast cancer. In terms of population risk stratification, at age 50, the risk of coronary artery disease was divided into five groups, with 10 year odds of 1:41 and 1:11 for the lowest and highest quintile groups, respectively. The 10 year odds were 1:7 for the upper 2.5% of the polygenic risk score distribution for coronary artery disease, a group that contributed 7% of cases. The corresponding estimates for breast cancer were 1:72 and 1:26 for the lowest and highest quintile groups, and 1:19 for the upper 2.5% of the distribution, which contributed 6% of cases.

Conclusion: Polygenic risk scores performed poorly in population screening, individual risk prediction, and population risk stratification. Strong claims about the effect of polygenic risk scores on healthcare seem to be disproportionate to their performance.

Keywords: preventive medicine; public health.

Conflict of interest statement

Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/disclosure-of-interest/ and declare: support from the British Heart Foundation, University College London (UCL) National Institute for Health and Care Research (NIHR) Biomedical Research Centre, UK Research and Innovation (UKRI)/NIHR funded Multimorbidity Mechanism and Therapeutics Research Collaborative, and NIHR for the submitted work; ADH is a member of the advisory group for the Industrial Strategy Challenge Fund Accelerating Detection of Disease Challenge, and a co-opted member of the National Institute for Health and Care Excellence guideline update group for Cardiovascular disease: risk assessment and reduction, including lipid modification, CG181; ADH is a co-investigator on a grant from Pfizer to identify potential therapeutic targets for heart failure based on human genomics; NJW is a director of Polypill, a company that provides an online cardiovascular disease prevention service accessed on Polypill.com; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.

Figures

Figure 1
Derivation of metrics useful in assessing performance of polygenic risk scores in population screening, individual risk prediction, and population risk stratification. Difference in mean values for polygenic risk scores between affected and unaffected groups (and standard deviations) allows determination of overlap in polygenic risk score distributions between the two groups. In screening, likelihood ratio is the detection rate divided by the specified false positive rate (5%), which is the ratio of the shaded areas in the top panel. In individual risk prediction, likelihood ratio is the ratio of the heights of the distributions at a specified polygenic risk score (middle panel). In population risk stratification, likelihood ratio is the ratio of areas under the distributions for a specified group of the population (fourth quintile group in bottom panel). Multiplication of the likelihood ratio by the background odds of disease in the population (1:9) allows calculation of the odds of becoming affected for each patient
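
The three likelihood ratios described in this caption can be written out explicitly. The block below is a sketch under a simplifying assumption that is not stated in the abstract: scores are taken to be normally distributed with unit standard deviation in both groups, with mean 0 in the unaffected group and mean d in the affected group. The symbols d, c, s, a, and b are introduced here for illustration only.

```latex
% Assumption: S ~ N(0, 1) in unaffected, S ~ N(d, 1) in affected.
% \Phi and \varphi are the standard normal CDF and density.

% Screening: cut-off c chosen so that the false positive rate is fixed.
\mathrm{FPR} = 1 - \Phi(c), \qquad
\mathrm{DR} = 1 - \Phi(c - d), \qquad
\mathrm{LR}_{\text{screening}} = \frac{\mathrm{DR}}{\mathrm{FPR}}

% Individual risk prediction: ratio of density heights at score s.
\mathrm{LR}(s) = \frac{\varphi(s - d)}{\varphi(s)}
               = \exp\!\left(d\,s - \tfrac{1}{2} d^{2}\right)

% Population risk stratification: ratio of areas for a group with a < S \le b.
\mathrm{LR}(a, b) = \frac{\Phi(b - d) - \Phi(a - d)}{\Phi(b) - \Phi(a)}

% In every case the odds of becoming affected follow from the background odds.
\text{odds}_{\text{affected}} = \mathrm{LR} \times \text{odds}_{\text{background}}
```

With background odds of 1:9, for example, a likelihood ratio of 2 would give odds of 2:9, or 1:4.5.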
Figure 2
Performance in screening estimated for polygenic risk scores included in the Polygenic Score Catalog from April 2022. Limits of each box represent interquartile range and horizontal line within each box is estimated detection rate for a 5% false positive rate (DR5) based on performance metrics reported for corresponding polygenic risk scores. Selected diseases are colour coded into categories: cancers, cardiometabolic conditions, ocular diseases, allergic or autoimmune diseases, bone disease, and neuropsychiatric diseases. Horizontal line across the plot is estimated median DR5 value based on performance metrics for all 926 polygenic risk scores and all diseases studied in the Polygenic Score Catalog
Figure 3
Relative polygenic risk score distributions among those later affected or not by coronary artery disease and breast cancer. Mean value of polygenic risk score distribution in those later affected was shifted 0.48 standard deviation units to the right of the mean of the distribution for those who remained unaffected by coronary artery disease, and 0.37 standard deviation units to the right for breast cancer. Also shown are corresponding values for detection rate for a 5% false positive rate (DR5) and for odds ratios (rounded to the nearest whole number) for comparisons of top and bottom 1%, 5%, 10%, 20%, and 25% of unaffected polygenic risk score distribution
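
The link between the mean shift quoted in this caption and DR5 can be sketched in a few lines of Python. This is a minimal illustration, assuming unit-variance normal distributions in both groups (an assumption of the sketch, not something the abstract states). The function names are hypothetical. The background odds of 1:19 and 1:41 are the values given in the abstract.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the unaffected group


def detection_rate(shift_sd, false_positive_rate=0.05):
    """Proportion of affected individuals above the screening cut-off.

    Assumes the affected distribution is the unaffected one shifted right
    by `shift_sd` standard deviations (equal variances assumed).
    """
    cutoff = STD_NORMAL.inv_cdf(1 - false_positive_rate)
    return 1 - STD_NORMAL.cdf(cutoff - shift_sd)


def odds_given_positive(shift_sd, background_odds, false_positive_rate=0.05):
    """Background odds multiplied by the screening likelihood ratio (DR / FPR)."""
    likelihood_ratio = detection_rate(shift_sd, false_positive_rate) / false_positive_rate
    return likelihood_ratio * background_odds


# Shifts from the caption; background 10 year odds (1:n) from the abstract.
for disease, shift, n in [("coronary artery disease", 0.48, 19),
                          ("breast cancer", 0.37, 41)]:
    dr5 = detection_rate(shift)
    odds = odds_given_positive(shift, 1 / n)
    print(f"{disease}: DR5 = {dr5:.1%}, odds given positive = 1:{1 / odds:.1f}")
```

Under these assumptions the computed DR5 values fall close to the 12% and 10% reported in the abstract. Small differences in the resulting odds are to be expected, because the published figures are rounded.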
Figure 4
Likelihood ratios and 10 year odds of coronary artery disease and breast cancer for people aged 50 with a polygenic risk score result corresponding to 2.5th, 25th, 75th, and 97.5th centiles of the corresponding distribution
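
For individual risk prediction, the likelihood ratio at a given score is the ratio of the heights of the two distributions. The following sketch assumes unit-variance normal distributions and takes centiles from the unaffected distribution. Both are assumptions made here for illustration, and the function names are hypothetical. The shift of 0.48 standard deviations and the background odds of 1:19 for coronary artery disease come from the text.

```python
from math import exp
from statistics import NormalDist


def likelihood_ratio_at_centile(centile, shift_sd):
    """Ratio of affected to unaffected density heights at a centile.

    Centiles refer to the unaffected distribution, taken as standard normal;
    the affected distribution is assumed shifted right by `shift_sd`.
    """
    score = NormalDist().inv_cdf(centile / 100)
    # Closed form of pdf(score - shift_sd) / pdf(score) for unit variances.
    return exp(shift_sd * score - shift_sd ** 2 / 2)


def odds_denominator(likelihood_ratio, background_denominator):
    """Return x in the odds 1:x after scaling background odds 1:n by the ratio."""
    return background_denominator / likelihood_ratio


# Coronary artery disease: shift 0.48 SD, background 10 year odds 1:19.
for centile in (2.5, 25, 75, 97.5):
    lr = likelihood_ratio_at_centile(centile, 0.48)
    print(f"centile {centile}: LR = {lr:.2f}, odds = 1:{odds_denominator(lr, 19):.1f}")
```

The printed odds can be compared with those reported in the abstract. Exact agreement is not guaranteed, because the published values are rounded and the distributional assumptions here are a simplification.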
Figure 5
Likelihood ratios, odds, and number of affected and unaffected individuals for each quintile group in a hypothetical population of 100 000 individuals with a background 10 year odds of coronary artery disease of 1:19, and women with a 10 year odds of breast cancer of 1:41
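
A table of this kind can be approximated with a short calculation. The sketch below assumes unit-variance normal distributions and defines quintile groups on the unaffected distribution. Both choices are assumptions made for illustration, and `quintile_table` is a hypothetical helper, not the authors' method.

```python
from statistics import NormalDist


def quintile_table(shift_sd, background_denominator, population=100_000):
    """Likelihood ratio, odds, and counts for each quintile group.

    Background odds are 1:background_denominator. Quintile cut-offs are
    taken from the unaffected (standard normal) distribution.
    """
    std = NormalDist()
    affected_total = population / (background_denominator + 1)
    unaffected_total = population - affected_total

    cuts = [std.inv_cdf(k / 5) for k in range(1, 5)]
    cum_unaffected = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
    cum_affected = [0.0] + [std.cdf(z - shift_sd) for z in cuts] + [1.0]

    rows = []
    for i in range(5):
        frac_affected = cum_affected[i + 1] - cum_affected[i]
        frac_unaffected = cum_unaffected[i + 1] - cum_unaffected[i]
        lr = frac_affected / frac_unaffected  # ratio of areas for this group
        rows.append({
            "quintile": i + 1,
            "lr": lr,
            "odds_denominator": background_denominator / lr,
            "affected": frac_affected * affected_total,
            "unaffected": frac_unaffected * unaffected_total,
        })
    return rows


# Coronary artery disease: shift 0.48 SD, background 10 year odds 1:19.
for row in quintile_table(0.48, 19):
    print(f"Q{row['quintile']}: LR = {row['lr']:.2f}, "
          f"odds = 1:{row['odds_denominator']:.0f}, "
          f"affected = {row['affected']:.0f}, unaffected = {row['unaffected']:.0f}")
```

With odds of 1:19, a population of 100 000 contains 5000 individuals who become affected and 95 000 who do not. The sketch distributes these across the five groups according to the assumed distributions.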
Figure 6
Likelihood ratios and 10 year odds of coronary artery disease and breast cancer for people aged 50 comparing highest and lowest 2.5% of the unaffected polygenic risk score distributions
Figure 7
Estimated number of patients with breast cancer detected and missed, number of false positive results, and number of additional mammograms for a two stage screening test with a polygenic risk score (Polygenic Score Catalog identifier PGS000004) with a cut-off value at the unaffected 97.5th centile. Estimates are based on a hypothetical cohort of 100 000 women aged 40 with a background 10 year odds of breast cancer of 1:41. Performance of mammography in the detection of breast cancer uses estimates from the literature