Nat Med. 2024 Feb;30(2):480-487.
doi: 10.1038/s41591-024-02796-z. Epub 2024 Feb 19.

Selection, optimization and validation of ten chronic disease polygenic risk scores for clinical implementation in diverse US populations

Niall J Lennon, Leah C Kottyan, Christopher Kachulis, Noura S Abul-Husn, Josh Arias, Gillian Belbin, Jennifer E Below, Sonja I Berndt, Wendy K Chung, James J Cimino, Ellen Wright Clayton, John J Connolly, David R Crosslin, Ozan Dikilitas, Digna R Velez Edwards, QiPing Feng, Marissa Fisher, Robert R Freimuth, Tian Ge, GIANT Consortium, All of Us Research Program, Joseph T Glessner, Adam S Gordon, Candace Patterson, Hakon Hakonarson, Maegan Harden, Margaret Harr, Joel N Hirschhorn, Clive Hoggart, Li Hsu, Marguerite R Irvin, Gail P Jarvik, Elizabeth W Karlson, Atlas Khan, Amit Khera, Krzysztof Kiryluk, Iftikhar Kullo, Katie Larkin, Nita Limdi, Jodell E Linder, Ruth J F Loos, Yuan Luo, Edyta Malolepsza, Teri A Manolio, Lisa J Martin, Li McCarthy, Elizabeth M McNally, James B Meigs, Tesfaye B Mersha, Jonathan D Mosley, Anjene Musick, Bahram Namjou, Nihal Pai, Lorenzo L Pesce, Ulrike Peters, Josh F Peterson, Cynthia A Prows, Megan J Puckelwartz, Heidi L Rehm, Dan M Roden, Elisabeth A Rosenthal, Robb Rowley, Konrad Teodor Sawicki, Daniel J Schaid, Roelof A J Smit, Johanna L Smith, Jordan W Smoller, Minta Thomas, Hemant Tiwari, Diana M Toledo, Nataraja Sarma Vaitinadin, David Veenstra, Theresa L Walunas, Zhe Wang, Wei-Qi Wei, Chunhua Weng, Georgia L Wiesner, Xianyong Yin, Eimear E Kenny

Niall J Lennon et al. Nat Med. 2024 Feb.

Abstract

Polygenic risk scores (PRSs) have improved in predictive performance, but several challenges remain to be addressed before PRSs can be implemented in the clinic, including reduced predictive performance of PRSs in diverse populations, and the interpretation and communication of genetic results to both providers and patients. To address these challenges, the National Human Genome Research Institute-funded Electronic Medical Records and Genomics (eMERGE) Network has developed a framework and pipeline for return of a PRS-based genome-informed risk assessment to 25,000 diverse adults and children as part of a clinical study. From an initial list of 23 conditions, ten were selected for implementation based on PRS performance, medical actionability and potential clinical utility, including cardiometabolic diseases and cancer. Standardized metrics were considered in the selection process, with additional consideration given to strength of evidence in African and Hispanic populations. We then developed a pipeline for clinical PRS implementation (score transfer to a clinical laboratory, validation and verification of score performance), and used genetic ancestry to calibrate PRS mean and variance, utilizing genetically diverse data from 13,475 participants of the All of Us Research Program cohort to train and test model parameters. Finally, we created a framework for regulatory compliance and developed a PRS clinical report for return to providers and for inclusion in an additional genome-informed risk assessment. The initial experience from eMERGE can inform the approach needed to implement PRS-based testing in diverse clinical settings.

Conflict of interest statement

N.S.A.-H. is an employee and equity holder of 23andMe; serves as a scientific advisory board member for Allelica, Inc; received personal fees from Genentech Inc, Allelica Inc, and 23andMe; received research funding from Akcea Therapeutics; and was previously employed by Regeneron Pharmaceuticals. E.E.K. received personal fees from Illumina Inc, 23andMe and Regeneron Pharmaceuticals and serves as a scientific advisory board member for Encompass Bioscience, Foresite Labs and Galateo Bio. J.N.H. has equity in Camp4 Therapeutics and has been a consultant to Amgen, AstraZeneca, Cytokinetics, PepGen, Pfizer and Tenaya Therapeutics and is the founder of Ikaika Therapeutics. J.F.P. is a paid consultant for Natera Inc. A. Khera is an employee of Verve Therapeutics. N.L. received personal fees from Illumina Inc and is a scientific advisory board member for FYR Diagnostics. J.F.P. is a consultant for Myome. D.V. is a consultant for Illumina and has grant support from GeneDx. T.L.W. has grant funding from Gilead Sciences, Inc. The other authors declare no competing interests.

Figures

Fig. 1. Timeline and process overview.
a, Timeline and process for selection, evaluation, optimization, transfer, validation and implementation of the clinical PRS test pipeline. Dashed lines represent pivotal moments in the progression of the project with duration between these events indicated in months (mo) above the blue arrow. Numbers in white represent the number of conditions being examined at each stage and their fates. List of ten conditions on the right-hand side indicates the conditions that were implemented in the clinical pipeline for this study. b, Overview of the eMERGE PRS process. Participant DNA is genotyped using the Illumina Global Diversity Array, which assesses 1.8 million sites. Genotyping data are phased and imputed with a reference panel derived from the 1,000 Genomes Project. For each participant, raw PRSs are calculated for each condition (PRSraw). Each participant’s genetic ancestry is algorithmically determined in the projection step. For each condition, an ancestry calibration model is applied to each participant’s z-scores based on model parameters derived from the All of Us Research Program (Calibration) and an adjusted z-score is calculated (PRSadjusted). Participants whose adjusted scores cross the predefined threshold for high PRS are identified and a PDF report is generated. The report is electronically signed after data review by a clinical laboratory director and delivered to the study portal for return to the clinical sites.
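The scoring and flagging steps in panel b can be illustrated with a minimal Python sketch. It assumes the usual definition of a raw PRS as a weighted sum of effect-allele dosages, and it assumes that the percentile cutoff is converted to a z-score cutoff through the standard normal quantile; the caption does not state how the threshold is applied, so that conversion is a guess. All function names, weights and dosages are hypothetical.

```python
from statistics import NormalDist


def raw_prs(dosages, weights):
    """Weighted sum of effect-allele dosages (0 to 2) over the score's sites."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def adjusted_z(prs_raw, expected_mean, expected_sd):
    """Standardize a raw score against an ancestry-specific mean and SD.

    In the study these parameters come from a calibration model; here they
    are simply passed in as numbers.
    """
    return (prs_raw - expected_mean) / expected_sd


def is_high_prs(z, percentile_threshold):
    """Assumed rule: flag when z exceeds the standard-normal quantile of the cutoff."""
    cutoff = NormalDist().inv_cdf(percentile_threshold / 100.0)
    return z > cutoff


# Hypothetical three-site score; imputed dosages may be fractional.
weights = [0.12, -0.05, 0.30]
dosages = [2, 1, 1.4]

score = raw_prs(dosages, weights)
z = adjusted_z(score, expected_mean=0.30, expected_sd=0.15)
flag = is_high_prs(z, percentile_threshold=97)
```

A real score would span many more sites (see the ‘Number of SNPs’ column in Fig. 2), but the arithmetic per participant is the same.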
Fig. 2. Summary of the ten conditions that were implemented.
‘High-PRS threshold’ represents the percentile that is deemed to be the cutoff for a specific condition above which a high-PRS result is reported for that condition. Odds ratios are reported as the mean odds ratios (square dot) associated with having a score above the specified threshold, compared to having a score below the specified threshold, along with 95% confidence intervals (CIs), shown in the whiskers. The number of case and control samples used to derive these odds ratios and CIs for each condition can be found in Supplementary Table 2. Note that the odds ratio for obesity is not reported here, as it will be published by the Genetic Investigation of ANthropometric Traits consortium (Smit et al., manuscript in preparation). ‘Number of SNPs’ represents the range of numbers of sites included in each score. ‘Age ranges for return’ indicates the participant ages at which a PRS is calculated for a given condition. AFIB, atrial fibrillation; BC, breast cancer; CKD, chronic kidney disease; CHD, coronary heart disease; HC, hypercholesterolemia; PC, prostate cancer; T1D, type 1 diabetes; T2D, type 2 diabetes.
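For readers unfamiliar with the quantity plotted in Fig. 2, the following Python sketch computes an odds ratio for scoring above versus below a threshold from a 2×2 table, with a Wald 95% CI on the log odds ratio. The caption does not say how the study's odds ratios and CIs were estimated, so the Wald interval is an assumption, and the counts are invented for illustration.

```python
import math


def odds_ratio_ci(cases_above, controls_above, cases_below, controls_below, z=1.96):
    """Odds ratio (above vs. below threshold) with a Wald CI on the log scale."""
    odds_ratio = (cases_above * controls_below) / (controls_above * cases_below)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(
        1 / cases_above + 1 / controls_above + 1 / cases_below + 1 / controls_below
    )
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from Supplementary Table 2.
or_est, ci_low, ci_high = odds_ratio_ci(
    cases_above=60, controls_above=240, cases_below=940, controls_below=8760
)
```

An interval whose lower bound stays above 1 indicates that a high-PRS result is associated with increased odds of the condition in that sample.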
Fig. 3. Summary of the first 2,500 eMERGE participants processed through the clinical pipeline.
a, PCA of ancestry indicating participants with a result of ‘high PRS’ for any condition (red dots) compared to participants who did not have a high PRS identified (gray dots). b, Summary of the number of high-risk conditions found per participant. c, Observed numbers of high PRS called per condition compared to the expected numbers of high PRS per condition. P values are two-sided P values calculated by simulation to account for the uncertainty in the All of Us (AoU) derived ancestry calibration parameters due to the finite size of the AoU training cohort, and further adjusted for multiple hypothesis testing using the Holm–Šidák procedure. Note that not all participants are scored for every condition, because age and sex-at-birth filters apply.
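The Holm–Šidák adjustment mentioned for panel c can be sketched in Python as follows. This is the standard step-down form of the procedure, not code from the study, and the input P values are hypothetical; the simulation that produces the unadjusted P values is not reproduced here.

```python
def holm_sidak(p_values):
    """Return Holm-Sidak adjusted P values in the original input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Sidak correction for the (m - rank) hypotheses still in play.
        candidate = 1.0 - (1.0 - p_values[idx]) ** (m - rank)
        # Step-down: adjusted values may never decrease along the ordering.
        running_max = max(running_max, candidate)
        adjusted[idx] = min(1.0, running_max)
    return adjusted


# Hypothetical unadjusted P values for three conditions.
adjusted = holm_sidak([0.01, 0.04, 0.03])
```

With three tests, the smallest P value is corrected as if for three comparisons, the next for two, and the largest for one, with the running maximum keeping the adjusted values ordered.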
Extended Data Fig. 1. Case-control PRS histograms.
Histograms of T2D PRS scores for case and control samples in the eMERGE I-III dataset.
Extended Data Fig. 2. Representation of the genetic ancestry admixture composition of both the Test and Training cohorts.
The x-axis represents individuals within the cohorts and the color-coding highlights the proportion of genetic admixture observed.
Extended Data Fig. 3. Calibrated z-scores.
The distributions of calibrated z-scores in the test cohort when the training cohort parameters are applied.
Extended Data Fig. 4. Hypercholesterolemia PRS calibrated z-scores of training cohort.
Note the improvement when an ancestry dependent variance is used over a constant variance method.
Extended Data Fig. 5. PRS z-score as a function of African Admixture Fraction, for individuals of African ancestry.
In the ‘Bucketing’ method, a z-score is calculated by comparing to the mean and variance of all individuals of African ancestry in the cohort. The ‘PCA Calibrated’ method is the method described above. Note the dependence on admixture fraction in the ‘Bucketing’ method, which has been removed in the ‘PCA Calibrated’ method.
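The contrast between the two methods can be reproduced on simulated data with a short Python sketch. It is a deliberate simplification: a single covariate (admixture fraction) stands in for the principal components, the mean is fitted by ordinary least squares, and the residual variance is held constant, whereas the study calibrates both mean and variance on genetic ancestry. The data, function names and parameter values are all hypothetical.

```python
import random
from statistics import mean, pstdev


def fit_line(x, y):
    """Ordinary least-squares intercept and slope of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


def bucket_z(scores):
    """'Bucketing': compare each score to the group-wide mean and SD."""
    m, s = mean(scores), pstdev(scores)
    return [(v - m) / s for v in scores]


def calibrated_z(frac, scores):
    """Simplified calibration: subtract a mean that depends on admixture fraction."""
    intercept, slope = fit_line(frac, scores)
    resid = [v - (intercept + slope * f) for f, v in zip(frac, scores)]
    s = pstdev(resid)  # constant variance, for brevity only
    return [r / s for r in resid]


# Simulated cohort in which the raw score mean drifts with admixture fraction.
rng = random.Random(0)
frac = [rng.random() for _ in range(2000)]
scores = [1.0 + 2.0 * f + rng.gauss(0, 0.5) for f in frac]

_, slope_bucket = fit_line(frac, bucket_z(scores))
_, slope_cal = fit_line(frac, calibrated_z(frac, scores))
```

In this simulation `slope_bucket` stays clearly positive, mirroring the dependence on admixture fraction noted for the ‘Bucketing’ method, while `slope_cal` is zero up to floating-point error because the fitted trend has been removed.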

Update of

  • Selection, optimization, and validation of ten chronic disease polygenic risk scores for clinical implementation in diverse populations.
    Lennon NJ, Kottyan LC, Kachulis C, Abul-Husn N, Arias J, Belbin G, Below JE, Berndt S, Chung W, Cimino JJ, Clayton EW, Connolly JJ, Crosslin D, Dikilitas O, Velez Edwards DR, Feng Q, Fisher M, Freimuth R, Ge T; GIANT Consortium; All of Us Research Program; Glessner JT, Gordon A, Guiducci C, Hakonarson H, Harden M, Harr M, Hirschhorn J, Hoggart C, Hsu L, Irvin R, Jarvik GP, Karlson EW, Khan A, Khera A, Kiryluk K, Kullo I, Larkin K, Limdi N, Linder JE, Loos R, Luo Y, Malolepsza E, Manolio T, Martin LJ, McCarthy L, Meigs JB, Mersha TB, Mosley J, Namjou B, Pai N, Pesce LL, Peters U, Peterson J, Prows CA, Puckelwartz MJ, Rehm H, Roden D, Rosenthal EA, Rowley R, Sawicki KT, Schaid D, Schmidlen T, Smit R, Smith J, Smoller JW, Thomas M, Tiwari H, Toledo D, Vaitinadin NS, Veenstra D, Walunas T, Wang Z, Wei WQ, Weng C, Wiesner G, Xianyong Y, Kenny E. medRxiv [Preprint]. 2023 Jun 5:2023.05.25.23290535. doi: 10.1101/2023.05.25.23290535. Update in: Nat Med. 2024 Feb;30(2):480-487. doi: 10.1038/s41591-024-02796-z. PMID: 37333246.
