2023 May 26;18(5):e0285991.
doi: 10.1371/journal.pone.0285991. eCollection 2023.

Assessing the potential of polygenic scores to strengthen medical risk prediction models of COVID-19

Aldo Córdova-Palomera et al. PLoS One.

Erratum in

Abstract

As findings on the epidemiological and genetic risk factors for coronavirus disease-19 (COVID-19) continue to accrue, their joint power and significance for prospective clinical applications remain virtually unexplored. Severity of symptoms in individuals affected by COVID-19 spans a broad spectrum, reflective of heterogeneous host susceptibilities across the population. Here, we assessed the utility of epidemiological risk factors to predict disease severity prospectively, and interrogated genetic information (polygenic scores) to evaluate whether it can provide further insights into symptom heterogeneity. A standard model combining principal component analysis and logistic regression was trained to predict severe COVID-19 from eight known medical risk factors for COVID-19 measured before 2018. In UK Biobank participants of European ancestry, the model achieved relatively high performance (area under the receiver operating characteristic curve ~90%). Polygenic scores for COVID-19 computed from summary statistics of the Covid19 Host Genetics Initiative displayed significant associations with COVID-19 in the UK Biobank (p-values as low as 3.96e-9, all with R² under 1%), but were unable to robustly improve the predictive performance of the non-genetic factors. However, error analysis of the non-genetic models suggested that affected individuals misclassified by the medical risk factors (predicted low risk but actual high risk) display a small but consistent increase in polygenic scores. Overall, the results indicate that simple models based on health-related epidemiological factors measured years before COVID-19 onset can achieve high predictive power. Associations between COVID-19 and genetic factors were statistically robust, but currently they have limited predictive power for translational settings. Despite that, the outcomes also suggest that severely affected cases with a medical history profile of low risk might be partly explained by polygenic factors, prompting development of boosted COVID-19 polygenic models based on new data and tools to aid risk prediction.
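
The two quantities the abstract relies on, a polygenic score and the area under the receiver operating characteristic curve (AUC), can be illustrated with a minimal sketch. This is not the authors' pipeline, which combined principal component analysis and logistic regression on eight medical risk factors. The function names and all toy numbers below are hypothetical and invented purely for illustration. The sketch assumes a polygenic score is a weighted sum of effect-allele dosages, and computes AUC as the probability that a randomly chosen case scores higher than a randomly chosen control.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of effect-allele dosages (typically 0, 1 or 2 per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of case-control pairs in which the case
    has the higher score; tied pairs count as half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: three variants, four individuals (not real effect sizes).
toy_weights = [0.12, -0.05, 0.30]
toy_genotypes = [[2, 0, 1], [0, 1, 0], [1, 2, 2], [0, 0, 0]]
toy_labels = [1, 0, 1, 0]  # 1 = case, 0 = control

toy_scores = [polygenic_score(g, toy_weights) for g in toy_genotypes]
toy_auc = roc_auc(toy_scores, toy_labels)
```

An AUC near 0.5 means the score separates cases from controls no better than chance, which is one way to read the abstract's point that a statistically significant association (a low p-value) does not by itself imply useful predictive power.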

Conflict of interest statement

ACP, CS, CD, EW, DD and SS are employees of Takeda Development Center Americas, Inc. CS, CD, DD and SS own stock/stock options in Takeda. DD is a shareholder of Merck. SS is a shareholder of J&J. This does not alter our adherence to PLOS ONE policies on sharing data and materials.

Figures

Fig 1. Receiver-operating characteristic curve and confusion matrices for the COVID-19 case-control classifiers using logistic regression.
Fig 2. Summary of R² and p-values for the 56 sets of PS analyses.
“Base phenotype” refers to the case-control definition (e.g., A1) and ethnicity (e.g., European or multi-ancestry) used to compute the external GWAS summary statistics (excluding the UK Biobank cohort), whereas “targetpheno” indicates the case-control definition used for the current analysis of European-ancestry individuals in the UK Biobank.
Fig 3. Linear regression results for the associations between 56 top polygenic scores and disease risk prediction based on medical comorbidities, among cases (not controls).
Fig 4. Correlation matrix of model predictions (case probabilities for A1, lenient A1, A2, B1, B2, C1 and C2; yellow side bars) and candidate polygenic scores (green side bars).
For this matrix, predicted case probabilities were included only for true cases, whereas all available polygenic scores were used.
