Clin Transl Med. 2024 Oct;14(10):e70042.

doi: 10.1002/ctm2.70042.

Prediction of COVID-19 severity using machine learning

Kanita Karaduzovic-Hadziabdic¹, Muhamed Adilovic¹, Lu Zhang², Andrew I Lumley³, Pranay Shah⁴, Muhammad Shoaib⁵, Venkata Satagopam⁵, Prashant Kumar Srivastava⁶, Costanza Emanueli⁶, Simona Greco⁷, Alisia Madè⁷, Teresa Padro⁸, Pedro Domingo⁸, Mitja Lustrek⁹, Markus Scholz¹⁰, Maciej Rosolowski¹⁰, Marko Jordan⁹, Bettina Benczik¹¹ ¹² ¹³, Bence Ágg¹¹ ¹² ¹³, Péter Ferdinandy¹¹ ¹² ¹³, Andrew H Baker¹⁴ ¹⁵, Guy Fagherazzi¹⁶, Markus Ollert¹⁷, Joanna Michel¹⁸, Gabriel Sanchez¹⁸, Hüseyin Firat¹⁸, Timo Brandenburger¹⁹, Fabio Martelli⁷, Lina Badimon⁸, Yvan Devaux³; COVIRNA consortium (www.covirna.eu)

Affiliations

¹ Faculty of Engineering and Natural Sciences, International University of Sarajevo, Sarajevo, Bosnia and Herzegovina.
² Bioinformatics Platform, Luxembourg Institute of Health, Strassen, Luxembourg.
³ Cardiovascular Research Unit, Department of Precision Health, Luxembourg Institute of Health, Strassen, Luxembourg.
⁴ Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, UK.
⁵ Luxembourg Center for Systems Biomedicine, University of Luxembourg, Belval, Luxembourg.
⁶ National Heart and Lung Institute, Imperial College London, London, UK.
⁷ Molecular Cardiology Laboratory, IRCCS Policlinico San Donato, Milan, Italy.
⁸ Cardiovascular Program-ICCC, Institut d'Investigació Biomèdica Sant Pau (IIB SANT PAU); CIBERCV; Autonomous University of Barcelona, Barcelona, Spain.
⁹ Department of Intelligent Systems, Jozef Stefan Institute, Ljubljana, Slovenia.
¹⁰ Group Genetical Statistics and Biomathematical Modelling, Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, Leipzig, Germany.
¹¹ Cardiometabolic and HUN-REN-SU System Pharmacology Research Group, Department of Pharmacology and Pharmacotherapy, Semmelweis University, Budapest, Hungary.
¹² Center for Pharmacology and Drug Research & Development, Semmelweis University, Budapest, Hungary.
¹³ Pharmahungary Group, Szeged, Hungary.
¹⁴ Centre for Cardiovascular Science, The Queen's Medical Research Institute, University of Edinburgh, Edinburgh, Scotland, UK.
¹⁵ Department of Pathology, CARIM, Maastricht University, Maastricht, The Netherlands.
¹⁶ Deep Digital Phenotyping Research Unit, Department of Precision Health, Luxembourg Institute of Health, Strassen, Luxembourg.
¹⁷ Department of Infection and Immunity, Luxembourg Institute of Health, Esch-Sur-Alzette, Luxembourg.
¹⁸ Firalis SA, Huningue, France.
¹⁹ Department of Anesthesiology, University Hospital Düsseldorf, Heinrich-Heine University Duesseldorf, Moorenstr, Germany.

PMID: 39370709
PMCID: PMC11456675
DOI: 10.1002/ctm2.70042

Kanita Karaduzovic-Hadziabdic et al. Clin Transl Med. 2024 Oct.

No abstract available

Conflict of interest statement

YD holds patents and licensing agreements related to the use of RNAs for diagnostic and therapeutic purposes (WO2018229046, licensed to Firalis SA, protecting the use of lncRNAs in the FIMICS panel used for RNAseq in the present paper; other patents and licenses are not related to the present work). YD is a Scientific Advisory Board member of Firalis SA.

PF is the founder and CEO of Pharmahungary Group, a group of R&D companies.

LB declares having served as a Scientific Advisory Board (SAB) member for Sanofi, Ionnis, MSD and NovoNordisk; having received speaker fees from Sanofi, Bayer and AB‐Biotics SA; and having founded the spin‐off Ivastatin Therapeutics S.L. (all unrelated to this work).

TP declares having received speaker fees from AB‐Biotics SA and being a co‐founder of the spin‐off Ivastatin Therapeutics S.L. (all unrelated to this work).

MS received funding from Pfizer Inc. and from Owkin for projects not related to this research.

HF is the founder and owner of Firalis SA, a company commercialising the FIMICS panel. He holds patents and licenses for the use of RNAs as biomarkers and therapeutic targets.

All other authors declare no competing interests.

Figures

**FIGURE 1**
Study workflow and data available for the analysis. (A) Study workflow. Blood samples were collected from 564 patients with COVID‐19 and stored at −80°C in a central NF S96‐900 certified Biobank at Firalis SA. RNA extraction, quality check, library preparation, and analysis by targeted sequencing using the FIMICS panel were then performed. RNA‐seq data were merged with patients’ clinical data and stored in a central database. Data were curated and made available for analysis using ML. (B) Baseline datasets available for analysis from four European cohorts: PrediCOVID from Luxembourg (n = 162), MiRCOVID from Germany (n = 69), COVID19_OMICS‐COVIRNA from Italy (n = 100), and TOCOVID from Spain (n = 233). Patient numbers for each cohort after data curation and preprocessing: PrediCOVID from Luxembourg (n = 133), MiRCOVID from Germany (n = 65), COVID19_OMICS‐COVIRNA from Italy (n = 75), and TOCOVID from Spain (n = 195). A total of 463 datasets were available for the analysis.

**FIGURE 2**
Machine learning workflow. Machine learning workflow using (A) a balanced dataset and (B) an imbalanced dataset.

**FIGURE 3**
Feature selection. (A) Six features were selected as best predictors of COVID‐19 severity in more than 90 out of 100 iterations: age, SEQ0548 (LINC01088‐201), SEQ0817 (FGD5‐AS1), SEQ1056 (LINC01088‐209), SEQ3051 (lncCOVIRNA1), and SEQ1321 (AKAP13‐SI). The line plot shows the top 10 selected features. X‐axis: feature names, where SEQXXXX is the code of a probe in the FIMICS panel. The SEQ0548 and SEQ1056 probes recognise two different isoforms of the same gene, LINC01088 (the former LINC01088‐201 and the latter LINC01088‐209), SEQ0817 recognises FGD5‐AS1, SEQ3051 recognises an unannotated lncRNA (i.e. lncCOVIRNA1), and SEQ1321 recognises AKAP13‐SI. Y‐axis: the number of times a feature appeared in the 100 iterations of the feature selection process. (B) GLMNet and SS methods used to cross‐validate the selected features. The probability of selection of each predictor is plotted against the value of its regression coefficient (β) for the leave‐one‐out cross‐validated GLMNet model. Each point represents a unique predictor. The X‐axis represents the values of the regression coefficients of the predictors, where nonzero values indicate selection by the GLMNet model. The Y‐axis represents the frequentist probability of predictor selection when running an SS model. The probabilities of the features selected by the Boruta method are as follows: age (.95), LINC01088‐201 (.93), lncCOVIRNA1 (.71), LINC01088‐209 (.47), AKAP13‐SI (.29) and FGD5‐AS1 (.01).
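
The selection count plotted in panel A can be illustrated with a small tally. This is a minimal sketch and not the authors' pipeline: it assumes each iteration of a feature selector yields a list of selected feature names, and the function name `stable_features`, the toy runs, and the feature `feature_x` are hypothetical.

```python
from collections import Counter


def stable_features(selections, min_count):
    """Return {feature: count} for features selected in more than
    `min_count` of the iterations in `selections`."""
    # Count each feature at most once per iteration.
    counts = Counter(f for selected in selections for f in set(selected))
    return {f: n for f, n in counts.most_common() if n > min_count}


# Toy data (not from the study): 100 iterations of a hypothetical selector.
# "age" and "SEQ0548" are picked every time, "feature_x" only every other time.
runs = [
    ["age", "SEQ0548"] + (["feature_x"] if i % 2 == 0 else [])
    for i in range(100)
]

# Keep features selected in more than 90 of the 100 iterations.
result = stable_features(runs, min_count=90)
print(result)
```

With a threshold of more than 90 out of 100 iterations, only features that the selector picks almost every time survive, which is the idea behind the Y‐axis count in the caption.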

**FIGURE 4**
Comparison of selected features between stable and critical patients. Box/violin plots for (A) age, and expression of (B) LINC01088‐201, (C) FGD5‐AS1, (D) LINC01088‐209, (E) lncCOVIRNA1, and (F) AKAP13‐SI, showing regulation in the critical group of the merged cohort (n = 101) compared with the group of stable patients (n = 362). p values are from Student's t test. Boxes are drawn from Q1 (25th percentile) to Q3 (75th percentile), with a horizontal line inside denoting the median. The whisker length indicates 1.5 times the IQR (interquartile range, Q3–Q1).
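
The box geometry described in this caption can be reproduced with a few lines of arithmetic. The following is a sketch under stated assumptions: the helper name `box_stats` and the toy values are hypothetical, and the "inclusive" quartile method is one common convention that may differ from the one used to draw the published plots.

```python
import statistics


def box_stats(values):
    """Compute quartiles, IQR, and 1.5 * IQR whisker limits for a sample."""
    # statistics.quantiles with n=4 returns [Q1, median, Q3].
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        # Whisker limits sit 1.5 * IQR below Q1 and above Q3.
        "lower_limit": q1 - 1.5 * iqr,
        "upper_limit": q3 + 1.5 * iqr,
    }


# Toy values (not from the study).
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(stats)
```

Points falling outside the two limits are the ones typically drawn as individual outliers beyond the whiskers.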
