Can J Infect Dis Med Microbiol. 2025 May 12;2025:6606842.
doi: 10.1155/cjid/6606842. eCollection 2025.

Machine Learning-Based Prediction of In-Hospital Mortality in Severe COVID-19 Patients Using Hematological Markers

Rongrong Dong et al. Can J Infect Dis Med Microbiol. 2025.

Abstract

Background: Mortality is very high among patients with severe COVID-19. Nearly 32% of COVID-19 patients are critically ill, with mortality rates ranging from 8.1% to 33%. Early detection of risk factors makes it easier to deliver appropriate care and to estimate prognosis. This study aimed to develop and validate a model that predicts mortality risk in patients with severe COVID-19 from hematological parameters measured at hospital admission.

Methods: Clinical data and laboratory test results were collected retrospectively from 396 and 112 patients with severe COVID-19 at two tertiary care hospitals, forming Cohort 1 and Cohort 2, respectively. Cohort 1 was used to train the model. The LASSO method was used to screen features. Models built with nine machine learning algorithms were compared to select the best algorithm and model. The model was visualized as a nomogram, followed by trend analyses and, finally, subgroup analyses. Cohort 2 was used for external validation.

Results: In Cohort 1, the model developed with the logistic regression (LR) algorithm performed best, with an AUC of 0.852 (95% CI: 0.750-0.953). Five features were included in the model: D-dimer, platelets, neutrophil count, lymphocyte count, and activated partial thromboplastin time. The model had high diagnostic accuracy in patients with severe COVID-19 older than 65 years (AUC = 0.814), although slightly lower than in patients aged 65 years or younger (AUC = 0.875). The model's ability to predict mortality was validated in Cohort 2 (AUC = 0.841).

Conclusions: In this study, a mortality risk prediction model for patients with severe COVID-19 was constructed with the LR algorithm using only hematological parameters. The model contributes to the timely and accurate stratification and management of patients with severe COVID-19.
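The Methods above combine LASSO feature screening, a logistic regression model, and evaluation by AUC. The following is a minimal sketch of that general workflow on synthetic data, not the authors' implementation: the data, the six features, the penalty strength `lam`, the learning rate, and all function names are illustrative assumptions. It fits an L1-penalized logistic regression by proximal gradient descent (one way to apply a LASSO-style penalty to a binary outcome) and scores held-out predictions with a rank-based AUC.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, t):
    # Proximal operator of the L1 penalty: shrink toward zero by t.
    if value > t:
        return value - t
    if value < -t:
        return value + t
    return 0.0


def fit_l1_logistic(X, y, lam, lr=0.5, n_iter=500):
    # Proximal gradient descent on the mean log-loss plus lam * ||w||_1.
    # The intercept b is not penalized.
    n, n_feat = len(X), len(X[0])
    w, b = [0.0] * n_feat, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * n_feat, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(n_feat):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [soft_threshold(wj - lr * gj / n, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def auc(scores, labels):
    # Probability that a random positive case outscores a random negative
    # case, counting ties as one half (Mann-Whitney formulation).
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Synthetic cohort (assumption): six standardized features, of which only
# the first two influence the outcome.
rng = random.Random(0)


def make_data(n):
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(6)]
        prob = sigmoid(1.5 * x[0] - 1.2 * x[1])
        X.append(x)
        y.append(1 if rng.random() < prob else 0)
    return X, y


X_train, y_train = make_data(200)
X_test, y_test = make_data(100)

weights, intercept = fit_l1_logistic(X_train, y_train, lam=0.05)
test_scores = [sigmoid(intercept + sum(w * x for w, x in zip(weights, xi)))
               for xi in X_test]
test_auc = auc(test_scores, y_test)

print("coefficients:", [round(w, 3) for w in weights])
print("held-out AUC:", round(test_auc, 3))
```

Features whose coefficients are shrunk exactly to zero are the ones a LASSO screen would drop. In practice, a library implementation with cross-validated penalty selection would usually replace the hand-written optimizer.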

Keywords: COVID-19; death; hematological parameters; machine learning; prediction model.


Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1
Flowchart of the machine learning workflow used to build the predictive model. XGBoost, extreme gradient boosting; LR, logistic regression; RF, random forest; LightGBM, light gradient boosting machine; AdaBoost, adaptive boosting; GNB, Gaussian naive Bayes; MLP, multilayer perceptron (neural network); SVM, support vector machine; KNN, k-nearest neighbors; AUC, area under the curve; ROC, receiver operating characteristic curve; DCA, decision curve analysis.
Figure 2
Clinical feature selection using LASSO regression. (a) Tuning parameter (λ) selection in the LASSO model using 10-fold cross-validation with the minimum criterion. Dotted vertical lines are drawn at the optimal values given by the minimum criterion and by 1 standard error of the minimum criterion (the 1-SE criterion). (b) LASSO coefficient profiles of the 29 clinical features, plotted against the log(λ) sequence.
Figure 3
(a) Relative importance of the features included in the mortality risk early warning model. (b) Nomogram for predicting the mortality risk of patients with severe COVID-19.
Figure 4
In the internal validation set, the model built by the LR algorithm was evaluated for its ability to predict the risk of death occurring in patients with severe COVID-19. (a) Calibration curve analysis. (b) Decision curve analysis.
Figure 5
Trends in the model's parameters. (a) Lymphocyte count. (b) Neutrophil count. (c) Platelets. (d) Activated partial thromboplastin time. (e) D-dimer.
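
The Figure 2 caption refers to two ways of choosing λ from cross-validation results: the minimum criterion and the 1-SE criterion. A minimal sketch of how those two values are commonly derived is given below. The λ grid, mean cross-validated errors, and standard errors are made-up numbers for illustration, not results from the study, and `select_lambdas` is a hypothetical helper name.

```python
def select_lambdas(lambdas, mean_err, std_err):
    # Minimum criterion: the lambda with the lowest mean CV error.
    i_min = min(range(len(lambdas)), key=lambda i: mean_err[i])
    lam_min = lambdas[i_min]
    # 1-SE criterion: the largest (most regularizing) lambda whose mean
    # CV error is within one standard error of the minimum error.
    threshold = mean_err[i_min] + std_err[i_min]
    lam_1se = max(l for l, e in zip(lambdas, mean_err) if e <= threshold)
    return lam_min, lam_1se


# Illustrative values only (assumption): CV error per candidate lambda.
lambdas = [0.001, 0.005, 0.01, 0.05, 0.1]
mean_err = [0.62, 0.58, 0.57, 0.59, 0.70]
std_err = [0.03, 0.03, 0.03, 0.03, 0.04]

lam_min, lam_1se = select_lambdas(lambdas, mean_err, std_err)
print("lambda (minimum criterion):", lam_min)
print("lambda (1-SE criterion):", lam_1se)
```

The 1-SE choice trades a small increase in estimated error for a sparser model, which is why it typically retains fewer features than the minimum criterion.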

