Multicenter Study
Eur Radiol. 2021 Oct;31(10):7925-7935.
doi: 10.1007/s00330-021-07957-z. Epub 2021 Apr 15.

Machine learning based on clinical characteristics and chest CT quantitative measurements for prediction of adverse clinical outcomes in hospitalized patients with COVID-19

Zhichao Feng et al. Eur Radiol. 2021 Oct.

Abstract

Objectives: To develop and validate a machine learning model for the prediction of adverse outcomes in hospitalized patients with COVID-19.

Methods: We included 424 patients with non-severe COVID-19 on admission from January 17, 2020, to February 17, 2020, in the primary cohort of this retrospective multicenter study. The extent of lung involvement was quantified on chest CT images by a deep learning-based framework. The composite endpoint was the occurrence of severe or critical COVID-19 or death during hospitalization. The optimal machine learning classifier and feature subset were selected for model construction. The performance was further tested in an external validation cohort consisting of 98 patients.

Results: The prevalence of adverse outcomes did not differ significantly between the primary and validation cohorts (8.7% vs. 8.2%, p = 0.858). Extreme gradient boosting (XGBoost) was selected as the classifier, with an optimal feature subset comprising lactic dehydrogenase (LDH), presence of comorbidity, CT lesion ratio (lesion%), and hypersensitive cardiac troponin I (hs-cTnI). The XGBoost classifier based on this subset performed well in predicting adverse outcomes in the primary and validation cohorts, with AUCs of 0.959 (95% confidence interval [CI]: 0.936-0.976) and 0.953 (95% CI: 0.891-0.986), respectively. Decision curve analysis also indicated that the XGBoost classifier was clinically useful.
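
The cohort comparison above can be checked with a short calculation. The sketch below assumes a Pearson chi-square test without continuity correction (the abstract does not name the test) and assumes event counts of 37 of 424 and 8 of 98, which are back-calculated from the reported 8.7% and 8.2% and are not stated in the abstract. Under those assumptions the p-value rounds to the reported 0.858.

```python
import math

def chi_square_2x2(events_a, total_a, events_b, total_b):
    """Pearson chi-square test (1 degree of freedom, no continuity
    correction) comparing two proportions. Returns (chi2, p_value)."""
    a, b = events_a, total_a - events_a   # cohort A: events, non-events
    c, d = events_b, total_b - events_b   # cohort B: events, non-events
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper tail is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# ASSUMED counts, back-calculated from the reported percentages.
chi2, p_value = chi_square_2x2(37, 424, 8, 98)
print(f"chi2 = {chi2:.4f}, p = {p_value:.3f}")
```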

Conclusions: We presented a machine learning model that could be effectively used as a predictor of adverse outcomes in hospitalized patients with COVID-19, opening up the possibility for patient stratification and treatment allocation.

Key points:
• Developing an individually prognostic model for COVID-19 has the potential to allow efficient allocation of medical resources.
• We proposed a deep learning-based framework for accurate lung involvement quantification on chest CT images.
• Machine learning based on clinical and CT variables can facilitate the prediction of adverse outcomes of COVID-19.

Keywords: Artificial intelligence; COVID-19; Prognosis; Tomography, X-ray computed.

Conflict of interest statement

The authors of this manuscript declare no relationships with any companies whose products or services may be related to the subject matter of the article.

Figures

Fig. 1
Study workflow. (I) Non-severe COVID-19 patients who underwent a chest CT scan on admission were included. (II) Lung and lesion segmentation were performed using a DL-based framework, and texture clustering was used to distinguish between GGO and CON. CT quantitative measurements including lesion%, GGO%, and CON% were calculated. (III) The optimal machine learning classifier and feature subset were selected and used for prediction model construction. (IV) The performance of the machine learning model was determined and validated in an external cohort. CON, consolidation; COVID-19, coronavirus disease 2019; CT, computed tomography; DL, deep learning; GGO, ground-glass opacification; LR, logistic regression; RF, random forest; SVM, support vector machine; XGBoost, extreme gradient boosting
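
As a rough illustration of step (II), the sketch below computes lesion%, GGO%, and CON% from voxel-wise binary masks. It assumes each measurement is the share of lung voxels carrying that label, and that GGO and CON are disjoint and together make up the lesion; the caption does not give the exact formulas, and the function name and mask layout are hypothetical.

```python
def ct_quantitative_measurements(lung_mask, ggo_mask, con_mask):
    """Return lesion%, GGO%, and CON% as percentages of lung volume.

    Each mask is a flat sequence of 0/1 values, one per voxel, in the
    same voxel order. ASSUMPTION: percentage = labelled voxels inside
    the lung / lung voxels * 100, with lesion = GGO + CON.
    """
    lung_voxels = sum(lung_mask)
    if lung_voxels == 0:
        raise ValueError("lung mask is empty")
    # Count only labelled voxels that fall inside the lung segmentation.
    ggo = sum(1 for g, l in zip(ggo_mask, lung_mask) if g and l)
    con = sum(1 for c, l in zip(con_mask, lung_mask) if c and l)
    return {
        "lesion%": 100.0 * (ggo + con) / lung_voxels,
        "GGO%": 100.0 * ggo / lung_voxels,
        "CON%": 100.0 * con / lung_voxels,
    }

# Toy example: 4 lung voxels, 1 GGO voxel, 1 CON voxel.
toy = ct_quantitative_measurements(
    lung_mask=[1, 1, 1, 1, 0],
    ggo_mask=[1, 0, 0, 0, 0],
    con_mask=[0, 1, 0, 0, 0],
)
print(toy)
```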
Fig. 2
DL-based lung and lesion segmentation and CT quantitative measurements. a The original CT images, lung segmentation, and lesion segmentation of 3 example cases. b The contours of 3 radiologists and the DL-based lesion segmentation (left) and the uncertain region (right). c ROC curve of the pixel-level performance of DL-based segmentation to identify the lesion. d Unsupervised multi-scale texture feature clustering to distinguish between GGO and CON based on grey-level attenuation and LBP features. e t-SNE plot showing the pixel-level GGO or CON distribution. CON, consolidation; CT, computed tomography; DL, deep learning; GGO, ground-glass opacification; LBP, local binary pattern; ROC, receiver operating characteristic; t-SNE, t-distributed stochastic neighbour embedding
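
The area under a ROC curve such as the one in panel c can be computed directly from scores and binary labels. The sketch below uses the standard rank-based (Mann-Whitney) formulation: the AUC is the fraction of positive-negative pairs in which the positive case receives the higher score, with ties counted as one half. The function name and toy values are illustrative only and are not taken from the study.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen positive scores
    higher than a randomly chosen negative (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: predicted lesion probabilities for four pixels.
print(roc_auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; a sort-based ranking would be preferable for pixel-level data.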
Fig. 3
Optimal machine learning classifier and feature subset selection. a The heatmap illustrating the correlations between features in the candidate feature set. b The performance of five machine learning classifiers, including LR, SVM-Linear, SVM-RBF, RF, and XGBoost, based on the candidate feature set in the primary cohort (left) and validation cohort (right). c The feature importance rank in the XGBoost classifier using fivefold cross-validation in the primary cohort. d The relationship between the feature subset size and model performance. The optimal size (red dot) was determined with the highest average AUC and a minimal number of features. The optimal feature subset contained the top 4 features, i.e. LDH, presence of comorbidity, lesion%, and hs-cTnI. AST, aspartate aminotransferase; AUC, area under the receiver operating characteristic curve; BUN, blood urea nitrogen; CRP, C-reactive protein; GGO, ground-glass opacification; hs-cTnI, hypersensitive cardiac troponin I; LDH, lactic dehydrogenase; LR, logistic regression; PaO2, partial pressure of oxygen; RF, random forest; SVM-Linear, support vector machine with a linear kernel; SVM-RBF, support vector machine with a radial basis function; XGBoost, extreme gradient boosting
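
The selection rule in panel d (highest average AUC with a minimal number of features) can be sketched as follows. The function name, the optional tolerance parameter, and the AUC values are hypothetical; only the rule itself comes from the caption.

```python
def optimal_subset_size(mean_auc_by_size, tolerance=0.0):
    """Pick the smallest feature-subset size whose cross-validated mean
    AUC is within `tolerance` of the best mean AUC.

    mean_auc_by_size maps subset size (top-k ranked features) to the
    average AUC across folds. `tolerance` is an ASSUMED convenience
    parameter; 0.0 reproduces "highest AUC, fewest features".
    """
    if not mean_auc_by_size:
        raise ValueError("no subset sizes supplied")
    best = max(mean_auc_by_size.values())
    return min(k for k, auc in mean_auc_by_size.items() if auc >= best - tolerance)

# HYPOTHETICAL mean AUCs, chosen only to illustrate the rule.
toy_curve = {1: 0.90, 2: 0.93, 3: 0.95, 4: 0.96, 5: 0.96, 6: 0.955}
print(optimal_subset_size(toy_curve))
```

With these toy values, sizes 4 and 5 tie for the best mean AUC, so the rule returns the smaller of the two.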
Fig. 4
Performance of the XGBoost classifiers based on the top four features or only three clinical features. a ROC curves of the XGBoost classifiers in the primary cohort (left) and validation cohort (right). b Comparison of decision curves of the XGBoost classifiers in the whole cohort. AUC, area under the receiver operating characteristic curve; ROC, receiver operating characteristic; XGBoost, extreme gradient boosting
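
Decision curves such as those in panel b plot net benefit against the threshold probability. A minimal sketch using the standard net benefit definition, NB = TP/n - (FP/n) * pt/(1 - pt), is shown below; the function names and toy data are illustrative, and the study's own implementation is not described in this record.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of treating patients whose predicted risk is at or
    above `threshold`: TP/n - FP/n * threshold / (1 - threshold)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)

def net_benefit_treat_all(labels, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    return net_benefit([1.0] * len(labels), labels, threshold)

# Toy example with four patients (values are illustrative only).
probs = [0.9, 0.6, 0.4, 0.1]
labels = [1, 0, 1, 0]
for pt in (0.2, 0.5):
    print(pt, net_benefit(probs, labels, pt), net_benefit_treat_all(labels, pt))
```

A model is usually considered useful at a given threshold when its net benefit exceeds both the treat-all curve and zero (the treat-none strategy).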