Biomed Eng Online. 2020 Aug 19;19(1):66.
doi: 10.1186/s12938-020-00809-9.

Differentiating novel coronavirus pneumonia from general pneumonia based on machine learning

Chenglong Liu et al. Biomed Eng Online. 2020.

Abstract

Background: Chest CT screening, as a supplementary means, is crucial in diagnosing novel coronavirus pneumonia (COVID-19) because of its high sensitivity and widespread use. Machine learning is adept at discovering intricate structures in CT images and has achieved expert-level performance in medical image analysis.

Methods: An integrated machine learning framework on chest CT images for differentiating COVID-19 from general pneumonia (GP) was developed and validated. Seventy-three confirmed COVID-19 cases were consecutively enrolled, together with 27 confirmed general pneumonia patients, from Ruian People's Hospital between January 2020 and March 2020. To classify COVID-19 accurately, region of interest (ROI) delineation was performed based on ground-glass opacities (GGOs) before feature extraction. Then, 34 statistical texture features were extracted from the COVID-19 and GP ROI images: 13 gray-level co-occurrence matrix (GLCM) features, 15 gray-level-gradient co-occurrence matrix (GLGCM) features, and 6 histogram features. Because high-dimensional features affect classification performance, the ReliefF algorithm was used to select features. The relevance of each feature was taken as the average of the weights calculated by ReliefF over n runs, and features with relevance larger than an empirically set threshold T were selected. After feature selection, the optimal feature set, along with 4 other feature combinations selected for comparison, was applied to the ensemble of bagged trees (EBT) and to four other machine learning classifiers, namely support vector machine (SVM), logistic regression (LR), decision tree (DT), and K-nearest neighbor with Minkowski distance and equal weights (KNN), using tenfold cross-validation.
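To make the texture-feature step more concrete, the following is a minimal sketch of how a gray-level co-occurrence matrix and a few statistics derived from it can be computed. The abstract does not say which 13 GLCM features were used, so the four shown here (contrast, energy, homogeneity, entropy) are common choices picked for illustration only. The function names `glcm` and `glcm_features`, the pixel offset, and the tiny 4 x 4 test image are all assumptions, not the authors' implementation.

```python
from collections import Counter
from math import log2


def glcm(image, dx=1, dy=0, symmetric=True):
    """Normalized gray-level co-occurrence matrix as a dict {(i, j): probability}.

    `image` is a list of rows of integer gray levels; (dx, dy) is the pixel
    offset between the two pixels of each counted pair (illustrative default:
    horizontal neighbors).
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[(a, b)] += 1
                if symmetric:
                    counts[(b, a)] += 1  # count each pair in both directions
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def glcm_features(p):
    """A few standard GLCM statistics (an illustrative subset, not the paper's 13)."""
    return {
        "contrast": sum(prob * (i - j) ** 2 for (i, j), prob in p.items()),
        "energy": sum(prob ** 2 for prob in p.values()),
        "homogeneity": sum(prob / (1 + (i - j) ** 2) for (i, j), prob in p.items()),
        "entropy": -sum(prob * log2(prob) for prob in p.values()),
    }


# Hypothetical 4-level ROI patch, used only to exercise the functions.
roi = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = glcm_features(glcm(roi))
for name, value in features.items():
    print(f"{name}: {value:.4f}")
```

For this patch the symmetric horizontal matrix has 24 counted pairs, giving a contrast of 14/24 and an energy of 84/576. In practice one such feature vector would be computed per ROI, usually over several offsets and after quantizing the CT intensities to a fixed number of gray levels.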

Results and conclusions: The classification accuracy (ACC), sensitivity (SEN), and specificity (SPE) of the proposed method were 94.16%, 88.62%, and 100.00%, respectively. The area under the receiver operating characteristic curve (AUC) was 0.99. The experimental results indicate that the EBT algorithm, using statistical texture features based on GGOs to differentiate COVID-19 from general pneumonia, achieved high transferability, efficiency, specificity, sensitivity, and accuracy. This can help inexperienced doctors diagnose COVID-19 more accurately, which is essential for controlling the spread of the disease.
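For readers less familiar with the reported metrics, the sketch below shows the standard definitions of ACC, SEN, and SPE in terms of confusion-matrix counts, with COVID-19 treated as the positive class. The function name and the counts in the example are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn: positive (here COVID-19) cases classified correctly/incorrectly.
    tn/fp: negative (here GP) cases classified correctly/incorrectly.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    acc = (tp + tn) / (tp + fn + tn + fp)
    sen = tp / (tp + fn)  # true positive rate
    spe = tn / (tn + fp)  # true negative rate
    return acc, sen, spe


# Hypothetical counts, chosen only to illustrate the arithmetic.
acc, sen, spe = classification_metrics(tp=45, fn=5, tn=30, fp=0)
print(f"ACC={acc:.4f} SEN={sen:.4f} SPE={spe:.4f}")
# prints "ACC=0.9375 SEN=0.9000 SPE=1.0000"
```

A specificity of 100% corresponds to zero false positives, that is, no GP case labeled as COVID-19.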

Keywords: Chest CT; General pneumonia; Machine learning; Novel coronavirus pneumonia.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Samples of COVID-19 and GP CT images. Panel a shows a CT image of COVID-19 with bilateral GGOs, while panel b shows a CT image of GP with a unilateral GGO. The red arrows point to the GGOs of COVID-19 and the blue arrow points to the GGO of GP.
Fig. 2
The weight curves of the 34 features based on the ReliefF algorithm. The X-axis represents the feature number. The Y-axis represents the weight of each feature in each run. The algorithm was run 1000 times, with each run represented by a curve of a different color. The dark straight line represents weight = 0.11, which is the proposed threshold T.
Fig. 3
Accuracy comparison of five classifiers with different feature combinations
Fig. 4
Sensitivity comparison of five classifiers with different feature combinations
Fig. 5
Specificity comparison of five classifiers with different feature combinations
Fig. 6
Comparison of receiver operating characteristic curves for the proposed classifier, KNN, SVM, LR, and DT using feature combination 5. The receiver operating characteristic curve for the proposed EBT model had an AUC that was significantly greater than those of the four other models.
Fig. 7
The flowchart of the proposed diagnosis framework
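
The feature-selection step summarized in the Methods and in the Fig. 2 caption (ReliefF weights averaged over repeated runs, then compared with a threshold T) can be sketched roughly as follows. This is a simplified two-class ReliefF written from the general description of the algorithm, not the authors' code. The function names, the neighbor count `k`, the number of sampled instances per run, and the toy dataset are assumptions; only the threshold 0.11 and the idea of averaging over runs come from the source, which used 1000 runs.

```python
import random


def relieff_weights(X, y, n_samples, k, rng):
    """One ReliefF run for a two-class problem; returns one weight per feature.

    Assumes every class has at least two samples so that hits and misses exist.
    """
    n, d = len(X), len(X[0])
    lo = [min(row[f] for row in X) for f in range(d)]
    hi = [max(row[f] for row in X) for f in range(d)]
    span = [(hi[f] - lo[f]) or 1.0 for f in range(d)]  # avoid division by zero

    def diff(f, a, b):
        # Range-normalized difference of feature f between samples a and b.
        return abs(X[a][f] - X[b][f]) / span[f]

    def dist(a, b):
        return sum(diff(f, a, b) for f in range(d))

    w = [0.0] * d
    for _ in range(n_samples):
        i = rng.randrange(n)  # randomly sampled reference instance
        others = sorted((j for j in range(n) if j != i), key=lambda j: dist(i, j))
        hits = [j for j in others if y[j] == y[i]][:k]    # nearest same-class
        misses = [j for j in others if y[j] != y[i]][:k]  # nearest other-class
        for f in range(d):
            hit_term = sum(diff(f, i, j) for j in hits) / len(hits)
            miss_term = sum(diff(f, i, j) for j in misses) / len(misses)
            w[f] += (miss_term - hit_term) / n_samples
    return w


def select_features(X, y, n_runs, threshold, n_samples=20, k=3, seed=0):
    """Average ReliefF weights over n_runs and keep features above threshold."""
    rng = random.Random(seed)
    d = len(X[0])
    mean_w = [0.0] * d
    for _ in range(n_runs):
        run_w = relieff_weights(X, y, n_samples, k, rng)
        for f in range(d):
            mean_w[f] += run_w[f] / n_runs
    selected = [f for f in range(d) if mean_w[f] > threshold]
    return mean_w, selected


# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
X, y = [], []
for label in (0, 1):
    for i in range(10):
        X.append([label + 0.02 * i, ((3 * i) % 10) / 10])
        y.append(label)

mean_w, selected = select_features(X, y, n_runs=5, threshold=0.11)
print("mean weights:", [round(w, 3) for w in mean_w])
print("selected feature indices:", selected)
```

Averaging over runs smooths out the randomness introduced by sampling reference instances, which is presumably why the weights in Fig. 2 are shown as many overlapping curves rather than a single one.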
