Differentiating novel coronavirus pneumonia from general pneumonia based on machine learning
- PMID: 32814568
- PMCID: PMC7436068
- DOI: 10.1186/s12938-020-00809-9
Abstract
Background: Chest CT screening, a sensitive and widely used supplementary tool, is crucial in diagnosing novel coronavirus pneumonia (COVID-19). Machine learning is adept at discovering intricate structures in CT images and has achieved expert-level performance in medical image analysis.
Methods: An integrated machine learning framework for differentiating COVID-19 from general pneumonia (GP) on chest CT images was developed and validated. Seventy-three confirmed COVID-19 cases were consecutively enrolled, together with 27 confirmed general pneumonia patients, at Ruian People's Hospital from January 2020 to March 2020. To classify COVID-19 accurately, regions of interest (ROIs) were delineated based on ground-glass opacities (GGOs) before feature extraction. Then, 34 statistical texture features were extracted from the COVID-19 and GP ROI images: 13 gray-level co-occurrence matrix (GLCM) features, 15 gray-level-gradient co-occurrence matrix (GLGCM) features and 6 histogram features. Because high-dimensional feature sets can degrade classification performance, the ReliefF algorithm was used to select features. The relevance of each feature was the average of the weights calculated by ReliefF over n runs, and features with relevance larger than an empirically set threshold T were selected. After feature selection, the optimal feature set and four other feature combinations selected for comparison were applied to the ensemble of bagged trees (EBT) and to four other machine learning classifiers, namely support vector machine (SVM), logistic regression (LR), decision tree (DT), and K-nearest neighbor with Minkowski distance and equal weights (KNN), using tenfold cross-validation.
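The feature-selection step described above (ReliefF weights averaged over n runs, then thresholded at T) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `relieff_weights` and `select_features`, the Manhattan distance on min-max scaled features, the neighbor count, the subsample size and the threshold value are all hypothetical choices, and the data are synthetic rather than CT texture features.

```python
import numpy as np

def relieff_weights(X, y, n_neighbors=5, n_samples=None, rng=None):
    """One ReliefF pass: return a weight per feature (higher = more relevant)."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    classes, y_idx, counts = np.unique(y, return_inverse=True, return_counts=True)
    n, d = X.shape
    prior = counts / n
    # Min-max scale each feature so per-feature differences lie in [0, 1].
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0
    Xs = (X - lo) / span
    m = n if n_samples is None else min(n_samples, n)
    w = np.zeros(d)
    for i in rng.choice(n, size=m, replace=False):
        diff = np.abs(Xs - Xs[i])      # per-feature difference to every sample
        dist = diff.sum(axis=1)        # Manhattan distance (an assumption)
        for c in range(len(classes)):
            # Candidates of class c, excluding the sampled instance itself.
            idx = np.flatnonzero((y_idx == c) & (np.arange(n) != i))
            if idx.size == 0:
                continue
            nearest = idx[np.argsort(dist[idx])[:n_neighbors]]
            mean_diff = diff[nearest].mean(axis=0)
            if c == y_idx[i]:
                w -= mean_diff / m     # nearest hits: penalize differences
            else:
                # Nearest misses, weighted by the prior of the other class.
                w += prior[c] / (1.0 - prior[y_idx[i]]) * mean_diff / m
    return w

def select_features(X, y, n_runs=10, threshold=0.0, seed=0, **relieff_kwargs):
    """Average ReliefF weights over n_runs and keep features above threshold."""
    rng = np.random.default_rng(seed)
    runs = [relieff_weights(X, y, rng=rng, **relieff_kwargs) for _ in range(n_runs)]
    relevance = np.mean(runs, axis=0)
    return np.flatnonzero(relevance > threshold), relevance

# Synthetic stand-in data: 73 + 27 samples, only feature 0 carries class signal.
demo_rng = np.random.default_rng(42)
y_demo = np.array([1] * 73 + [0] * 27)
X_demo = demo_rng.normal(size=(100, 3))
X_demo[:, 0] += 6.0 * y_demo

# threshold=0.05 and n_samples=50 are hypothetical values for illustration.
selected, relevance = select_features(
    X_demo, y_demo, n_runs=10, threshold=0.05, n_samples=50
)
print("relevance per feature:", relevance)
print("selected feature indices:", selected)
```

Each run draws a different random subsample of instances, which is what makes averaging over several runs meaningful; with the full sample used every time, the runs would be identical.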
Results and conclusions: The classification accuracy (ACC), sensitivity (SEN), and specificity (SPE) of the proposed method were 94.16%, 88.62% and 100.00%, respectively. The area under the receiver operating characteristic curve (AUC) was 0.99. The experimental results indicate that the EBT algorithm with statistical textural features based on GGOs achieved high transferability, efficiency, specificity, sensitivity, and accuracy in differentiating COVID-19 from general pneumonia, which can help inexperienced doctors diagnose COVID-19 more accurately and is essential for controlling the spread of the disease.
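For readers unfamiliar with the reported metrics, the following sketch shows how ACC, SEN and SPE are conventionally computed from binary predictions. The helper name `binary_metrics` and the toy labels are assumptions for illustration only; they do not reproduce the study's data or its reported values.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity and specificity for binary labels.

    Assumes both classes occur in y_true, so no denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "ACC": (tp + tn) / len(pairs),  # fraction of all cases classified correctly
        "SEN": tp / (tp + fn),          # fraction of positive cases detected
        "SPE": tn / (tn + fp),          # fraction of negative cases correctly rejected
    }

# Toy labels (hypothetical): 1 = COVID-19, 0 = general pneumonia.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0]
print(binary_metrics(y_true, y_pred))
```

In this toy case one positive is missed and no negative is misclassified, so sensitivity falls below 1 while specificity stays at 1, the same qualitative pattern as in the reported results.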
Keywords: Chest CT; General pneumonia; Machine learning; Novel coronavirus pneumonia.
Conflict of interest statement
The authors declare that they have no competing interests.
