Topological data analysis and machine learning for COVID-19 detection in CT scan lung images
- PMID: 40170109
- PMCID: PMC11963280
- DOI: 10.1186/s42490-025-00089-1
Abstract
COVID-19 has claimed the lives of thousands over the past years. Although pathogenic laboratory testing is the established standard, it has a significant drawback: a notable rate of false negatives. Alternative diagnostic approaches are therefore urgently needed. In response to this need for accurate and parameter-free methods for COVID-19 identification, particularly in lung images, we introduce a novel approach that combines topological data analysis with machine learning. Our methodology extracts persistent homology features from lung images, capturing the intrinsic topological properties of the data. These features then serve as inputs to various machine learning methods for classification. Our primary objective is to achieve high accuracy in COVID-19 detection while demonstrating the effectiveness of these topological features. The experimental results show that the Random Forest Classifier and the Support Vector Machine models outperform the rest in classifying CT scan lung images, with an accuracy of 97.5% for the Random Forest model and an AUC score above 0.99 for the SVM. Results of the model on the same data after excluding the topological features, and of the same model with topological features applied to other data, showed the effectiveness of these features in the classification task.
Keywords: COVID-19 detection; Lung images; Machine learning; Topological data analysis.
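To make the feature-extraction step in the abstract more concrete, here is a minimal sketch of computing persistent homology features from a grayscale image. The abstract does not state which filtration, homology dimensions, or software the authors used, so everything here is an assumption chosen for illustration: a sublevel-set filtration on pixel intensities, 4-connectivity, dimension 0 only (connected components), and three simple summary statistics. The function names `h0_sublevel_persistence` and `persistence_features` are hypothetical.

```python
from typing import List, Sequence, Tuple


def h0_sublevel_persistence(image: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Finite 0-dimensional persistence pairs of a 2D image.

    Assumption: sublevel-set filtration with 4-connectivity. Pixels enter
    in order of increasing intensity; when two components meet, the one
    born later dies (elder rule). The component of the global minimum
    never dies and is not reported.
    """
    rows, cols = len(image), len(image[0])
    n = rows * cols
    parent = [-1] * n    # -1 marks a pixel not yet in the filtration
    birth = [0.0] * n    # birth value stored at each component root

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    order = sorted(range(n), key=lambda i: image[i // cols][i % cols])
    pairs: List[Tuple[float, float]] = []
    for idx in order:
        r, c = divmod(idx, cols)
        value = image[r][c]
        parent[idx] = idx
        birth[idx] = value
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            nidx = nr * cols + nc
            if parent[nidx] == -1:
                continue
            a, b = find(idx), find(nidx)
            if a == b:
                continue
            if birth[a] < birth[b]:
                a, b = b, a  # make `a` the younger component
            if value > birth[a]:
                pairs.append((birth[a], value))  # skip zero-lifetime pairs
            parent[a] = b
    return pairs


def persistence_features(pairs: Sequence[Tuple[float, float]]) -> List[float]:
    """Summary vector: number of pairs, total persistence, max persistence."""
    lifetimes = [death - birth for birth, death in pairs]
    return [float(len(lifetimes)), float(sum(lifetimes)), float(max(lifetimes, default=0.0))]


# Toy image: four dark corners separated by a bright cross.
toy = [
    [0, 5, 1],
    [5, 5, 5],
    [2, 5, 3],
]
toy_pairs = h0_sublevel_persistence(toy)
toy_features = persistence_features(toy_pairs)
```

In a pipeline like the one the abstract describes, one such feature vector per image would be stacked into a matrix and passed to a classifier such as a Random Forest or an SVM. A real analysis would likely also include higher-dimensional homology (loops) and a richer vectorization than these three statistics.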
© 2025. The Author(s).
Conflict of interest statement
Declarations. Ethics approval and consent to participate: Ethics approval is deemed unnecessary according to national regulations; no relevant legislation exists in the country of origin of the data. The data was anonymized prior to use, and all reasonable steps were taken to ensure confidentiality and adherence to ethical standards for research involving medical data. Accordance: As clinical/pathological data is analyzed in this study, we confirm that all methods were performed in accordance with the Declaration of Helsinki. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.