PLoS One. 2025 Mar 19;20(3):e0318219.
doi: 10.1371/journal.pone.0318219. eCollection 2025.

Ensemble-based multiclass lung cancer classification using hybrid CNN-SVD feature extraction and selection method

Md Sabbir Hossain et al. PLoS One. 2025.

Abstract

Lung cancer (LC) is a leading cause of cancer-related deaths worldwide, which makes early detection essential for improving patient outcomes. The main objective of this research is to apply artificial intelligence to identify and classify lung cancers more precisely from CT scan images at an early stage. This study introduces a novel lung cancer detection method built on Convolutional Neural Networks (CNN) and customized for binary and multiclass classification, using a publicly available dataset of chest CT scan images of lung cancer. The main contribution is a hybrid CNN-SVD (Singular Value Decomposition) method combined with a robust voting ensemble, which improves accuracy and mitigates potential errors. Contrast-limited adaptive histogram equalization (CLAHE) was applied to generate contrast-enhanced images with minimal noise and prominent distinctive features. Subsequently, a CNN-SVD-Ensemble model was implemented to extract important features and reduce dimensionality. The extracted features were then processed by a set of ML algorithms along with a voting ensemble approach. Additionally, Gradient-weighted Class Activation Mapping (Grad-CAM) was integrated as an explainable AI (XAI) technique to enhance model transparency by highlighting the regions of the CT scans that most influenced each prediction, improving interpretability and supporting reliable, trustworthy results for clinical applications. For multiclass classification, the method achieved an accuracy, AUC, precision, recall, F1 score, Cohen's Kappa and Matthews Correlation Coefficient (MCC) of 99.49%, 99.73%, 100%, 99%, 99%, 99.15% and 99.16%, respectively; the authors present these as state-of-the-art results that address prior research gaps and set a new benchmark in the field. In binary classification, all performance indicators reached 100%. The robustness of the proposed approach offers more reliable insights for the medical field, improving existing knowledge and setting the stage for future innovations.
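
The abstract names a voting ensemble and reports Cohen's Kappa and MCC without describing how they are computed. The sketch below is one plausible reading, not the authors' implementation: it assumes hard (majority) voting, since the abstract does not say whether hard or soft voting was used, and it uses the standard confusion-matrix definitions of the two metrics. The class labels "A", "B", "C", the three classifier outputs, and all function names are hypothetical toy values chosen for illustration.

```python
from collections import Counter
from math import sqrt


def majority_vote(predictions):
    """Hard voting: per sample, return the label predicted by most classifiers.

    `predictions` is a list of equal-length label lists, one per classifier.
    Ties go to the label seen first among that sample's votes.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def cohen_kappa(cm):
    """Kappa = (p_o - p_e) / (1 - p_e): agreement corrected for chance."""
    n = sum(map(sum, cm))
    true_counts = [sum(row) for row in cm]
    pred_counts = [sum(col) for col in zip(*cm)]
    p_o = sum(cm[i][i] for i in range(len(cm))) / n
    p_e = sum(t * p for t, p in zip(true_counts, pred_counts)) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def multiclass_mcc(cm):
    """Multiclass Matthews correlation coefficient from a confusion matrix."""
    n = sum(map(sum, cm))
    true_counts = [sum(row) for row in cm]
    pred_counts = [sum(col) for col in zip(*cm)]
    correct = sum(cm[i][i] for i in range(len(cm)))
    numerator = correct * n - sum(t * p for t, p in zip(true_counts, pred_counts))
    denominator = sqrt(n * n - sum(p * p for p in pred_counts)) * sqrt(
        n * n - sum(t * t for t in true_counts)
    )
    return numerator / denominator if denominator else 0.0


# Hypothetical toy data: three classifiers, six samples, three classes.
labels = ["A", "B", "C"]
y_true = ["A", "A", "B", "B", "C", "C"]
clf_outputs = [
    ["A", "A", "B", "C", "C", "C"],
    ["A", "B", "B", "B", "C", "A"],
    ["A", "A", "B", "C", "C", "C"],
]

voted = majority_vote(clf_outputs)
cm = confusion_matrix(y_true, voted, labels)
kappa = cohen_kappa(cm)
mcc = multiclass_mcc(cm)
print(voted, round(kappa, 4), round(mcc, 4))
```

In this toy case the ensemble corrects the second classifier's two mistakes but keeps the one error shared by the other two, so both metrics fall below 1. In practice, scikit-learn's `cohen_kappa_score` and `matthews_corrcoef` compute the same quantities directly from label arrays.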

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Structure of the proposed hybrid CNN-SVD-based ensemble model for lung cancer classification.
Fig 2. Lung cancer CT images before and after pre-processing.
Fig 3. CNN structure for feature extraction.
Fig 4. (A) Accuracy and (B) loss of training and validation sets of multiclass classification using CNN.
Fig 5. Graphical comparison of multiclass lung cancer classification with different approaches.
Fig 6. Normalized confusion matrix of the proposed method for multiclass classification: (A) GNB, (B) GBM, (C) KNN, (D) SVM, (E) RF, and (F) Ensemble learning.
Fig 7. ROC curves of the proposed method for multiclass classification: (A) GNB, (B) GBM, (C) KNN, (D) SVM, (E) RF, and (F) Ensemble learning.
Fig 8. Performance comparison between the proposed method and various TL models with SVM.
Fig 9. (A) Accuracy and (B) loss curves of training and validation during binary class classification for 512 features extracted by the CNN.
Fig 10. Graphical comparison of binary class lung cancer classification with different approaches.
Fig 11. Normalized confusion matrix of the proposed method for multiclass classification: (A) GNB, (B) GBM, (C) KNN, (D) SVM, (E) RF, and (F) Ensemble learning.
Fig 12. ROC curves of the proposed method for binary class classification: (A) GNB, (B) GBM, (C) KNN, (D) SVM, (E) RF, and (F) Ensemble learning.
Fig 13. Visualization of color maps using Grad-CAM for different lung cancer classes.
