Cancers (Basel). 2024 Dec 18;16(24):4225.
doi: 10.3390/cancers16244225.

Integrative Stacking Machine Learning Model for Small Cell Lung Cancer Prediction Using Metabolomics Profiling

Md Shaheenur Islam Sumon et al. Cancers (Basel). 2024.

Abstract

Background: Small cell lung cancer (SCLC) is an extremely aggressive form of lung cancer, characterized by rapid progression and poor survival rates. Early diagnosis is important, yet current diagnostic techniques are invasive and limited.

Methods: This study presents a novel stacking-based ensemble machine learning approach for classifying SCLC and non-small cell lung cancer (NSCLC) using metabolomics data. The analysis included 191 SCLC cases, 173 NSCLC cases, and 97 healthy controls. Feature selection techniques identified significant metabolites, with positive ions proving more relevant.

Results: For multi-class classification (control, SCLC, NSCLC), the stacking ensemble achieved 85.03% accuracy and an AUC of 92.47 using a Support Vector Machine (SVM). Binary classification (SCLC vs. NSCLC) performed better, with ExtraTreesClassifier reaching 88.19% accuracy and an AUC of 92.65. SHapley Additive exPlanations (SHAP) analysis identified metabolites such as benzoic acid, DL-lactate, and L-arginine as significant predictors.

Conclusions: The stacking ensemble combines the complementary strengths of multiple classifiers to improve overall predictive performance and the detection of SCLC and NSCLC. This work highlights the potential of combining metabolomics with advanced machine learning for non-invasive early detection of lung cancer subtypes, offering an alternative to conventional biopsy methods.

Keywords: NSCLC; SCLC; machine learning; serum metabolomics; stacking ensemble model.
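
The stacking idea summarized in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data in place of the metabolomics profiles. The base learners (Extra Trees, random forest, logistic regression), the hyperparameters, and the train/test split are illustrative assumptions; only the SVM meta-learner and the three-class setup with 461 samples are taken from the abstract. It is not the authors' pipeline and will not reproduce the reported figures.

```python
# Minimal stacking-ensemble sketch on SYNTHETIC data (not the study's data).
# Base learners and all hyperparameters below are assumptions for illustration.
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: 461 samples (97 controls, 191 SCLC, 173 NSCLC in the
# abstract), 30 hypothetical "metabolite" features, 3 classes.
X, y = make_classification(
    n_samples=461,
    n_features=30,
    n_informative=10,
    n_classes=3,
    n_clusters_per_class=1,
    weights=[97 / 461, 191 / 461, 173 / 461],
    class_sep=1.5,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Level-0 (base) learners: each produces out-of-fold class probabilities.
base_learners = [
    ("extra_trees", ExtraTreesClassifier(n_estimators=100, random_state=0)),
    ("random_forest", RandomForestClassifier(n_estimators=100, random_state=0)),
    (
        "log_reg",
        make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    ),
]

# Level-1 (meta) learner: an SVM trained on the base learners' probabilities.
# probability=True is needed so the stack can return class probabilities.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=SVC(probability=True, random_state=0),
    stack_method="predict_proba",
    cv=5,
)
stack.fit(X_train, y_train)

pred = stack.predict(X_test)
proba = stack.predict_proba(X_test)

accuracy = accuracy_score(y_test, pred)
# One-vs-rest macro AUC for the three-class problem.
auc = roc_auc_score(y_test, proba, multi_class="ovr")

print(f"accuracy = {accuracy:.3f}, one-vs-rest AUC = {auc:.3f}")
```

The `cv=5` setting means the meta-learner is fitted on out-of-fold predictions from the base learners, which keeps it from simply memorizing base-learner outputs on data they were trained on. For the binary SCLC vs. NSCLC task, the same structure applies with two classes and a different final estimator.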

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1. Proposed Stacking Ensemble Model.
Figure 2. Overview of the methodology employed in this study.
Figure 3. Features ranked using the XGBoost feature selection algorithm for multi-class classification.
Figure 4. Features ranked using the Extra Trees feature selection algorithm for binary classification.
Figure 5. Accuracy of top features for multiclass classification.
Figure 6. Top Feature Accuracy for Binary Classifications.
Figure 7. AUC-ROC curve for the stacking-based SVM classifier in multi-class classification.
Figure 8. AUC-ROC curve for the stacking-based ExtraTrees classifier in binary classification.
Figure 9. SHAP summary plot for the multi-class classification model.
Figure 10. SHAP summary plot for the binary classification model.
Figure 11. Local explanations of a representative sample are shown in two forms: (A) a force plot illustrating an SCLC prediction and (B) a waterfall plot displaying the same prediction.
