EClinicalMedicine. 2025 Aug 1;86:103386.
doi: 10.1016/j.eclinm.2025.103386. eCollection 2025 Aug.

Development and validation of an explainable machine learning model for predicting postoperative pulmonary complications after lung cancer surgery: a machine learning study

Shaolin Chen et al. EClinicalMedicine. 2025.

Abstract

Background: Early identification and prediction of postoperative pulmonary complications (PPCs) are vital for patient management in lung cancer (LC) surgery. However, existing predictive models often lack comprehensive validation and interpretability. This study aimed to develop and validate an explainable machine learning (ML) model to predict PPCs in patients with LC undergoing surgery.

Methods: A risk factor variable pool was determined by meta-analysis and Delphi surveys. Patients undergoing LC surgery who were admitted to the Thoracic Surgery Department at the Affiliated Hospital of Zunyi Medical University from 1st January 2022 to 31st October 2023 (retrospective) and from 1st November 2023 to 31st July 2024 (prospective) were used for model development and prospective validation, respectively. The retrospective cohort was randomly split into a training set and an internal validation set at an 8:2 ratio. Feature selection involved univariate analysis, collinearity analysis, nine ML algorithms, and expert consensus. Twelve independent ML models and 26 stacking ensemble models were developed. Predictive performance was evaluated using the area under the receiver-operating-characteristic curve (AUROC), accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1 score. Prospective validation was analysed using the AUROC, the Hosmer-Lemeshow test, calibration curves, and decision curve analysis (DCA). The Shapley Additive Explanation (SHAP) method was utilised to interpret the predictive model.
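The evaluation metrics named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the labels and scores below are invented toy values, the function names are hypothetical, and the AUROC is computed by the pairwise (Mann-Whitney) definition rather than by any particular library. The net-benefit function uses the standard DCA formula, net benefit = TP/n - (FP/n) x pt/(1 - pt), where pt is the threshold probability.

```python
from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) for binary labels where 1 = PPC and 0 = no PPC."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def threshold_metrics(y_true: Sequence[int], y_score: Sequence[float],
                      threshold: float = 0.5) -> dict[str, float]:
    """Threshold-dependent metrics; assumes both classes occur and both are predicted."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": 2 * ppv * sensitivity / (ppv + sensitivity),
    }


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], y_score: Sequence[float], pt: float) -> float:
    """Decision-curve net benefit at threshold probability pt (0 < pt < 1)."""
    y_pred = [1 if s >= pt else 0 for s in y_score]
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    n = len(y_true)
    return tp / n - (fp / n) * (pt / (1 - pt))


# Invented toy data, for illustration only (not from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

metrics = threshold_metrics(y_true, y_score, threshold=0.5)
area = auroc(y_true, y_score)
benefit = net_benefit(y_true, y_score, pt=0.5)
print(metrics, area, benefit)
```

The AUROC does not depend on a classification threshold, whereas the other metrics and the net benefit do, which is why a decision curve plots net benefit across a range of threshold probabilities.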

Findings: A total of 883 patients were included in the retrospective cohort, with a PPC incidence of 35.4% (313/883), and a total of 308 patients were included in the prospective cohort, with a PPC incidence of 29.5% (91/308). Nine key characteristics were selected for model development: age, duration of surgery, Charlson comorbidity index (CCI), tumour stage, measured diffusing capacity of the lung for carbon monoxide (DLCO, mmol/min/kPa), intra-operative infusion volume (IFIV, mL), red blood cell volume distribution width-coefficient of variation (RDW-CV, %), body mass index (BMI), and number of years of smoking. Amongst the independent models, the Gradient Boosting Decision Tree (GBDT) showed the best performance, achieving an AUROC of 0.829 (95% CI: 0.774-0.885). The stacking ensemble combining Support Vector Machine (SVM) and Decision Tree (DT) showed the highest overall performance, with an AUROC of 0.860 (95% CI: 0.809-0.911), and DCA showed higher clinical utility compared to other models. In the prospective validation, the AUROC was 0.790 (95% CI: 0.744-0.835).
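The reported incidences follow directly from the event counts given in the Findings. A short Python check, using only the counts stated above (the helper name is hypothetical):

```python
def incidence_percent(events: int, total: int) -> float:
    """Incidence as a percentage, rounded to one decimal place."""
    return round(100 * events / total, 1)


retrospective = incidence_percent(313, 883)  # PPC cases / patients, retrospective cohort
prospective = incidence_percent(91, 308)     # PPC cases / patients, prospective cohort
print(retrospective, prospective)  # 35.4 29.5
```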

Interpretation: The stacking ensemble model combining SVM and DT demonstrated robust predictive performance and favourable clinical utility for predicting PPCs in patients undergoing LC surgery. However, the model has not been applied in clinical practice and requires future validation in large, multi-centre cohorts. Further work should aim to identify high-risk patients early through clinical data analysis, enabling timely interventions and more efficient allocation of limited healthcare resources.

Funding: The Science and Technology Foundation of Guizhou Provincial Health Commission; the Key Talent Team of Guizhou Provincial Science and Technology Innovation; and Guizhou Science and Technology Cooperation Basic Research Project.

Keywords: Lung cancer; Machine learning; Postoperative pulmonary complications; Stacking ensemble model.

Conflict of interest statement

All authors declare no competing interests.

Figures

Fig. 1
Research procedure flowchart.
Fig. 2
Receiver operating characteristic curves for 12 machine learning models. (A) Boruta; (B) BPNN, Back-Propagation Neural Network; (C) DT, Decision Tree; (D) GBDT, Gradient Boosting Decision Tree; (E) GNB, Gaussian Naïve Bayesian; (F) KNN, k-Nearest Neighbor; (G) LightGBM, Light Gradient Boosting Machine; (H) Logit, Logistic Regression; (I) PLSDA, Partial Least Squares Discriminant Analysis; (J) RF, Random Forest; (K) SVM, Support Vector Machine; (L) XGBoost, Extreme Gradient Boosting; (M) comparison of the performance of the 12 ML models.
Fig. 3
The comprehensive rank of five independent base ML models and various stacking ensemble combinations. (A) For training set; (B) for internal validation set; Logit, Logistic Regression; RF, Random Forest; DT, Decision Tree; GBDT, Gradient Boosting Decision Tree; SVM, Support Vector Machine.
Fig. 4
Prospective cohort patient screening flowchart.
Fig. 5
Global model explanation using SHAP method. (A) Bar plot of SHAP summary; (B) dot plot of SHAP summary. The likelihood of developing PPCs increases with the SHAP value of each feature. Each dot represents the SHAP value for a specific patient, with one dot per feature. The colour of the dots indicates the feature value for each patient, with red representing higher values and blue representing lower values. The vertical stacking of dots shows the density of values. (C) SHAP dependence plot. Each plot illustrates how an individual feature affects the model's output, with each dot representing one patient. The SHAP values are shown on the y-axis, and the actual feature values are on the x-axis. When the SHAP value for a feature exceeds zero, the decision is pushed towards the ‘PPCs’ class. BMI: body mass index; CCI: Charlson comorbidity index; DLCO (mmol/min/kPa): measured diffusing capacity of the lung for carbon monoxide; TS: tumour stage; SD (min): surgery duration; IFIV (mL): intraoperative fluid infusion volume; YS (year): years of smoking; RDW-CV (%): red cell distribution width-coefficient of variation.
Fig. 6
Local model explanation by SHAP method. Risk contributed by each feature for an individual patient at high (A) or low (B) risk of developing PPCs: (A) shows a patient pushed towards the ‘PPCs’ class and (B) shows a patient pushed towards the ‘non-PPCs’ class. BMI: body mass index; CCI: Charlson comorbidity index; DLCO (mmol/min/kPa): measured diffusing capacity of the lung for carbon monoxide; TS: tumour stage; SD (min): surgery duration; IFIV (mL): intraoperative fluid infusion volume; YS (year): years of smoking; RDW-CV (%): red cell distribution width-coefficient of variation.
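
The SHAP values described in the captions for Fig. 5 and Fig. 6 are Shapley values: each feature's average marginal contribution to one patient's prediction, relative to a reference prediction. The Python sketch below computes them exactly for a toy model. Everything in it is an assumption made for illustration: the two-feature risk function, its coefficients, and the patient and baseline values are invented, and "absent" features are replaced by a single baseline value, which is a simplification of how SHAP implementations typically average over a background dataset. It is not the study's model.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Sequence


def shapley_values(predict: Callable[[Sequence[float]], float],
                   x: Sequence[float],
                   baseline: Sequence[float]) -> list[float]:
    """Exact Shapley values for one sample by enumerating all feature subsets."""
    n = len(x)

    def value(subset: frozenset) -> float:
        # Features in the subset take the patient's value; the rest take the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                coalition = frozenset(s)
                total += weight * (value(coalition | {i}) - value(coalition))
        phi.append(total)
    return phi


def toy_risk(z: Sequence[float]) -> float:
    """Hypothetical risk score from age (years) and surgery duration (min)."""
    age, duration = z
    return 0.01 * age + 0.002 * duration + 0.0001 * age * duration


patient = [70.0, 200.0]   # invented patient: age 70, surgery of 200 min
baseline = [60.0, 150.0]  # invented reference values

phi = shapley_values(toy_risk, patient, baseline)
print(phi)

# Efficiency property: contributions sum to the gap between the patient's
# prediction and the baseline prediction.
gap = toy_risk(patient) - toy_risk(baseline)
print(abs(sum(phi) - gap) < 1e-9)
```

In this toy case both contributions are positive, which corresponds to the captions' reading that a SHAP value above zero pushes the decision towards the ‘PPCs’ class. Exact enumeration grows exponentially with the number of features, so it is practical only for small feature sets.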
