Application of several machine learning algorithms for the prediction of afatinib treatment outcome in advanced-stage EGFR-mutated non-small-cell lung cancer
- PMID: 36278315
- PMCID: PMC9715822
- DOI: 10.1111/1759-7714.14694
Abstract
Background: The present study aimed to evaluate the performance of several machine learning (ML) algorithms in predicting 1-year afatinib continuation and 2-year survival after afatinib initiation and to identify the differences in survival outcomes between ML-classified strata.
Methods: Data that were also used in the RESET study were retrospectively collected from 16 hospitals in South Korea. A stratified random sampling method was applied to split the data into training and test sets (70:30 split ratio). Clinical information, such as age, sex, tumor stage, smoking, performance status, metastasis, type of metastasis, and dose adjustment, and pathologic information on EGFR mutations were used as input variables. Training was performed using eight ML algorithms: logistic regression, decision tree, deep neural network, random forest, support vector machine, boosting, bagging, and the naïve Bayes classifier. Model performance was assessed based on sensitivity, specificity, and accuracy. The area under the receiver operating characteristic curve (AUC) was calculated and compared between the ML models using DeLong's test. Kaplan-Meier (KM) curves were used to visualize the strata identified by the ML models.
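The split and the evaluation metrics named in the Methods can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names (`stratified_split`, `confusion_metrics`, `auc`) and the toy labels are hypothetical, the AUC is computed through its rank (Mann-Whitney) interpretation, and DeLong's test for comparing AUCs is not implemented here.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split indices so each class contributes test_fraction of its cases to the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for y in sorted(by_class):
        idx = by_class[y][:]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy outcome labels (hypothetical): 40 positives and 60 negatives.
    labels = [1] * 40 + [0] * 60
    train_idx, test_idx = stratified_split(labels, test_fraction=0.3)
    print(len(train_idx), len(test_idx))
    print(sum(labels[i] for i in test_idx) / len(test_idx))  # class balance preserved
```

An AUC of 0.5 corresponds to chance-level ranking, which is the baseline against which any reported AUC should be read.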
Results: No significant differences in the input variables were observed between the training and test datasets. The best-performing models were the support vector machine for predicting 1-year afatinib continuation (AUC 0.626) and the decision tree for predicting 2-year survival after afatinib initiation (AUC 0.644), although the performances of the ML models were comparable and none showed a meaningful predictive role. KM analysis and the log-rank test revealed significant differences between the strata identified from the ML model (p < 0.001) in terms of both time-on-treatment (TOT) and overall survival (OS).
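The KM analysis and log-rank comparison between two model-identified strata can be sketched as follows. This is an illustrative standard-library implementation under stated assumptions, not the authors' analysis: the function names and the toy follow-up times are hypothetical, events are coded 1 (event observed) or 0 (censored), and the p-value uses the chi-square distribution with one degree of freedom.

```python
import math


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time; events: 1 = event, 0 = censored."""
    survival, curve = 1.0, []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value) with 1 df."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})
    obs_minus_exp, variance = 0.0, 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e == 1)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        # Observed minus expected events in group A under the null hypothesis.
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("log-rank statistic is undefined: zero variance")
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Toy follow-up times in months (hypothetical), all with observed events.
    low_risk = [6, 7, 8, 9, 10]
    high_risk = [1, 2, 3, 4, 5]
    print(kaplan_meier(high_risk, [1] * 5))
    print(logrank_test(high_risk, [1] * 5, low_risk, [1] * 5))
```

A significant log-rank result shows only that the survival curves of the strata differ; it does not by itself measure how well a classifier discriminates individual outcomes, which is what the AUC reports.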
Conclusion: The ML models in our study showed no discernible role in predicting afatinib-related outcomes, although the identified strata showed different TOT and OS in the KM analysis. This suggests both the potential of ML in predicting survival outcomes and the limitations of electronic medical record-based variables in ML algorithms. Careful consideration of variable inclusion is likely to improve general model performance.
Keywords: NSCLC; machine learning; outcome; survival.
© 2022 The Authors. Thoracic Cancer published by China Lung Oncology Group and John Wiley & Sons Australia, Ltd.