Thorac Cancer. 2022 Dec;13(23):3353-3361.

doi: 10.1111/1759-7714.14694. Epub 2022 Oct 24.

Application of several machine learning algorithms for the prediction of afatinib treatment outcome in advanced-stage EGFR-mutated non-small-cell lung cancer

Taeyun Kim¹, Sang Jin Lee², Tae-Won Jang³

Affiliations

¹ Division of Pulmonology, Department of Internal Medicine, The Armed Forces Goyang Hospital, Goyang, Republic of Korea.
² Department of Statistics, Pusan National University, Busan, Republic of Korea.
³ Division of Pulmonology, Department of Internal Medicine, Kosin University College of Medicine, Kosin University Gospel Hospital, Busan, Republic of Korea.

PMID: 36278315
PMCID: PMC9715822
DOI: 10.1111/1759-7714.14694

Taeyun Kim et al. Thorac Cancer. 2022 Dec.


Abstract

Background: The present study aimed to evaluate the performance of several machine learning (ML) algorithms in predicting 1-year afatinib continuation and 2-year survival after afatinib initiation and to identify the differences in survival outcomes between ML-classified strata.

Methods: Data that were also used in the RESET study were retrospectively collected from 16 hospitals in South Korea. A stratified random sampling method was applied to split the data into training and test sets (70:30 split ratio). Clinical information, such as age, sex, tumor stage, smoking, performance status, metastasis, type of metastasis, and dose adjustment, together with pathologic information on EGFR mutations, was used as input. Training was performed using eight ML algorithms: logistic regression, decision tree, deep neural network, random forest, support vector machine, boosting, bagging, and the naïve Bayes classifier. Model performance was assessed based on sensitivity, specificity, and accuracy. The area under the receiver operating characteristic curve (AUC) was calculated and compared between the ML models using DeLong's test. Kaplan-Meier (KM) curves were used to visualize the strata identified by the ML models.
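As an illustration of the evaluation steps named in the Methods, the following is a minimal sketch of a stratified 70:30 split and of the sensitivity, specificity, accuracy, and AUC calculations. It is not the authors' code: the function names, the toy labels and scores, the 0.5 decision threshold, and the random seed are all assumptions made for the example, and the AUC is computed with the rank-based (Mann-Whitney) formulation rather than any specific software the study may have used. DeLong's test is not shown.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx), sampling test_fraction within each class."""
    rng = random.Random(seed)  # seed is an arbitrary choice for reproducibility
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for binary labels (1 = event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 10 positive and 10 negative outcomes.
labels = [1] * 10 + [0] * 10
train_idx, test_idx = stratified_split(labels)

# Hypothetical model scores on a six-patient test set.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]
y_pred = [int(s >= 0.5) for s in scores]  # 0.5 threshold is an assumption
metrics = confusion_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
```

Splitting within each outcome class keeps the event rate the same in the training and test sets, which is what "stratified" refers to here. An AUC near 0.5 corresponds to chance-level ranking, which helps put the reported values of 0.626 and 0.644 in context.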

Results: No significant differences in the input variables were observed between the training and test datasets. The best-performing models were the support vector machine for predicting 1-year afatinib continuation (AUC 0.626) and the decision tree for predicting 2-year survival after afatinib initiation (AUC 0.644), although the performances of the ML models were comparable and none displayed a meaningful predictive role. KM analysis and the log-rank test revealed significant differences between the strata identified from the ML model (p < 0.001) in terms of both time-on-treatment (TOT) and overall survival (OS).

Conclusion: The ML models in our study showed no discernible role in predicting afatinib-related outcomes, although the identified strata differed in TOT and OS in the KM analysis. This suggests both the strength of ML in predicting survival outcomes and the limitation of electronic medical record-based variables in ML algorithms. Careful consideration of variable inclusion is likely to improve the general model performance.

Keywords: NSCLC; machine learning; outcome; survival.


Figures

**FIGURE 1**
ROC curves for the prediction of 1‐year afatinib continuation (a) and 2‐year survival after afatinib initiation (b)

**FIGURE 2**
The Kaplan–Meier curve for the time‐on‐treatment (TOT) according to the strata identified using several machine learning algorithms. The dotted lines indicate the time at which the probability drops to 0.5

**FIGURE 3**
The Kaplan–Meier curve for the overall survival (OS) according to the strata identified using several machine learning algorithms. The dotted lines indicate the time at which the probability drops to 0.5
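The dotted lines described in the captions for Figures 2 and 3 mark the median time, that is, the first time at which the Kaplan-Meier estimate falls to 0.5 or below. A minimal sketch of that calculation follows. The follow-up times and event indicators are hypothetical, the function names are assumptions for illustration, and this is not the software used in the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times: follow-up durations; events: 1 if the event occurred, 0 if censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_time(curve):
    """First time at which the survival estimate is at or below 0.5, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up in months for one stratum (0 = censored).
times = [2, 4, 4, 6, 8, 10, 12, 14]
events = [1, 1, 0, 1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
median = median_time(curve)
```

Running the same functions on each ML-classified stratum gives one curve and one median per stratum. Comparing the curves formally would require a log-rank test, which is not shown here.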


Cited by

Predicting early stage lung cancer recurrence and survival from combined tumor motion amplitude and radiomics on free-breathing 4D-CT.
Ouraou E, Tonneau M, Le WT, Filion E, Campeau MP, Vu T, Doucet R, Bahig H, Kadoury S. Med Phys. 2025 Mar;52(3):1926-1940. doi: 10.1002/mp.17586. Epub 2024 Dec 20. PMID: 39704505. Free PMC article.

Multimodal prediction of tyrosine kinase inhibitors therapy outcomes in advanced EGFR-mutated NSCLC patients.
Chai X, Li H, Yang M, Zeng J, Chen G, Li Y, Wang W, Liu Z, Li K, Zhang T, Wang S, Che N. J Transl Med. 2025 Aug 18;23(1):933. doi: 10.1186/s12967-025-06956-8. PMID: 40826408. Free PMC article.

