Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2022 Dec;13(23):3353-3361.
doi: 10.1111/1759-7714.14694. Epub 2022 Oct 24.

Application of several machine learning algorithms for the prediction of afatinib treatment outcome in advanced-stage EGFR-mutated non-small-cell lung cancer

Affiliations

Application of several machine learning algorithms for the prediction of afatinib treatment outcome in advanced-stage EGFR-mutated non-small-cell lung cancer

Taeyun Kim et al. Thorac Cancer. 2022 Dec.

Abstract

Background: The present study aimed to evaluate the performance of several machine learning (ML) algorithms in predicting 1-year afatinib continuation and 2-year survival after afatinib initiation and to identify the differences in survival outcomes between ML-classified strata.

Methods: Data that were also used in the RESET study were retrospectively collected from 16 hospitals in South Korea. A stratified random sampling method was applied to split the data into training and test sets (70:30 split ratio). Clinical information, such as age, sex, tumor stage, smoking, performance status, metastasis, type of metastasis, dose adjustment, and pathologic information on EGFR mutations were inputted. Training was performed using eight ML algorithms: logistic regression, decision tree, deep neural network, random forest, support vector machine, boosting, bagging, and the naïve Bayes classifier. The model performance was assessed based on sensitivity, specificity, and accuracy. Area under the receiver operator characteristic curve (AUC) was calculated and compared between the ML models using DeLong's test. A Kaplan-Meier (KM) curve was used to visualize the identified strata obtained from the ML models.

Results: No significant differences in the input variables were observed between the training and test datasets. The best-performing models were support vector machine in predicting 1-year afatinib continuation (AUC 0.626) and decision tree in 2-year survival after afatinib start (AUC 0.644), although the performances of the ML models were comparable and did not display any predictive roles. KM analysis and log-rank test revealed significant differences between the strata identified from the ML model (p < 0.001) in terms of both time-on-treatment (TOT) and overall survival (OS).

Conclusion: The performances of ML models in our study found no discernible roles in predicting afatinib-related outcomes, although the identified strata revealed different TOT and OS in the KM analysis. This implies the strength of ML in predicting the survival outcome, as well as the limitation of electronic medical record-based variables in ML algorithms. Careful consideration of variable inclusion is likely to improve the general model performance.

Keywords: NSCLC; machine learning; outcome; survival.

PubMed Disclaimer

Figures

FIGURE 1
FIGURE 1
ROC curves for the prediction of 1‐year afatinib continuation (a) and 2‐year survival after afatinib initiation (b)
FIGURE 2
FIGURE 2
The Kaplan–Meier curve for the time‐on‐treatment (TOT) according to the strata identified using several machine learning algorithms. The dotted lines indicate the time at which the probability drops to 0.5
FIGURE 3
FIGURE 3
The Kaplan–Meier curve for the overall survival (OS) according to the strata identified using several machine learning algorithms. The dotted lines indicate the time at which the probability drops to 0.5

Similar articles

Cited by

References

    1. Jung KW, Won YJ, Hong S, Kong HJ, Lee ES. Prediction of cancer incidence and mortality in Korea, 2020. Cancer Res Treat. 2020;52(2):351–8. - PMC - PubMed
    1. Huang J, Deng Y, Tin MS, Lok V, Ngai CH, Zhang L, et al. Distribution, risk factors, and temporal trends for lung cancer incidence and mortality: a global analysis. Chest. 2022;161(4):1101–11. - PubMed
    1. Mathias C, Prado GF, Mascarenhas E, Ugalde PA, Zimmer Gelatti AC, Carvalho ES, et al. Lung cancer in Brazil. J Thorac Oncol. 2020;15(2):170–5. - PubMed
    1. Lee JG, Kim HC, Choi CM. Recent trends of lung cancer in Korea. Tuberc Respir Dis. 2021;84(2):89–95. - PMC - PubMed
    1. Gould MK, Huang BZ, Tammemagi MC, Kinar Y, Shiff R. Machine learning for early lung cancer identification using routine clinical and laboratory data. Am J Respir Crit Care Med. 2021;204(4):445–53. - PubMed