Front Oncol. 2024 Jan 24:13:1231508.
doi: 10.3389/fonc.2023.1231508. eCollection 2023.

Construction and validation of a progression prediction model for locally advanced rectal cancer patients received neoadjuvant chemoradiotherapy followed by total mesorectal excision based on machine learning

Jitao Hu et al. Front Oncol.

Abstract

Background: We aimed to develop a progression prediction model for patients with locally advanced rectal cancer (LARC) who received preoperative neoadjuvant chemoradiotherapy (NCRT) followed by surgery, so that high-risk patients can be identified in advance.

Methods: Data were collected from 272 LARC patients who received NCRT and total mesorectal excision (TME) from 2011 to 2018 at the Fourth Hospital of Hebei Medical University. Data from 161 patients with rectal cancer were included, each sample having one target variable (progression) and 145 characteristic variables. One-hot encoding was applied to represent some characteristics numerically. Missing values were imputed with the K-nearest neighbor (KNN) method, and SMOTETomek combined over- and under-sampling was used to address class imbalance. After processing, data from 135 patients with 45 clinical characteristic variables remained. Random forest, decision tree, support vector machine (SVM), and XGBoost models were used to predict whether patients with rectal cancer would exhibit progression. LASSO regression was used to further filter the variables, and a Venn diagram was used to narrow down the list. Finally, the prediction model was constructed by multivariate logistic regression, and its performance was confirmed in the validation set.
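The two preprocessing steps named in the Methods, one-hot encoding and KNN imputation, can be illustrated with a minimal sketch. This is not the authors' code: the function names `one_hot` and `knn_impute`, the choice of k, the distance measure, and the toy values are all assumptions made for illustration. In practice these steps are usually taken from a library (for example, scikit-learn provides `KNNImputer` and imbalanced-learn provides `SMOTETomek`); the resampling step is not shown here.

```python
import math


def one_hot(values):
    """Encode categorical values as 0/1 indicator vectors, one column per category."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows.

    Assumption for this sketch: distance is the root mean squared difference
    over the columns that are observed in both rows.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # no common observed columns, so treat as far away
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows in which column j is observed.
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled
```

For instance, `one_hot(["T3", "T4", "T3"])` (hypothetical stage labels) produces one indicator column per label, and `knn_impute` leaves fully observed rows unchanged while filling each gap from its closest neighbors.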

Results: Data from 135 patients with 45 clinical characteristic variables were included in the study. Data were randomly divided in an 8:2 ratio into a training set and a validation set. In the training set, the area under the curve (AUC) was 0.72 for the decision tree, 0.97 for the random forest, 0.89 for SVM, and 0.94 for XGBoost. Similar results were obtained in the validation set. LASSO regression yielded twenty-three variables, and eight variables were obtained by taking the intersection of the variables selected by the four machine learning methods. A multivariate logistic regression model was then constructed using the training set; the ROC curve indicated good performance, and the ROC curve in the validation set also confirmed good predictive performance.
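Three operations mentioned in the Results can be sketched in a few lines: an 8:2 random split, an AUC computed from predicted scores, and the intersection of variable lists that a Venn diagram visualizes. The sketch below is illustrative only and is not the authors' implementation; the function names, the fixed random seed, and any variable names passed in are hypothetical.

```python
import random


def split_80_20(items, seed=0):
    """Shuffle the items and split them 8:2 (the seed is an arbitrary choice)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher than
    a random negative case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shared_variables(*selections):
    """Variables selected by every method, i.e. the center of the Venn diagram."""
    return set.intersection(*(set(s) for s in selections))
```

With 135 patients, `split_80_20` yields 108 and 27 cases. The pairwise form of the AUC is quadratic in the number of cases, which is acceptable at this sample size; larger data sets would normally use a rank-based or library implementation.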

Conclusions: We constructed a logistic regression model with good predictive performance, which allowed us to accurately predict whether patients who received NCRT and TME will exhibit disease progression.

Keywords: artificial intelligence; deep learning; local advanced rectal cancer; neoadjuvant chemoradiotherapy; total mesorectal excision.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Experimental flow chart. (A) Data processing, which yielded 135 patients. (B) Machine learning model construction and validation. (C) Construction and validation of predictive models.
Figure 2
Machine learning model construction (ten-fold cross-validation) in the training set. (A) ROC diagram of the decision tree in the training set. (B) ROC diagram of the random forest in the training set. (C) ROC diagram of the support vector machine in the training set. (D) ROC diagram of XGBoost in the training set.
Figure 3
Machine learning model validation in the validation set. (A) ROC diagram of the decision tree in the validation set. (B) ROC diagram of the random forest in the validation set. (C) ROC diagram of the support vector machine in the validation set. (D) ROC diagram of XGBoost in the validation set.
Figure 4
Predictor construction and validation. (A) Clinical characteristics of patients with rectal cancer in the LASSO model. (B) Selection of the tuning parameter (λ) in the LASSO model by cross-validation using the maximum criteria. (C) Venn diagram of the variables filtered by the four machine learning methods. (D) Confusion matrices of binary outcomes from the logistic regression predicting progression in patients with rectal cancer, for the training set (upper) and the test set (lower). (E) ROC curve in the training set for predicting disease progression in patients with rectal cancer who received preoperative neoadjuvant therapy followed by surgical treatment. (F) ROC curve in the test set for predicting disease progression in the same patient population.
