Cancer Manag Res. 2025 Aug 5;17:1563-1575.
doi: 10.2147/CMAR.S517949. eCollection 2025.

Development and Cross-Institutional Validation of a Comprehensive Machine Learning Model Predicting Response to Neoadjuvant Therapy for Rectal Cancer

Sha Li et al. Cancer Manag Res. 2025.

Abstract

Objective: Accurately identifying patients achieving pathological complete response (pCR) after neoadjuvant chemoradiotherapy (nCRT) for locally advanced rectal cancer (LARC) not only ensures treatment efficacy but also helps avoid surgical risks. We developed a comprehensive multi-omics model to predict pCR before surgery.

Methods: Clinical data, CT, MRI-T1WI and MRI-T2WI, and radiotherapy dose were collected from 183 LARC patients who underwent preoperative nCRT. Backward stepwise selection, logistic regression, and five-fold cross-validation were employed for the development and validation of a non-imaging model, three radiomics-based models and a dosiomics-based model. These were integrated into a final model, and its performance was tested on multi-center sets.
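As a rough illustration of the validation scheme described in the Methods, the sketch below runs stratified five-fold cross-validation of a logistic regression and reports the mean AUC with its standard deviation. It is not the authors' pipeline: the data are synthetic, backward stepwise selection is left out, and the helper names (`make_synthetic`, `fit_logistic`, `stratified_folds`, `cross_validated_auc`), the learning rate and the epoch count are all assumptions chosen for readability.

```python
import math
import random
import statistics

def auc(labels, scores):
    """Rank-based AUC: chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def predict(w, b, x):
    """Logistic probability for one sample; z is clamped to avoid overflow."""
    z = sum(wj * v for wj, v in zip(w, x)) + b
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=300):
    """Batch gradient descent on the logistic loss; returns (weights, bias)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            gw = [g + err * v for g, v in zip(gw, xi)]
            gb += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, gw)]
        b -= lr * gb / len(X)
    return w, b

def stratified_folds(y, k=5, seed=0):
    """Split sample indices into k folds, keeping the class ratio similar in each."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, yi in enumerate(y) if yi == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def cross_validated_auc(X, y, k=5, seed=0):
    """Train on k-1 folds, score the held-out fold, and return one AUC per fold."""
    fold_aucs = []
    for held_out in stratified_folds(y, k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        w, b = fit_logistic([X[i] for i in train], [y[i] for i in train])
        scores = [predict(w, b, X[i]) for i in held_out]
        fold_aucs.append(auc([y[i] for i in held_out], scores))
    return fold_aucs

def make_synthetic(n=150, seed=1):
    """Hypothetical data: feature 0 is informative, feature 1 is pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 3 == 0 else 0  # roughly one third positives (assumed)
        X.append([rng.gauss(1.5 * label, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y

X, y = make_synthetic()
fold_aucs = cross_validated_auc(X, y)
print(f"mean AUC {statistics.mean(fold_aucs):.2f} (SD {statistics.stdev(fold_aucs):.2f})")
```

Reporting the spread across folds alongside the mean is what allows a statement such as "mean AUC of 0.85 (SD 0.10)" to be read as a cross-validated estimate rather than a single-split result.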

Results: C_model, based on clinical characteristics, achieved an AUC of 0.85 in the validation set. The radiomics models (CT_model, T1_model, T2_model) achieved AUCs of 0.66, 0.67, and 0.64, respectively. The dosiomics-based model, D_model, achieved an AUC of 0.75 in the validation set. The mean AUCs for F_model in the training sets, validation sets, internal test set and external test set were 0.90, 0.88, 0.77, and 0.74, respectively.

Conclusion: To assess the efficacy of nCRT in LARC patients, it is crucial to consider clinical characteristics, followed by dosiomics. While T1_model, T2_model and CT_model demonstrate relatively comparable performance, each contributes unique value to the final prediction model.

Keywords: dosiomics; nCRT; predict therapy response; radiomics; rectal cancer.

Conflict of interest statement

The authors report no conflicts of interest in this work.

Figures

Figure 1
Study workflow and model building. A clinical features-based model (C_model), three radiomics-based models (CT_model, T1_model and T2_model) and a dosiomics-based model (D_model) were integrated into the final model and its performance was verified.
Figure 2
Feature contributions of F_model.
Figure 3
ROC curves for C_model, CT_model, T1_model, T2_model, D_model and F_model. (a) ROC curve for the C_model in the validation set, showing a mean AUC of 0.85 (SD 0.10). (b) ROC curve for the CT_model in the validation set, showing a mean AUC of 0.66 (SD 0.10). (c) ROC curve for the T1_model in the validation set, showing a mean AUC of 0.67 (SD 0.09). (d) ROC curve for the T2_model in the validation set, showing a mean AUC of 0.64 (SD 0.08). (e) ROC curve for the D_model in the validation set, showing a mean AUC of 0.75 (SD 0.11). (f) ROC curve for the F_model in the validation set, showing a mean AUC of 0.88 (SD 0.09). Receiver-operating characteristic, ROC; clinical characteristics, C_model; CT-based radiomics, CT_model; T1WI-based radiomics, T1_model; T2WI-based radiomics, T2_model; Dose-based dosiomics, D_model; clinical characteristics combined with radiomics and dosiomics, F_model.
Figure 4
The confusion matrices of the C_model (a), CT_model (b), T1_model (c), T2_model (d), D_model (e) and F_model (f).
Figure 5
ROC curves of the internal and external test sets for C_model, CT_model, T1_model, T2_model, D_model and F_model. The solid blue line represents the internal test results and the orange line represents the external test results. (a) Testing with C_model; (b) Testing with CT_model; (c) Testing with T1_model; (d) Testing with T2_model; (e) Testing with D_model; (f) Testing with F_model.
Figure 6
The confusion matrices of the internal test set and external test set. (a-f) represent the results of tests using C_model, CT_model, T1_model, T2_model, D_model and F_model, respectively. (−1) and (−2) indicate the internal and external test sets, respectively.
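
Confusion matrices such as those in Figures 4 and 6 are usually condensed into sensitivity, specificity and accuracy. The short sketch below shows that arithmetic; the function name `summarize_confusion` and the counts passed to it are made up for illustration and are not taken from the paper.

```python
def summarize_confusion(tp, fp, fn, tn):
    """Derive common rates from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),  # share of true pCR cases predicted as pCR
        "specificity": tn / (tn + fp),  # share of non-pCR cases predicted as non-pCR
        "accuracy": (tp + tn) / total,
    }

# Hypothetical counts, chosen only to exercise the function.
metrics = summarize_confusion(tp=8, fp=5, fn=2, tn=25)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```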
