Front Oncol. 2023 Mar 15:13:1106029.
doi: 10.3389/fonc.2023.1106029. eCollection 2023.

Development and validation of machine learning models for predicting prognosis and guiding individualized postoperative chemotherapy: A real-world study of distal cholangiocarcinoma

Di Wang et al. Front Oncol. 2023.

Abstract

Background: Distal cholangiocarcinoma (dCCA), which originates in the common bile duct, is associated with a dismal prognosis. Several studies based on cancer classification have aimed to optimize therapy and to predict and improve prognosis. In this study, we explored and compared several novel machine learning models that might improve prediction accuracy and treatment selection for patients with dCCA.

Methods: A total of 169 patients with dCCA were recruited and randomly divided into a training cohort (n = 118) and a validation cohort (n = 51). Their medical records were reviewed, including survival outcomes, laboratory values, treatment strategies, pathological results, and demographic information. Variables identified as independently associated with the primary outcome by least absolute shrinkage and selection operator (LASSO) regression, the random survival forest (RSF) algorithm, and univariate and multivariate Cox regression analyses were used to build five machine learning models and one canonical regression model: support vector machine (SVM), SurvivalTree, Coxboost, RSF, DeepSurv, and Cox proportional hazards (CoxPH). We measured and compared model performance using the receiver operating characteristic (ROC) curve, integrated Brier score (IBS), and concordance index (C-index) following cross-validation. The best-performing machine learning model was selected and compared with the TNM Classification using ROC, IBS, and C-index. Finally, patients were stratified by the best-performing model, and the log-rank test was used to assess whether each risk group benefited from postoperative chemotherapy.
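As an illustration of the C-index used above to compare models, the following is a minimal sketch of Harrell's concordance index in plain Python. It is not the authors' code: the function name `concordance_index`, the toy data, and the simplification of skipping pairs with tied survival times are all assumptions made for this example.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    times       -- observed follow-up time per patient
    events      -- 1 if the event (death) was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    Pairs with tied times are skipped (a simplification for this sketch).
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # The pair is comparable only if the shorter time is an observed event.
        if not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
print(concordance_index(toy_times, toy_events, [0.9, 0.7, 0.4, 0.2]))
print(concordance_index(toy_times, toy_events, [0.9, 0.2, 0.4, 0.7]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the training and validation C-index values in the Results are reported.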

Results: Five variables were used to develop the machine learning models: tumor differentiation, T-stage, lymph node metastasis (LNM), albumin-to-fibrinogen ratio (AFR), and carbohydrate antigen 19-9 (CA19-9). In the training cohort vs. the validation cohort, the C-index was 0.763 vs. 0.686 (SVM), 0.749 vs. 0.692 (SurvivalTree), 0.747 vs. 0.690 (Coxboost), 0.745 vs. 0.690 (RSF), 0.746 vs. 0.711 (DeepSurv), and 0.724 vs. 0.701 (CoxPH). The DeepSurv model (0.823 vs. 0.754) had a higher mean area under the ROC curve (AUC) than the other models, including SVM (0.819 vs. 0.736), SurvivalTree (0.814 vs. 0.737), Coxboost (0.816 vs. 0.734), RSF (0.813 vs. 0.730), and CoxPH (0.788 vs. 0.753). The IBS of the DeepSurv model (0.132 vs. 0.147) was lower than that of SurvivalTree (0.135 vs. 0.236), Coxboost (0.141 vs. 0.207), RSF (0.140 vs. 0.225), and CoxPH (0.145 vs. 0.196). The calibration plots and decision curve analysis (DCA) also indicated that DeepSurv had satisfactory predictive performance. In addition, the DeepSurv model outperformed the TNM Classification in C-index, mean AUC, and IBS (0.746 vs. 0.598, 0.823 vs. 0.613, and 0.132 vs. 0.186, respectively) in the training cohort. Patients were stratified into high- and low-risk groups based on the DeepSurv model. In the training cohort, postoperative chemotherapy was not associated with a significant survival benefit in the high-risk group (p = 0.519). In the low-risk group, patients receiving postoperative chemotherapy might have a better prognosis (p = 0.035).
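The within-group comparisons above (chemotherapy vs. no chemotherapy) rely on the log-rank test. The sketch below shows how a two-group log-rank statistic and p-value can be computed in plain Python. It is an illustration only, not the authors' analysis: the function name `logrank_test` and the toy survival times are hypothetical, and in practice an established survival analysis package would normally be used.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, p-value).

    times_*  -- follow-up times for each group
    events_* -- 1 if the event was observed, 0 if censored
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0  # observed minus expected events in group A
    variance = 0.0       # hypergeometric variance summed over event times
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)               # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e)         # events at time t
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("variance is zero; the groups cannot be compared")
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical toy data: group A has much shorter survival than group B.
chi2, p = logrank_test([1, 2, 3, 4], [1, 1, 1, 1], [10, 11, 12, 13], [1, 1, 1, 1])
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

Applied separately to the high-risk and low-risk groups, a test of this form yields one p-value per group, which is how a result such as "p = 0.519 in one group and p = 0.035 in the other" arises.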

Conclusions: In this study, the DeepSurv model performed well in predicting prognosis and in risk stratification to guide treatment options. AFR level might be a potential prognostic factor for dCCA. Patients in the low-risk group defined by the DeepSurv model might benefit from postoperative chemotherapy.

Keywords: AFR; DeepSurv; distal cholangiocarcinoma; individualized treatment; machine learning; post-operative chemotherapy; risk stratification.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
The results of LASSO regression analysis and the RSF plot for models. (A) LASSO coefficient profiles of the expression of 22 variables. (B) The length of the horizontal axis where each variable is located represents the variable’s contribution to the outcome. LASSO, least absolute shrinkage and selection operator; RSF, random survival forest.
Figure 2
Time-dependent ROC analysis of the different models. Time-dependent ROC curves are shown for the SVM, SurvivalTree, Coxboost, RSF, DeepSurv, and CoxPH models; DeepSurv had a higher mean AUC than the other models in the training cohort. ROC, receiver operating characteristic; AUC, area under the ROC curve.
Figure 3
Calibration plots and ROC curves for the DeepSurv model. Calibration plots at (A) 1 year, (B) 2 years, and (C) 3 years in the training cohort. ROC curves at 1, 2, and 3 years for the DeepSurv model in the training cohort (D) and the validation cohort (E). ROC, receiver operating characteristic.
Figure 4
DCA of the DeepSurv model and the individual postoperative prognostic prediction. The 1-year (A) and 2-year (B) DCA of the DeepSurv model. (C) The estimated prognosis of patients in the training cohort. The blue line represents patient 2, the yellow line represents patient 35, and the red line represents patient 46. DCA, decision curve analysis.
Figure 5
Kaplan–Meier survival analysis in different risk groups. Among high-risk patients, prognosis did not differ significantly between those who received postoperative chemotherapy and those who did not in the training cohort (A) and the validation cohort (C). Among low-risk patients, those who received chemotherapy had a better prognosis than those who did not in the training cohort (B) and the validation cohort (D).
