World J Gastroenterol. 2025 Nov 14;31(42):112180.

doi: 10.3748/wjg.v31.i42.112180.

Predicting chemotherapy-induced myelosuppression in colorectal cancer: An interpretable, machine learning-based nomogram

Yu-Ming Liu¹, Yan-Yuan Du¹, Ying Song¹, Hong-Tai Xiong¹, Hui-Bo Yu², Bai-Hui Li², Liu Cai¹, Su-Su Ma¹, Jin Gao², Han-Yue Zhang¹, Rui-Ying Fang¹, Rui Cai³, Hong-Gang Zheng⁴

Affiliations

¹ Department of Oncology, Guang'anmen Hospital, China Academy of Chinese Medical Sciences, Beijing 100053, China.
² Beijing University of Chinese Medicine, Beijing 100029, China.
³ China-Japan Friendship Hospital, Beijing 100029, China.
⁴ Department of Oncology, Guang'anmen Hospital, China Academy of Chinese Medical Sciences, Beijing 100053, China. honggangzheng@126.com.

PMID: 41278162
PMCID: PMC12635783
DOI: 10.3748/wjg.v31.i42.112180

Yu-Ming Liu et al. World J Gastroenterol. 2025.

Abstract

Background: Colorectal cancer is a common digestive malignancy, and chemotherapy remains a cornerstone of treatment. Myelosuppression, a frequent hematologic toxicity, poses significant clinical challenges. However, no interpretable machine learning-based nomogram exists to predict chemotherapy-induced myelosuppression in colorectal cancer patients. This study aimed to develop and validate an interpretable clinic-machine learning nomogram integrating clinical predictors with multiple algorithms via a feature mapping algorithm. The model provides accurate risk estimation and clinical interpretability, supporting individualized prevention strategies and optimizing decision-making in patients receiving first-line chemotherapy.

Aim: To develop and validate an interpretable clinic-machine learning nomogram predicting chemotherapy-induced myelosuppression in colorectal cancer.

Methods: This retrospective study enrolled 855 colorectal cancer patients receiving first-line chemotherapy. Data were split into training (n = 612), validation (n = 153), and testing (n = 90) cohorts. Ten predictors were identified through least absolute shrinkage and selection operator, decision tree, random forest, and expert consensus. Ten machine learning algorithms were applied, with performance assessed by area under the receiver operating characteristic curve (AUC), area under the precision-recall curve (AUPRC), calibration, and decision curves. The optimal model was integrated into a clinic-machine learning nomogram via the feature mapping algorithm, which was internally validated for predictive accuracy and clinical utility.
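The two discrimination metrics named in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the function names `roc_auc` and `average_precision` and the toy labels and scores are assumptions made for illustration only. AUC is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, and AUPRC is approximated by average precision.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Average precision: mean of the precision values at each positive case.

    Simplification: assumes all scores are distinct (no tie handling).
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive case")
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    true_positives = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            total += true_positives / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy data: 1 = myelosuppression, 0 = no myelosuppression.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

auc = roc_auc(y_true, y_score)
ap = average_precision(y_true, y_score)
print(f"AUC = {auc:.4f}, average precision = {ap:.4f}")
```

When positive cases are the minority, average precision is often more sensitive to false positives than AUC, which is one reason to report both.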

Results: A total of 855 colorectal cancer patients were enrolled, with 765 cases (April 2020 to December 2023) used for model training and validation, and 90 cases (January 2024 to July 2024) for internal testing. Baseline clinical features did not differ significantly between training and validation cohorts (P > 0.05). Ten predictors were identified through integrated feature selection and expert consensus, including age, body surface area, body mass index, tumor position, albumin, carcinoembryonic antigen, carbohydrate antigen (CA) 19-9, CA125, chemotherapy regimen, and chemotherapy cycles. Among ten machine learning algorithms, extreme gradient boosting achieved the best validation performance (AUC = 0.97, AUPRC = 0.92, sensitivity = 0.79, specificity = 0.92, accuracy = 0.88). Logistic regression confirmed extra trees and random forest as independent predictors, which were incorporated into a clinic-machine learning nomogram. The clinic-machine learning nomogram demonstrated superior discrimination (AUC = 0.96, AUPRC = 0.93, accuracy = 0.90, specificity = 0.95), good calibration, and greater net clinical benefit across a wide probability range (10%-90%). Internal testing further confirmed its robustness and generalizability (AUC = 0.95).
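The net clinical benefit reported from decision curve analysis can be illustrated with a short sketch. It uses the standard net benefit definition, TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The function names and the toy data are hypothetical and are not taken from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening on patients with predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: intervene on every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy cohort of 10 patients (3 events).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]

nb_model = net_benefit(y_true, y_prob, threshold=0.5)
nb_all = net_benefit_treat_all(y_true, threshold=0.5)
print(f"model: {nb_model:.2f}, treat all: {nb_all:.2f}, treat none: 0.00")
```

A decision curve repeats this calculation over a grid of thresholds; a model is considered useful at a given threshold when its net benefit exceeds both the treat-all and treat-none (zero) strategies.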

Conclusion: The clinic-machine learning nomogram accurately predicts chemotherapy-induced myelosuppression in colorectal cancer, providing interpretability and clinical utility to support individualized risk assessment and treatment decision-making.

Keywords: Chemotherapy-induced myelosuppression; Colorectal cancer; Machine learning; Nomogram; Risk factors.

Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

**Figure 1**
**Flowchart of the study protocol.** CRC: Colorectal cancer; T: Tumor; N: Node; M: Metastasis; HM: Hepatic metastasis; LM: Lung metastasis; PM: Peritoneal metastasis; BSA: Body surface area; BMI: Body mass index; ALB: Albumin; CEA: Carcinoembryonic antigen; CA: Carbohydrate antigen; LASSO: Least absolute shrinkage and selection operator; ML: Machine learning; LR: Logistic regression; DT: Decision trees; RF: Random forest; XGBoost: Extreme gradient boosting; SVM: Support vector machines; GBM: Gradient boosting machines; KNN: K-Nearest neighbors; ANN: Artificial neural network; ET: Extra trees; ROC: Receiver operating characteristic; AUC: Area under the curve; PR: Precision-recall; AUPRC: Area under the precision-recall curve; PPV: Positive predictive value; NPV: Negative predictive value.

**Figure 2**
**Sample size calculation flowchart.** rMPSE: Root mean squared prediction error; MPSE: Mean squared prediction error; EPV: Events per variable. In the formulas: Ø: Events fraction; δ: Margin of error, generally recommended to be < 0.05; P: Number of candidate predictors; S: Shrinkage factor; R²_cs: A conservative anticipated value of model performance, as defined by the Cox-Snell R² statistic; *MAPE*: Mean absolute prediction error; n: Sample size.

**Figure 3**
**Candidate predictor screening using least absolute shrinkage and selection operator.** A: Path diagram of least absolute shrinkage and selection operator (LASSO) regression coefficients for candidate predictors; B: Cross-validation curves for LASSO. MSE: Mean squared error.

**Figure 4**
**Mean importance of candidate predictors.** A: Random forest algorithm; B: Decision trees algorithm. BSA: Body surface area; BMI: Body mass index; T: Tumor; N: Node; M: Metastasis; HM: Hepatic metastasis; LM: Lung metastasis; PM: Peritoneal metastasis; ALB: Albumin; CEA: Carcinoembryonic antigen; CA: Carbohydrate antigen.

**Figure 5**
**10-fold cross-validation plot.**

**Figure 6**
**Curves for the 10 machine learning algorithms.** A and B: Receiver operating characteristic curves of training set (A) and validation set (B); C and D: Precision-recall curves of training set (C) and validation set (D). LR: Logistic regression; DT: Decision trees; RF: Random forest; XGBoost: Extreme gradient boosting; SVM: Support vector machines; GBM: Gradient boosting machines; KNN: K-Nearest neighbors; ANN: Artificial neural network; ET: Extra trees; AUC: Area under the curve; AP: Average precision.

**Figure 7**
**The nomogram for predicting myelosuppression induced by first-line chemotherapy in colorectal cancer.** A: Clinic-machine learning; B: Clinic. BSA: Body surface area; BMI: Body mass index; ALB: Albumin; CEA: Carcinoembryonic antigen; CA: Carbohydrate antigen.

**Figure 8**
**Receiver operating characteristic curves for extreme gradient boosting, clinic nomogram and clinic-machine learning nomogram.** A: Training set; B: Validation set. XGBoost: Extreme gradient boosting; AUC: Area under the curve; ML: Machine learning.

**Figure 9**
**Precision-recall curve for extreme gradient boosting, clinic nomogram and clinic-machine learning nomogram.** A: Training set; B: Validation set. XGBoost: Extreme gradient boosting; AP: Average precision; ML: Machine learning.

**Figure 10**
**Calibration curves for extreme gradient boosting, clinic nomogram and clinic-machine learning nomogram.** A: Training set; B: Validation set. XGBoost: Extreme gradient boosting; ML: Machine learning.

**Figure 11**
**Decision curve analysis for extreme gradient boosting, clinic nomogram and clinic-machine learning nomogram.** A: Training set; B: Validation set. XGBoost: Extreme gradient boosting; ML: Machine learning.

**Figure 12**
**Receiver operating characteristic, precision-recall curve, calibration curves and decision curve analysis for the optimal prediction model clinic-machine learning nomogram (testing set).** A: Receiver operating characteristic curve; B: Precision-recall curve; C: Calibration curve; D: Decision curve analysis. AUC: Area under the curve; ML: Machine learning; AUPRC: Area under the precision-recall curve; CI: Confidence interval.
