Construction and validation of a machine learning algorithm-based predictive model for difficult colonoscopy insertion
- PMID: 40677572
- PMCID: PMC12264806
- DOI: 10.4253/wjge.v17.i7.108307
Abstract
Background: Difficulty of colonoscopy insertion (DCI) significantly affects colonoscopy effectiveness and serves as a key quality indicator. Predicting and evaluating DCI risk preoperatively is crucial for optimizing intraoperative strategies.
Aim: To evaluate the predictive performance of machine learning (ML) algorithms for DCI by comparing three modeling approaches, identify factors influencing DCI, and develop a preoperative prediction model using ML algorithms to enhance colonoscopy quality and efficiency.
Methods: This cross-sectional study enrolled 712 patients who underwent colonoscopy at a tertiary hospital between June 2020 and May 2021. Demographic data, past medical history, medication use, and psychological status were collected. The endoscopist assessed DCI using a visual analogue scale. After univariate screening, predictive models were developed using multivariable logistic regression, least absolute shrinkage and selection operator (LASSO) regression, and random forest (RF) algorithms. Model performance was evaluated in terms of discrimination, calibration, and decision curve analysis (DCA), and results were visualized using nomograms.
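The discrimination measures named above (sensitivity, specificity, and AUC) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the 0.5 classification threshold, and the toy labels and probabilities are all hypothetical, chosen only to show how the metrics are computed from predicted risks.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when risk >= threshold counts as positive."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outranks a random negative one."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    # Count pairwise wins; ties contribute one half.
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = difficult insertion); not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

sens, spec = sensitivity_specificity(y_true, y_prob)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc(y_true, y_prob):.4f}")
```

Computing the same metrics separately on the training and validation sets gives paired values of the kind reported in the Results.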
Results: A total of 712 patients (53.8% male; mean age 54.5 ± 12.9 years) were included. Logistic regression analysis identified constipation [odds ratio (OR) = 2.254, 95% confidence interval (CI): 1.289-3.931], abdominal circumference (AC) (77.5-91.9 cm: OR = 1.895, 95%CI: 1.065-3.350; ≥ 92 cm: OR = 1.271, 95%CI: 0.730-2.188), and anxiety (OR = 1.071, 95%CI: 1.044-1.100) as predictive factors for DCI, validated by the LASSO and RF methods. For the logistic regression, LASSO, and RF models, respectively, training/validation sensitivities were 0.826/0.925, 0.924/0.868, and 1.000/0.981; specificities were 0.602/0.511, 0.510/0.562, and 0.977/0.526; and areas under the receiver operating characteristic curve (AUCs) were 0.780 (0.737-0.823)/0.726 (0.654-0.799), 0.754 (0.710-0.798)/0.723 (0.656-0.791), and 1.000 (1.000-1.000)/0.754 (0.688-0.820). DCA indicated optimal net benefit within probability thresholds of 0-0.9 and 0.05-0.37. The RF model demonstrated superior diagnostic accuracy, reflected by perfect training sensitivity (1.000) and the highest validation AUC (0.754), outperforming the other methods in clinical applicability.
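For readers unfamiliar with DCA, the net benefit at a probability threshold can be sketched as below, using the standard definition (true positives per patient minus false positives per patient weighted by the odds of the threshold). This is an illustrative sketch rather than the study's implementation; the function names and the toy data are hypothetical.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], pt: float) -> float:
    """Net benefit of acting on patients whose predicted risk is >= pt (0 < pt < 1)."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 0)
    # False positives are penalized by the odds of the threshold probability.
    return tp / n - (fp / n) * (pt / (1 - pt))


def net_benefit_treat_all(y_true: Sequence[int], pt: float) -> float:
    """Reference strategy: treat every patient as high risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * (pt / (1 - pt))


# Hypothetical toy data (1 = difficult insertion); not taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

for pt in (0.2, 0.5):
    print(pt, net_benefit(y_true, y_prob, pt), net_benefit_treat_all(y_true, pt))
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy); the threshold ranges reported above describe where that holds.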
Conclusion: The RF-based model exhibited superior predictive accuracy for DCI compared to multivariable logistic and LASSO regression models. This approach supports individualized preoperative optimization, enhancing colonoscopy quality through targeted risk stratification.
Keywords: Colonoscopy; Difficulty of colonoscopy insertion; Least absolute shrinkage and selection operator regression; Logistic regression; Machine learning algorithms; Predictive model; Random forest.
©The Author(s) 2025. Published by Baishideng Publishing Group Inc. All rights reserved.
Conflict of interest statement
The authors have no conflicts of interest with respect to the research, authorship, and/or publication of this article.