JMIR AI. 2023 Sep 8;2:e44909.

doi: 10.2196/44909.

Machine Learning for the Prediction of Procedural Case Durations Developed Using a Large Multicenter Database: Algorithm Development and Validation Study

Samir Kendale^#¹, Andrew Bishara^#²,³, Michael Burns^#⁴, Stuart Solomon⁵, Matthew Corriere^#⁶, Michael Mathis^#⁴,⁷

Affiliations

¹ Department of Anesthesia, Critical Care & Pain Medicine, Beth Israel Deaconess Medical Center, Boston, MA, United States.
² Department of Anesthesia and Perioperative Care, University of California, San Francisco, San Francisco, CA, United States.
³ Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, United States.
⁴ Department of Anesthesiology, University of Michigan Medical School, Ann Arbor, MI, United States.
⁵ Department of Anesthesiology, The University of Texas Health Science Center at San Antonio, San Antonio, TX, United States.
⁶ Department of Surgery, Section of Vascular Surgery, University of Michigan Medical School, Ann Arbor, MI, United States.
⁷ Center for Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, United States.

^# Contributed equally.

PMID: 38875567
PMCID: PMC11041482
DOI: 10.2196/44909

Samir Kendale et al. JMIR AI. 2023.

Abstract

Background: Accurate projections of procedural case durations are complex but critical to the planning of perioperative staffing, operating room resources, and patient communication. Nonlinear prediction models using machine learning methods may provide opportunities for hospitals to improve upon current estimates of procedure duration.

Objective: Because operating room functioning requires substantial resources that depend on case duration, this study aimed to determine whether a machine learning algorithm scalable across multiple centers could estimate case duration within a tolerance limit.

Methods: Deep learning, gradient boosting, and ensemble machine learning models were generated using perioperative data available at 3 distinct time points: the time of scheduling, the time of patient arrival to the operating or procedure room (primary model), and the time of surgical incision or procedure start. The primary outcome was procedure duration, defined by the time between the arrival and the departure of the patient from the procedure room. Model performance was assessed by mean absolute error (MAE), the proportion of predictions falling within 20% of the actual duration, and other standard metrics. Performance was compared with a baseline method of historical means within a linear regression model. Model features driving predictions were assessed using Shapley additive explanations values and permutation feature importance.
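
The two headline metrics named in the Methods can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the sample durations are hypothetical, and the 20% tolerance is taken here as a fraction of the actual duration, as the text describes.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average absolute difference between predicted and actual durations (minutes)."""
    return mean(abs(p - a) for a, p in zip(actual, predicted))


def proportion_within_tolerance(actual, predicted, tolerance=0.20):
    """Fraction of predictions whose error is at most `tolerance` times the actual duration."""
    hits = sum(abs(p - a) <= tolerance * a for a, p in zip(actual, predicted))
    return hits / len(actual)


# Hypothetical durations in minutes (illustrative values, not study data).
actual = [100, 50, 200, 80]
predicted = [110, 70, 190, 80]

print(mean_absolute_error(actual, predicted))          # MAE in minutes
print(proportion_within_tolerance(actual, predicted))  # share within 20% of actual
```

Because the tolerance scales with the actual duration, a 20-minute miss counts as a failure for a 50-minute case but as a success for a 200-minute case, which is why the two metrics can rank models differently.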

Results: A total of 1,177,893 procedures from 13 academic and private hospitals between 2016 and 2019 were used. Across all procedures, the median procedure duration was 94 (IQR 50-167) minutes. In estimating the procedure duration, the gradient boosting machine was the best-performing model, demonstrating an MAE of 34 (SD 47) minutes, with 46% of the predictions falling within 20% of the actual duration in the test data set. This represented a statistically and clinically significant improvement in predictions compared with a baseline linear regression model (MAE 43 min; P<.001; 39% of the predictions falling within 20% of the actual duration). The most important features in model training were historical procedure duration by surgeon, the word "free" within the procedure text, and the time of day.
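
Permutation feature importance, one of the two attribution methods named in the Methods, can be illustrated with a small self-contained sketch. Everything here is an assumption made for illustration: the stand-in `predict` function, the two toy features, and the durations are hypothetical and do not come from the study, whose models and implementation are not described at this level of detail in the abstract.

```python
import random
from statistics import mean


def mae(actual, predicted):
    """Mean absolute error in minutes."""
    return mean(abs(p - a) for a, p in zip(actual, predicted))


def permutation_importance(predict, rows, actual, n_features, n_repeats=5, seed=0):
    """Average increase in MAE when one feature column is shuffled across cases."""
    rng = random.Random(seed)
    baseline = mae(actual, [predict(r) for r in rows])
    importances = []
    for j in range(n_features):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the outcome
            permuted = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mae(actual, [predict(r) for r in permuted]) - baseline)
        importances.append(mean(increases))
    return importances


# Hypothetical cases: (surgeon's historical mean duration in minutes, hour of day).
rows = [(60, 8), (90, 9), (120, 13), (45, 7), (200, 15), (75, 10)]
actual = [72, 98, 131, 50, 215, 85]


def predict(row):
    """Stand-in model that uses only the historical mean and ignores the hour."""
    return row[0] + 10


importances = permutation_importance(predict, rows, actual, n_features=2)
print(importances)
```

In this toy setup the hour of day receives an importance of zero because the stand-in model never reads it, while shuffling the historical mean degrades the MAE. A real model would typically assign nonzero importance to many features.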

Conclusions: Nonlinear models using machine learning techniques may be used to generate high-performing, automatable, explainable, and scalable prediction models for procedure duration.

Keywords: AI; OR management; algorithm development; artificial intelligence; machine learning; medical informatics; operating room; patient communication; perioperative; prediction model; surgical procedure; validation.

©Samir Kendale, Andrew Bishara, Michael Burns, Stuart Solomon, Matthew Corriere, Michael Mathis. Originally published in JMIR AI (https://ai.jmir.org), 08.09.2023.

Conflict of interest statement

Conflicts of Interest: AB is a co-founder of Bezel Health, a company building software to measure and improve healthcare quality interventions. SS is a co-founder of Orchestra Health Inc, a digital health startup company improving care transitions. These interests are unrelated to the work in this study.

Figures

**Figure 1**
Study inclusion and exclusion criteria and machine learning model training and validation and testing schematic.

**Figure 2**
Patient-in-room duration plotted against prediction error. (A) Time of Patient in OR [Operating Room] model (primary model). (B) Time of Scheduling model (secondary model). (C) Time of Surgical Incision model (secondary model).

**Figure 3**
Shapley additive explanations (SHAP) global summary dot plots. (A) Time of Patient in OR [Operating Room] model (primary model). (B) Time of Scheduling model (secondary model). (C) Time of Surgical Incision model (secondary model). The feature ranking (y-axis) implies the order of importance of the feature. The SHAP value (x-axis) is a unified index reflecting the impact of a feature on the model output. In each feature importance row, the attributions of all cases to the outcome were plotted using different colored dots, of which the redder dots represent a higher (or positive, if binary) value, and the bluer dots represent a lower (or negative, if binary) value, along a gradient from red to blue. ASA: American Society of Anesthesiologists; CPT: current procedural terminology; INR: international normalized ratio.

**Figure 4**
Sample output, including Shapley additive explanations (SHAP) local plot. A positive SHAP value contribution indicates that a feature increased the prediction above the average value, whereas a negative SHAP value contribution indicates that a feature decreased the prediction below the average value.
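
The sign convention in the Figure 4 caption follows from the additive property of SHAP values: the average (base) prediction plus the per-feature contributions equals the prediction for the case. The sketch below shows this for a linear model with features treated as independent, where each contribution is the weight times the feature's deviation from its background mean. The model, weights, feature names, and values are hypothetical and chosen for illustration; the study's best-performing model was a gradient boosting machine, not a linear model.

```python
# Hypothetical linear duration model (minutes); all numbers are illustrative.
weights = {"surgeon_historical_mean_min": 0.5, "asa_class": 5.0}
intercept = 10.0

# Assumed average feature values over a background data set.
background_means = {"surgeon_historical_mean_min": 100.0, "asa_class": 2.0}


def predict(x):
    """Linear prediction: intercept plus weighted sum of features."""
    return intercept + sum(weights[k] * x[k] for k in weights)


def linear_shap(x):
    """SHAP values for a linear model with independent features."""
    return {k: weights[k] * (x[k] - background_means[k]) for k in weights}


base_value = predict(background_means)  # average prediction over the background
case = {"surgeon_historical_mean_min": 150.0, "asa_class": 1.0}
phi = linear_shap(case)
prediction = predict(case)

print(base_value, phi, prediction)
```

Here the longer-than-average historical duration contributes positively and the lower-than-average ASA class contributes negatively, and the base value plus both contributions reproduces the case prediction.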

Grants and funding

T32 GM008440/GM/NIGMS NIH HHS/United States
