Randomized Controlled Trial
JAMA Netw Open. 2022 Feb 1;5(2):e2149008.
doi: 10.1001/jamanetworkopen.2021.49008.

Effect of Artificial Intelligence Tutoring vs Expert Instruction on Learning Simulated Surgical Skills Among Medical Students: A Randomized Clinical Trial

Ali M Fazlollahi et al. JAMA Netw Open. 2022.

Abstract

Importance: To better understand the emerging role of artificial intelligence (AI) in surgical training, efficacy of AI tutoring systems, such as the Virtual Operative Assistant (VOA), must be tested and compared with conventional approaches.

Objective: To determine how VOA and remote expert instruction compare in learners' skill acquisition, affective, and cognitive outcomes during surgical simulation training.

Design, setting, and participants: This instructor-blinded randomized clinical trial included medical students (undergraduate years 0-2) from 4 institutions in Canada during a single simulation training at McGill Neurosurgical Simulation and Artificial Intelligence Learning Centre, Montreal, Canada. Cross-sectional data were collected from January to April 2021. Analysis was conducted based on intention-to-treat. Data were analyzed from April to June 2021.

Interventions: The interventions included 5 feedback sessions, 5 minutes each, during a single 75-minute training, including 5 practice sessions followed by 1 realistic virtual reality brain tumor resection. The 3 intervention arms included 2 treatment groups, AI audiovisual metric-based feedback (VOA group) and synchronous verbal scripted debriefing and instruction from a remote expert (instructor group), and a control group that received no feedback.

Main outcomes and measures: The coprimary outcomes were change in procedural performance, quantified as Expertise Score by a validated assessment algorithm (Intelligent Continuous Expertise Monitoring System [ICEMS]; range, -1.00 to 1.00) for each practice resection, and learning and retention, measured from performance in realistic resections by ICEMS and blinded Objective Structured Assessment of Technical Skills (OSATS; range, 1-7). Secondary outcomes included strength of emotions before, during, and after the intervention and cognitive load after intervention, measured in self-reports.

Results: A total of 70 medical students (41 [59%] women and 29 [41%] men; mean [SD] age, 21.8 [2.3] years) from 4 institutions were randomized, including 23 students in the VOA group, 24 students in the instructor group, and 23 students in the control group. All participants were included in the final analysis. ICEMS assessed 350 practice resections, and ICEMS and OSATS evaluated 70 realistic resections. VOA significantly improved practice Expertise Scores by 0.66 (95% CI, 0.55 to 0.77) points compared with the instructor group and by 0.65 (95% CI, 0.54 to 0.77) points compared with the control group (P < .001). Realistic Expertise Scores were significantly higher for the VOA group compared with instructor (mean difference, 0.53 [95% CI, 0.40 to 0.67] points; P < .001) and control (mean difference, 0.49 [95% CI, 0.34 to 0.61] points; P < .001) groups. Mean global OSATS ratings did not differ significantly among the VOA (4.63 [95% CI, 4.06 to 5.20] points), instructor (4.40 [95% CI, 3.88 to 4.91] points), and control (3.86 [95% CI, 3.44 to 4.27] points) groups. However, on the OSATS subscores, VOA significantly enhanced the mean OSATS overall subscore compared with the control group (mean difference, 1.04 [95% CI, 0.13 to 1.96] points; P = .02), whereas expert instruction significantly improved OSATS subscores for instrument handling vs control (mean difference, 1.18 [95% CI, 0.22 to 2.14]; P = .01). No significant differences in cognitive load, positive activating emotions, or negative emotions were found.

Conclusions and relevance: In this randomized clinical trial, VOA feedback demonstrated superior performance outcome and skill transfer, with equivalent OSATS ratings and cognitive and emotional responses compared with remote expert instruction, indicating advantages for its use in simulation training.

Trial registration: ClinicalTrials.gov Identifier: NCT04700384.

Conflict of interest statement

Conflict of Interest Disclosures: Mr Mirchi, Dr Yilmaz, Dr Winkler-Schwartz, Ms Ledwos, and Dr Del Maestro have a US patent for “A Framework For Transparent Artificial Intelligence In Simulation: The Virtual Operative Assistant” application No. PCT/CA2020/050353, international patent No. WO 2020/186348. Dr Mirchi reported receiving grants from Di Giovanni Foundation outside the submitted work. No other disclosures were reported.

Figures

Figure 1. Participant Recruitment Flowchart
Figure 2. Performance Assessment in the Practice Tumor Resections
A, Negative scores indicate a novice performance; and positive scores, a more expert performance. Scores in each trial are the mean of all estimations made for every 200 milliseconds of the simulated procedure (approximately 1500 predictions for a 5-minute practice scenario). B, Maximum bipolar force application is a recording of the highest amount of force applied with the bipolar during the entire operation. C, Mean instrument tip separation distance measured as the mean distance between the aspirator and the bipolar tips. D, Mean bipolar acceleration measured as the rate of change in the bipolar instrument’s velocity. Error bars indicate 95% CIs; and VOA, Virtual Operative Assistant.
Figure 3. Performance Assessment in the Realistic Tumor Resection
Error bars indicate 95% CIs; OSATS, Objective Structured Assessment of Technical Skills; and VOA, Virtual Operative Assistant.
Figure 4. Emotions and Cognitive Load Throughout the Simulation Training
Positive activating emotions include happy, hopeful, grateful (A), and negative activating emotions include confusion and anxiety (B). Error bars indicate 95% CIs; and VOA, Virtual Operative Assistant.
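
The Figure 2 caption describes each practice Expertise Score as the mean of predictions made every 200 milliseconds, about 1500 for a 5-minute scenario. The following is a minimal Python sketch of that aggregation only. It is not the ICEMS algorithm: the function names are hypothetical, and the per-window scores are assumed to already exist as numbers in the stated range of -1.00 to 1.00.

```python
from statistics import mean

WINDOW_MS = 200  # prediction interval stated in the Figure 2 caption


def predictions_per_scenario(duration_min, window_ms=WINDOW_MS):
    """Count the prediction windows that fit in a scenario of the given length."""
    return int(duration_min * 60 * 1000 // window_ms)


def trial_expertise_score(window_scores):
    """Average per-window scores into one score for a trial.

    Assumes each window score lies in [-1.00, 1.00]; negative values
    correspond to novice-like and positive values to expert-like performance.
    """
    scores = list(window_scores)
    if not scores:
        raise ValueError("at least one prediction is required")
    if any(s < -1.0 or s > 1.0 for s in scores):
        raise ValueError("scores must lie in [-1.00, 1.00]")
    return mean(scores)
```

For a 5-minute practice scenario, `predictions_per_scenario(5)` gives 1500, which matches the approximate count in the caption.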
