Vision-Enabled AI scribes reduce omissions in clinical conversations: evidence from simulated medication histories
- PMID: 41748705
- DOI: 10.1038/s41746-026-02494-9
Abstract
Most ambient AI medical scribes process audio only, omitting clinically important visual details. We developed a vision-enabled AI scribe using Google's Gemini model and Ray-Ban Meta smart glasses to document medication histories, a task requiring both audio and visual input. Ten clinical pharmacists video-recorded 110 simulated medication history interviews. Following iterative prompt engineering on 10 training recordings, the scribe was evaluated on 100 test recordings (2,160 data points) across patient details and medication-specific fields. The vision-enabled scribe achieved 98% overall accuracy (2,114/2,160 data points), ranging from 96% for patient details to 99% for dosing directions and indication. Video input significantly outperformed audio-only processing (98% vs 81%, P < 0.001), primarily through reduced omissions (10 vs 358 errors). Vision-enabled AI scribes substantially improved documentation accuracy for tasks requiring visual input, demonstrating potential to markedly reduce omission errors in clinical documentation.
© 2026. The Author(s).
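The headline figures in the abstract can be checked with simple arithmetic. The Python sketch below is illustrative only and is not the authors' analysis code; all variable and function names are hypothetical. The counts come from the abstract, except the audio-only error total, which the abstract does not report and which is back-calculated here from the rounded 81% figure, so it is approximate.

```python
# Counts taken from the abstract.
TOTAL_POINTS = 2160        # data points across the 100 test recordings
VISION_CORRECT = 2114      # data points the vision-enabled scribe got right
VISION_OMISSIONS = 10      # omission errors with video input
AUDIO_OMISSIONS = 358      # omission errors with audio-only input
AUDIO_ACCURACY_ROUNDED = 0.81  # reported only as a rounded percentage


def accuracy(correct: int, total: int) -> float:
    """Return the proportion of correct data points."""
    return correct / total


vision_accuracy = accuracy(VISION_CORRECT, TOTAL_POINTS)
vision_errors = TOTAL_POINTS - VISION_CORRECT
vision_non_omission_errors = vision_errors - VISION_OMISSIONS

# ASSUMPTION: the exact audio-only error count is not given in the abstract,
# so this value is only implied by the rounded 81% accuracy.
audio_errors_implied = round((1 - AUDIO_ACCURACY_ROUNDED) * TOTAL_POINTS)

print(f"Vision accuracy: {vision_accuracy:.1%} ({vision_errors} errors)")
print(f"Vision errors that were not omissions: {vision_non_omission_errors}")
print(f"Implied audio-only errors (approximate): {audio_errors_implied}")
print(f"Audio-only omissions: {AUDIO_OMISSIONS}")
```

Under these assumptions, omissions make up a minority of the vision-enabled scribe's errors but the large majority of the implied audio-only errors, which is consistent with the abstract's statement that the accuracy gain came primarily from reduced omissions.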
Conflict of interest statement
Competing interests: A.M.H. is a recipient of investigator-initiated funding for research outside the scope of the current study from Boehringer Ingelheim. A.R. and M.J.S. are recipients of investigator-initiated funding for research outside the scope of the current study from AstraZeneca, Boehringer Ingelheim, Pfizer and Takeda. A.R. is a recipient of speaker fees from Boehringer Ingelheim and Genentech. The author team have no other potential conflicts of interest with respect to this research and/or publication to declare.
