Use of a Medical Communication Framework to Assess the Quality of Generative Artificial Intelligence Replies to Primary Care Patient Portal Messages: Content Analysis
- PMID: 40743559
- PMCID: PMC12313158
- DOI: 10.2196/71966
Use of a Medical Communication Framework to Assess the Quality of Generative Artificial Intelligence Replies to Primary Care Patient Portal Messages: Content Analysis
Abstract
Background: There is growing interest in applying generative artificial intelligence (GenAI) to respond to electronic patient portal messages, particularly in primary care where message volumes are highest. However, evaluations of GenAI as an inbox communication tool are limited. Qualitative analysis of when and how often GenAI responses achieve communication goals can inform estimates of impact and guide continuous improvement.
Objective: This study aims to evaluate GenAI responses to primary care messages using a medical communication framework.
Methods: This was a descriptive quality improvement study of 201 GenAI replies to a purposively sampled, diverse pool of real primary care patient messages in a large midwestern academic medical center. Two physician reviewers (NSL and NR) used a hybrid deductive-inductive approach to qualitatively identify and define themes, guided by constructs from the "best practice" medical communication framework. After achieving thematic saturation, the reviewers assessed the presence or absence of identified communication themes, both independently and collaboratively. Discrepant observations were reconciled via discussion. Frequencies of identified themes were tallied.
Results: Themes in strengths and limitations emerged across 5 communication domains. In the domain of rapport building, expressing respect and restating key phrases were strengths, while inappropriate or inadequate rapport building statements were limitations. For information gathering, questions that built toward a plan or elicited patient needs were strengths, while questions that were out of place or redundant were limitations. For information delivery, accurate content delivered clearly and professionally was a strength, but delivery of inaccurate content was an observed limitation. GenAI responses could facilitate next steps by outlining choices or providing instruction, but sometimes those next steps were inappropriate or premature. Finally, in responding to emotion, strengths were that emotions were named and validated, while inadequate or absent acknowledgment of emotion was a limitation. Overall, 26.4% (53/201) of all messages displayed communication strengths without limitations, 27.4% (55/201) had limitations without strengths, and the remaining 46.3% (93/201) had both. Strengths outnumbered limitations in rapport building (87/201, 43.3% vs 35/201, 17.4%) and facilitating next steps (73/201, 36.3% vs 39/201, 19.4%). Limitations outnumbered strengths in the remaining domains of information delivery (89/201, 44.3% vs 43/201, 21.4%), information gathering (60/201, 29.9% vs 43/201, 21.4%), and responding to emotion (7/201, 8.5% vs 9/201, 4.5%).
Conclusions: GenAI response quality on behalf of primary care physicians and advanced practice providers may vary by communication function. Expressions of respect or descriptions of common next steps may be appropriate, but gathering and delivering appropriate information, or responding to emotion, may be limited. While communication standards were often met, they were also often compromised. Understanding these strengths and limitations can inform decisions about whether, when, and how to apply GenAI as a tool for primary care inbox communication.
Keywords: artificial intelligence; communication; electronic health record; health communication; patient portal; primary care.
© Natalie S Lee, Nathan Richards, Jodi Grandominico, Robert M Cronin, Amanda K Hendricks, Ravi S Tripathi, Daniel E Jonas. Originally published in JMIR Formative Research (https://formative.jmir.org).
Conflict of interest statement
Figures
References
-
- Adler-Milstein J, Zhao W, Willard-Grace R, Knox M, Grumbach K. Electronic health records and burnout: time spent on the electronic health record after hours and message volume associated with exhaustion but not with cynicism among primary care clinicians. J Am Med Inform Assoc. 2020 Apr 1;27(4):531–538. doi: 10.1093/jamia/ocz220. doi. - DOI - PMC - PubMed
MeSH terms
LinkOut - more resources
Full Text Sources
Miscellaneous
