Urol Pract. 2024 Sep;11(5):793-798.
doi: 10.1097/UPJ.0000000000000637. Epub 2024 Jun 24.

Assessing Artificial Intelligence-Generated Responses to Urology Patient In-Basket Messages

Michael Scott et al. Urol Pract. 2024 Sep.

Abstract

Introduction: Use of electronic patient messaging has increased in recent years and has been associated with physician burnout. ChatGPT is a language model that has shown the ability to generate near-human-level text responses. This study evaluated the quality of ChatGPT responses to real-world urology patient messages.

Methods: One hundred electronic patient messages were collected from a practicing urologist's inbox and categorized based on the question content. Individual responses were generated by entering each message into ChatGPT. The questions and responses were independently evaluated by 5 urologists and graded on a 5-point Likert scale. Questions were graded on difficulty, and responses were graded on accuracy, completeness, harmfulness, helpfulness, and intelligibility. Whether the response could be sent to a patient was also assessed.

Results: Overall, 47% of responses were deemed acceptable to send to patients. ChatGPT performed better on easy questions: 56% of responses to easy questions were acceptable to send, compared with 34% of responses to difficult questions (P = .03). Responses to easy questions were more accurate, complete, helpful, and intelligible than responses to difficult questions. There was no difference in response quality based on question content.

Conclusions: ChatGPT generated acceptable responses to nearly half (47%) of patient messages, with better performance on easy questions than on difficult questions. Using ChatGPT to help respond to patient messages can decrease the time burden on the care team and improve wellness. Artificial intelligence performance will likely continue to improve with advances in generative artificial intelligence technology.

Keywords: artificial intelligence; electronic medical record; quality improvement; urology.
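
Illustrative note on the easy-versus-difficult comparison in the Results: the abstract reports the two acceptance rates and a P value but not the group sizes or the statistical test used. The sketch below shows one common way such a comparison could be made, a Pearson chi-square test on a 2x2 table without continuity correction. The counts (59 easy and 41 difficult questions, with 33 and 14 acceptable responses) are hypothetical, chosen only because they are consistent with the reported 56%, 34%, and 47% overall; the function name `chi_square_2x2` is likewise an assumption for illustration.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table
    [[a, b], [c, d]]. Returns (statistic, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be nonzero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# HYPOTHETICAL counts, not taken from the paper:
# rows = easy / difficult questions, columns = acceptable / not acceptable.
easy_ok, easy_not = 33, 26        # 33 / 59 is about 56%
hard_ok, hard_not = 14, 27        # 14 / 41 is about 34%

stat, p = chi_square_2x2(easy_ok, easy_not, hard_ok, hard_not)
print(f"chi-square = {stat:.2f}, p = {p:.3f}")
```

With these illustrative counts the P value falls near the reported .03, which does not confirm the authors' actual group sizes or choice of test.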
