Comparing the quality of ChatGPT- and physician-generated responses to patients' dermatology questions in the electronic medical record
- PMID: 38180108
- DOI: 10.1093/ced/llad456
Abstract
Background: ChatGPT is a free artificial intelligence (AI)-based natural language processing tool that generates complex responses to inputs from users.
Objectives: To determine whether ChatGPT is able to generate high-quality responses to patient-submitted questions in the patient portal.
Methods: Patient-submitted questions and the corresponding responses from their dermatology physician were extracted from the electronic medical record for analysis. The questions were input into ChatGPT (version 3.5) and the outputs extracted for analysis, with manual removal of verbiage pertaining to ChatGPT's inability to provide medical advice. Ten blinded reviewers (seven physicians and three nonphysicians) rated and selected their preference in terms of 'overall quality', 'readability', 'accuracy', 'thoroughness' and 'level of empathy' of the physician- and ChatGPT-generated responses.
Results: Thirty-one messages and responses were analysed. Physician-generated responses were vastly preferred over the ChatGPT-generated responses by both the physician and nonphysician reviewers, and received significantly higher ratings for 'readability' and 'level of empathy'.
Conclusions: The results of this study suggest that physician-generated responses to patients' portal messages are still preferred over those generated by ChatGPT, but generative AI tools may be helpful in generating first drafts of responses and in providing information on educational resources for patients.
© The Author(s) 2024. Published by Oxford University Press on behalf of British Association of Dermatologists. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Conflict of interest statement
None of the authors have any relevant conflicts to disclose.