Evaluating the Quality and Usability of Artificial Intelligence-Generated Responses to Common Patient Questions in Foot and Ankle Surgery
- PMID: 38027458
- PMCID: PMC10666700
- DOI: 10.1177/24730114231209919
Abstract
Background: Artificial intelligence (AI) platforms, such as ChatGPT, have become increasingly popular outlets for the consumption and distribution of health care-related advice. Because of a lack of regulation and oversight, the reliability of health care-related responses has become a topic of controversy in the medical community. To date, no study has explored the quality of AI-derived information as it relates to common foot and ankle pathologies. This study aims to assess the quality and educational benefit of ChatGPT responses to common foot and ankle-related questions.
Methods: ChatGPT was asked 5 questions: "What is the optimal treatment for ankle arthritis?"; "How should I decide on ankle arthroplasty versus ankle arthrodesis?"; "Do I need surgery for Jones fracture?"; "How can I prevent Charcot arthropathy?"; and "Do I need to see a doctor for my ankle sprain?" Five responses (1 per question) were included after applying the exclusion criteria. The content was graded using DISCERN (a well-validated informational analysis tool) and AIRM (a self-designed tool for exercise evaluation).
Results: Health care professionals graded the ChatGPT-generated responses as bottom tier 4.5% of the time, middle tier 27.3% of the time, and top tier 68.2% of the time.
Conclusion: Although ChatGPT and other related AI platforms have become a popular means for medical information distribution, the educational value of the AI-generated responses related to foot and ankle pathologies was variable. With 4.5% of responses receiving a bottom-tier rating, 27.3% of responses receiving a middle-tier rating, and 68.2% of responses receiving a top-tier rating, health care professionals should be aware of the high viewership of variable-quality content easily accessible on ChatGPT.
Level of evidence: Level III, cross-sectional study.
Keywords: Charcot arthropathy; ChatGPT; Jones fracture; ankle arthritis; ankle arthroplasty; ankle sprain; artificial intelligence; education.
© The Author(s) 2023.
Conflict of interest statement
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article. ICMJE forms for all authors are available online.