Patient- and clinician-based evaluation of large language models for patient education in prostate cancer radiotherapy
- PMID: 39792259
- PMCID: PMC11839798
- DOI: 10.1007/s00066-024-02342-3
Abstract
Background: This study aims to evaluate the capabilities and limitations of large language models (LLMs) in providing patient education to men undergoing radiotherapy for localized prostate cancer, incorporating assessments from both clinicians and patients.
Methods: Six questions about definitive radiotherapy for prostate cancer were designed based on common patient inquiries. These questions were presented to different LLMs [ChatGPT‑4, ChatGPT-4o (both OpenAI Inc., San Francisco, CA, USA), Gemini (Google LLC, Mountain View, CA, USA), Copilot (Microsoft Corp., Redmond, WA, USA), and Claude (Anthropic PBC, San Francisco, CA, USA)] via the respective web interfaces. Responses were evaluated for readability using the Flesch Reading Ease Index. Five radiation oncologists assessed the responses for relevance, correctness, and completeness using a five-point Likert scale. Additionally, 35 prostate cancer patients evaluated the responses from ChatGPT‑4 for comprehensibility, accuracy, relevance, trustworthiness, and overall informativeness.
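Readability in the study was scored with the Flesch Reading Ease Index. As a minimal illustration (not the authors' analysis code), the sketch below computes the score from its standard formula, FRE = 206.835 − 1.015 × (words/sentences) − 84.6 × (syllables/words), using a naive regex-based syllable counter that only approximates English syllabification; the example answer text is invented for demonstration.

```python
# Illustrative sketch: Flesch Reading Ease from the standard formula.
# The syllable counter is a rough heuristic, not a linguistic analysis.
import re

def count_syllables(word: str) -> int:
    """Rough English syllable estimate: count vowel groups, ignore a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_reading_ease(text: str) -> float:
    """FRE = 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("Text must contain at least one sentence and one word.")
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / len(sentences)) - 84.6 * (syllables / len(words))

if __name__ == "__main__":
    # Hypothetical LLM answer text, used only to demonstrate the calculation.
    answer = ("External beam radiotherapy delivers focused radiation to the prostate "
              "over several weeks and is a standard curative option for localized disease.")
    print(f"Flesch Reading Ease: {flesch_reading_ease(answer):.1f}")
```

On the conventional Flesch scale, scores below roughly 50 are considered difficult to read, which is consistent with the abstract's finding that all LLM responses were relatively hard to understand.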
Results: The Flesch Reading Ease Index indicated that the responses from all LLMs were relatively difficult to understand. All LLMs provided answers that clinicians found to be generally relevant and correct. The answers from ChatGPT‑4, ChatGPT-4o, and Claude were also rated as complete. However, we found significant differences between the LLMs regarding relevance and completeness, and some answers lacked detail or contained inaccuracies. Patients perceived the information as easy to understand and relevant, with most expressing confidence in the information and a willingness to use ChatGPT‑4 for future medical questions. ChatGPT‑4's responses helped patients feel better informed, despite the standardized nature of the information provided.
Conclusion: Overall, LLMs show promise as a tool for patient education in prostate cancer radiotherapy. While improvements are needed in terms of accuracy and readability, positive feedback from clinicians and patients suggests that LLMs can enhance patient understanding and engagement. Further research is essential to fully realize the potential of artificial intelligence in patient education.
Keywords: AI; Artificial intelligence; ChatGPT; Patient information; Radiation oncology.
© 2025. The Author(s).
Conflict of interest statement
Declarations. Conflict of interest: C. Trapp, N. Schmidt-Hegemann, M. Keilholz, S.F. Brose, S.N. Marschner, S. Schönecker, S.H. Maier, D.-C. Dehelean, M. Rottler, D. Konnerth, C. Belka, S. Corradini, and P. Rogowski declare that they have no competing interests. Ethical standards: All procedures performed in studies involving human participants or on human tissue were in accordance with the ethical standards of the institutional and/or national research committee and with the 1975 Helsinki declaration and its later amendments or comparable ethical standards. Informed consent was obtained from all individual participants included in the study.
Similar articles
- Accuracy and Readability of Artificial Intelligence Chatbot Responses to Vasectomy-Related Questions: Public Beware. Cureus. 2024 Aug 28;16(8):e67996. doi: 10.7759/cureus.67996. eCollection 2024 Aug. PMID: 39347335. Free PMC article.
- Assessing the Responses of Large Language Models (ChatGPT-4, Gemini, and Microsoft Copilot) to Frequently Asked Questions in Breast Imaging: A Study on Readability and Accuracy. Cureus. 2024 May 9;16(5):e59960. doi: 10.7759/cureus.59960. eCollection 2024 May. PMID: 38726360. Free PMC article.
- Evaluating text and visual diagnostic capabilities of large language models on questions related to the Breast Imaging Reporting and Data System Atlas 5th edition. Diagn Interv Radiol. 2025 Mar 3;31(2):111-129. doi: 10.4274/dir.2024.242876. Epub 2024 Sep 9. PMID: 39248152. Free PMC article.
- Exploring the role of artificial intelligence, large language models: Comparing patient-focused information and clinical decision support capabilities to the gynecologic oncology guidelines. Int J Gynaecol Obstet. 2025 Feb;168(2):419-427. doi: 10.1002/ijgo.15869. Epub 2024 Aug 20. PMID: 39161265. Free PMC article. Review.
- What is the role of large language models in the management of urolithiasis?: a review. Urolithiasis. 2025 May 15;53(1):92. doi: 10.1007/s00240-025-01761-w. PMID: 40372452. Review.
Cited by
- Large language model integrations in cancer decision-making: a systematic review and meta-analysis. NPJ Digit Med. 2025 Jul 17;8(1):450. doi: 10.1038/s41746-025-01824-7. PMID: 40676129. Free PMC article.
- How Accurate Is AI? A Critical Evaluation of Commonly Used Large Language Models in Responding to Patient Concerns About Incidental Kidney Tumors. J Clin Med. 2025 Aug 12;14(16):5697. doi: 10.3390/jcm14165697. PMID: 40869522. Free PMC article.
- How Well Do Different AI Language Models Inform Patients About Radiofrequency Ablation for Varicose Veins? Cureus. 2025 Jun 22;17(6):e86537. doi: 10.7759/cureus.86537. eCollection 2025 Jun. PMID: 40698235. Free PMC article.
- Assessing ChatGPT's Educational Potential in Lung Cancer Radiotherapy From Clinician and Patient Perspectives: Content Quality and Readability Analysis. JMIR Cancer. 2025 Aug 13;11:e69783. doi: 10.2196/69783. PMID: 40802978. Free PMC article.