Comprehensiveness, Accuracy, and Readability of Exercise Recommendations Provided by an AI-Based Chatbot: Mixed Methods Study
- PMID: 38206661
- PMCID: PMC10811574
- DOI: 10.2196/51308
Abstract
Background: Regular physical activity is critical for health and disease prevention. Yet, health care providers and patients face barriers to implementing evidence-based lifestyle recommendations. The potential to augment care with the increased availability of artificial intelligence (AI) technologies is limitless; however, the suitability of AI-generated exercise recommendations has yet to be explored.
Objective: The purpose of this study was to assess the comprehensiveness, accuracy, and readability of individualized exercise recommendations generated by a novel AI chatbot.
Methods: A coding scheme was developed to score AI-generated exercise recommendations across ten categories informed by gold-standard exercise recommendations, including (1) health condition-specific benefits of exercise, (2) exercise preparticipation health screening, (3) frequency, (4) intensity, (5) time, (6) type, (7) volume, (8) progression, (9) special considerations, and (10) references to the primary literature. The AI chatbot was prompted to provide individualized exercise recommendations for 26 clinical populations using an open-source application programming interface. Two independent reviewers coded AI-generated content for each category and calculated comprehensiveness (%) and factual accuracy (%) on a scale of 0%-100%. Readability was assessed using the Flesch-Kincaid formula. Qualitative analysis identified and categorized themes from AI-generated output.
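The Methods name the Flesch-Kincaid formula but do not spell it out. The sketch below shows the two standard published formulas (Flesch reading ease and Flesch-Kincaid grade level) computed from word, sentence, and syllable counts. The function names and the example counts are illustrative assumptions, not taken from the study, and the study's own tooling for counting syllables is not described here.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch reading ease: higher scores indicate easier text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level: approximate US school grade."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical passage (assumed counts): 100 words, 5 sentences, 150 syllables.
ease = flesch_reading_ease(words=100, sentences=5, syllables=150)
grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
print(round(ease, 1), round(grade, 1))  # 59.6 9.9
```

Taking counts as inputs keeps the sketch independent of any particular syllable-counting heuristic, which is where readability tools tend to differ.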
Results: AI-generated exercise recommendations were 41.2% (107/260) comprehensive and 90.7% (146/161) accurate, with most inaccuracies (8/15, 53%) related to the need for exercise preparticipation medical clearance. The average readability of AI-generated exercise recommendations was at the college level (mean 13.7, SD 1.7), with an average Flesch reading ease score of 31.1 (SD 7.7). Recurring themes and observations in AI-generated output included concern for liability and safety, preference for aerobic exercise, and potential bias and direct discrimination against certain age-based populations and individuals with disabilities.
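The reported percentages can be reproduced from the counts given in the Results. The short check below is only an arithmetic sketch; the helper name `pct` is an assumption, and the observation that 260 equals 26 populations times 10 categories is an inference from the Methods rather than something the abstract states outright.

```python
def pct(numerator: int, denominator: int, digits: int = 1) -> float:
    """Return numerator/denominator as a percentage rounded to `digits` places."""
    return round(100 * numerator / denominator, digits)


# Counts as reported in the Results.
comprehensiveness = pct(107, 260)   # 41.2
accuracy = pct(146, 161)            # 90.7
clearance_share = pct(8, 15, 0)     # 53.0

# Inferred from the Methods: 26 clinical populations x 10 coding categories.
scored_items = 26 * 10              # 260

print(comprehensiveness, accuracy, clearance_share, scored_items)
```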
Conclusions: There were notable gaps in the comprehensiveness, accuracy, and readability of AI-generated exercise recommendations. Exercise and health care professionals should be aware of these limitations when using and endorsing AI-based technologies as a tool to support lifestyle change involving exercise.
Keywords: AI; artificial intelligence; chatbot; exercise prescription; health literacy; large language model; patient education.
©Amanda L Zaleski, Rachel Berkowsky, Kelly Jean Thomas Craig, Linda S Pescatello. Originally published in JMIR Medical Education (https://mededu.jmir.org), 11.01.2024.
Conflict of interest statement
Conflicts of Interest: ALZ and KJTC are both employed and hold stock with CVS Health Corporation. This study is an objective evaluation to better understand ChatGPT and its outputs. To the best of our knowledge, CVS Health does not currently use or endorse the use of ChatGPT for lifestyle recommendations. LSP is the sole proprietor and founder of P3-EX, LLC, which could potentially benefit from the tool used in this research. The results of this study do not constitute endorsement by the American College of Sports Medicine.