2025 Aug 1;4(1):e001632.
doi: 10.1136/bmjmed-2025-001632. eCollection 2025.

Reporting guideline for chatbot health advice studies: the Chatbot Assessment Reporting Tool (CHART) statement

CHART Collaborative. BMJ Med.

Abstract

The Chatbot Assessment Reporting Tool (CHART) is a reporting guideline developed to provide reporting recommendations for studies evaluating the performance of generative artificial intelligence (AI)-driven chatbots when summarising clinical evidence and providing health advice, referred to as chatbot health advice studies. CHART was developed in several phases after performing a comprehensive systematic review to identify variation in the conduct, reporting, and method in chatbot health advice studies. Findings from the review were used to develop a draft checklist that was revised through an international, multidisciplinary, modified, asynchronous Delphi consensus process of 531 stakeholders, three synchronous panel consensus meetings of 48 stakeholders, and subsequent pilot testing of the checklist. CHART includes 12 items and 39 subitems to promote transparent and comprehensive reporting of chatbot health advice studies. These include title (subitem 1a), abstract/summary (subitem 1b), background (subitems 2a,b), model identifiers (subitems 3a,b), model details (subitems 4a-c), prompt engineering (subitems 5a,b), query strategy (subitems 6a-d), performance evaluation (subitems 7a,b), sample size (subitem 8), data analysis (subitem 9a), results (subitems 10a-c), discussion (subitems 11a-c), disclosures (subitem 12a), funding (subitem 12b), ethics (subitem 12c), protocol (subitem 12d), and data availability (subitem 12e). The CHART checklist and corresponding diagram of the method were designed to support key stakeholders including clinicians, researchers, editors, peer reviewers, and readers in reporting, understanding, and interpreting the findings of chatbot health advice studies.

Keywords: Ethics, medical; Guideline adherence; Research design.

Conflict of interest statement

All authors have completed the ICMJE uniform disclosure form at www.icmje.org/disclosure-of-interest/ and declare: support from McMaster University for the submitted work. GSC is a National Institute for Health and Care Research (NIHR) Senior Investigator. The views expressed in this article are those of the author(s) and not necessarily those of the NIHR, or the Department of Health and Social Care. AJT has received funding from HealthSense to investigate evidence-based medicine applications of large language models. PM is the co-founder of BrainX. AS has received research funding from the Australian government and is co-founder of BantingMed Pty. DS is the Acting Deputy Editor for the Lancet Digital Health. MM has received research funding from The Hospital Research Founding Group. TF sits on the executive committee of MDEpiNet. HF is a Senior Executive Editor for The Lancet. CL is editor-in-chief of the Annals of Internal Medicine. AF is executive managing editor and vice president, editorial operations, at JAMA and the JAMA Network. TF and EL are journal editors for The BMJ. RA is the editor-in-chief of the International Journal of Surgery. GS is an executive editor of Artificial Intelligence in Medicine. SL is a paid consultant for Astellas. DP has received research funding from the Italian Ministry of University and Research. MO is a paid consultant for Theator. TA, POV, and GG are board members of the MAGIC Evidence Ecosystem Foundation (www.magicproject.org), a not-for-profit organisation that conducts research and evidence appraisal and guideline methodology and implementation, and provides authoring and publication software (MAGICapp) for evidence summaries, guidelines, and decision aids.

Figures

Figure 1. The CHART methodological diagram. AI=artificial intelligence.
