Review. BMC Med. 2025 Aug 1;23(1):447.
doi: 10.1186/s12916-025-04274-w.

Reporting guideline for Chatbot Health Advice studies: the CHART statement

Bright Huo et al. BMC Med.

Abstract

Background: The Chatbot Assessment Reporting Tool (CHART) is a reporting guideline developed to provide reporting recommendations for studies evaluating the performance of generative artificial intelligence (AI)-driven chatbots when summarizing clinical evidence and providing health advice, referred to as Chatbot Health Advice (CHA) studies.

Methods: CHART was developed in several phases after performing a comprehensive systematic review to identify variation in the conduct, reporting, and methodology in CHA studies. Findings from the review were used to develop a draft checklist that was revised through an international, multidisciplinary modified asynchronous Delphi consensus process of 531 stakeholders, three synchronous panel consensus meetings of 48 stakeholders, and subsequent pilot testing of the checklist.

Results: CHART includes 12 items and 39 subitems to promote transparent and comprehensive reporting of CHA studies. These include Title (subitem 1a), Abstract/Summary (subitem 1b), Background (subitems 2ab), Model Identifiers (subitems 3ab), Model Details (subitems 4abc), Prompt Engineering (subitems 5ab), Query Strategy (subitems 6abcd), Performance Evaluation (subitems 7ab), Sample Size (subitem 8), Data Analysis (subitem 9a), Results (subitems 10abc), Discussion (subitems 11abc), Disclosures (subitem 12a), Funding (subitem 12b), Ethics (subitem 12c), Protocol (subitem 12d), and Data Availability (subitem 12e).

Conclusion: The CHART checklist and corresponding methodological diagram were designed to support key stakeholders including clinicians, researchers, editors, peer reviewers, and readers in reporting, understanding, and interpreting the findings of CHA studies.

Keywords: Generative AI; LLMs; Reporting standards.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: Ethics approval was submitted to and waived by the Hamilton Integrated Research Ethics Board (HiREB #17025). Consent for publication: Not applicable. Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/disclosure-of-interest/ and declare: GSC is a National Institute for Health and Care Research (NIHR) Senior Investigator. The views expressed in this article are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care; AJT has received funding from HealthSense to investigate evidence-based medicine applications of large language models; PM is the co-founder of BrainX LLC; AS has received research funding from the Australian government and is co-founder of BantingMed Pty Ltd; DS is the Acting Deputy Editor for the Lancet Digital Health; MM has received research funding from The Hospital Research Founding Group; TF sits on the executive committee of MDEpiNet; HF is a Senior Executive Editor for The Lancet; CL is the Editor in Chief of Annals of Internal Medicine; AF is Executive Managing Editor and Vice President, Editorial Operations, JAMA and The JAMA Network; TF and EL are journal editors for the BMJ; RA is the Editor in Chief of International Journal of Surgery; GS is an Executive Editor of Artificial Intelligence in Medicine; SL is a paid consultant for Astellas; DP has received research funding from the Italian Ministry of University and Research; MO is a paid consultant for Theator; TA, POV, and GG are board members of the MAGIC Evidence Ecosystem Foundation (www.magicproject.org), a not-for-profit organization, which conducts research and evidence appraisal and guideline methodology and implementation, and which provides authoring and publication software (MAGICapp) for evidence summaries, guidelines and decision aids.

Figures

Fig. 1. The CHART Methodological Diagram
