The TRIPOD-LLM reporting guideline for studies using large language models
- PMID: 39779929
- PMCID: PMC12104976
- DOI: 10.1038/s41591-024-03425-5
Abstract
Large language models (LLMs) are rapidly being adopted in healthcare, necessitating standardized reporting guidelines. We present TRIPOD-LLM, an extension of the TRIPOD+AI statement (TRIPOD: transparent reporting of a multivariable model for individual prognosis or diagnosis; AI: artificial intelligence) that addresses the unique challenges of LLMs in biomedical applications. TRIPOD-LLM provides a comprehensive checklist of 19 main items and 50 subitems, covering key aspects from title to discussion. The guideline introduces a modular format accommodating various LLM research designs and tasks, with 14 main items and 32 subitems applicable across all categories. Developed through an expedited Delphi process and expert consensus, TRIPOD-LLM emphasizes transparency, human oversight and task-specific performance reporting. We also introduce an interactive website (https://tripod-llm.vercel.app/) that facilitates guideline completion and PDF generation for submission. As a living document, TRIPOD-LLM will evolve with the field, aiming to enhance the quality, reproducibility and clinical applicability of LLM research in healthcare through comprehensive reporting.
© 2025. The Author(s), under exclusive licence to Springer Nature America, Inc.
Conflict of interest statement
Competing interests: D.S.B. is an associate editor at Radiation Oncology and HemOnc.org, receives research funding from the American Association for Cancer Research, and provides advisory and consulting services for MercurialAI. D.D.F. is an associate editor at the Journal of the American Medical Informatics Association, is a member of the editorial board of Scientific Data, and receives funding from the intramural research program at the US National Library of Medicine, NIH. J.W.G. is a member of the editorial board of Radiology: Artificial Intelligence, BJR Artificial Intelligence and NEJM AI. All other authors declare no competing interests.
Update of
- The TRIPOD-LLM Statement: A Targeted Guideline For Reporting Large Language Models Use. medRxiv [Preprint]. 2024 Jul 25:2024.07.24.24310930. doi: 10.1101/2024.07.24.24310930. Update in: Nat Med. 2025 Jan;31(1):60-69. doi: 10.1038/s41591-024-03425-5. PMID: 39211885.