Trials. 2021 Jan 6;22(1):11.
doi: 10.1186/s13063-020-04951-6.

Reporting guidelines for clinical trials of artificial intelligence interventions: the SPIRIT-AI and CONSORT-AI guidelines

Hussein Ibrahim et al. Trials.

Abstract

Background: The application of artificial intelligence (AI) in healthcare is an area of immense interest. The high profile of 'AI in health' means that there are unusually strong drivers to accelerate the introduction and implementation of innovative AI interventions, which may not be supported by the available evidence, and for which the usual systems of appraisal may not yet be sufficient.

Main text: We are beginning to see the emergence of randomised clinical trials evaluating AI interventions in real-world settings. It is imperative that these studies are conducted and reported to the highest standards to enable effective evaluation, because they will potentially be a key part of the evidence that is used when deciding whether an AI intervention is sufficiently safe and effective to be approved and commissioned. Minimum reporting guidelines for clinical trial protocols and reports have been instrumental in improving the quality of clinical trials and promoting completeness and transparency of reporting for the evaluation of new health interventions. The current guidelines, SPIRIT and CONSORT, are suited to traditional health interventions, but research has revealed that they do not adequately address potential sources of bias specific to AI systems. Examples of elements that require specific reporting include algorithm version and the procedure for acquiring input data. In response, the SPIRIT-AI and CONSORT-AI guidelines were developed by a multidisciplinary group of international experts using a consensus-building methodological process. The extensions include a number of new items that should be reported in addition to the core items. Each item, where possible, was informed by challenges identified in existing studies of AI systems in health settings.

Conclusion: The SPIRIT-AI and CONSORT-AI guidelines provide the first international standards for clinical trials of AI systems. The guidelines are designed to ensure complete and transparent reporting of clinical trial protocols and reports involving AI interventions and have the potential to improve the quality of these clinical trials through improvements in their design and delivery. Their use will help to efficiently identify the safest and most effective AI interventions and commission them with confidence for the benefit of patients and the public.

Keywords: Artificial intelligence; Checklist; Clinical trials; Guidelines; Machine learning; Randomised controlled trials; Research design; Research report.

Conflict of interest statement

MJC is a National Institute for Health Research (NIHR) Senior Investigator and receives funding from the NIHR Birmingham Biomedical Research Centre, the NIHR Surgical Reconstruction and Microbiology Research Centre and NIHR ARC West Midlands at the University of Birmingham and University Hospitals Birmingham NHS Foundation Trust, Health Data Research UK, Innovate UK (part of UK Research and Innovation), Macmillan Cancer Support, and UCB Pharma. The views expressed in this article are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care. MJC has also received personal fees from Astellas, Takeda, Merck, Daiichi Sankyo, Glaukos, GSK, and the Patient-Centered Outcomes Research Institute (PCORI) outside the submitted work. DM is funded by a University Research Chair (uOttawa). All other authors declare that they have no competing interests.
