Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
Review
. 2018 Apr 27;33(22):e152.
doi: 10.3346/jkms.2018.33.e152. eCollection 2018 May 28.

Connecting Technological Innovation in Artificial Intelligence to Real-world Medical Practice through Rigorous Clinical Validation: What Peer-reviewed Medical Journals Could Do

Affiliations
Review

Connecting Technological Innovation in Artificial Intelligence to Real-world Medical Practice through Rigorous Clinical Validation: What Peer-reviewed Medical Journals Could Do

Seong Ho Park et al. J Korean Med Sci. .

Abstract

Artificial intelligence (AI) is projected to substantially influence clinical practice in the foreseeable future. However, despite the excitement around the technologies, it is yet rare to see examples of robust clinical validation of the technologies and, as a result, very few are currently in clinical use. A thorough, systematic validation of AI technologies using adequately designed clinical research studies before their integration into clinical practice is critical to ensure patient benefit and safety while avoiding any inadvertent harms. We would like to suggest several specific points regarding the role that peer-reviewed medical journals can play, in terms of study design, registration, and reporting, to help achieve proper and meaningful clinical validation of AI technologies designed to make medical diagnosis and prediction, focusing on the evaluation of diagnostic accuracy efficacy. Peer-reviewed medical journals can encourage investigators who wish to validate the performance of AI systems for medical diagnosis and prediction to pay closer attention to the factors listed in this article by emphasizing their importance. Thereby, peer-reviewed medical journals can ultimately facilitate translating the technological innovations into real-world practice while securing patient safety and benefit.

Keywords: Artificial Intelligence; Decision Support Techniques; Journalism, Medical; Machine Learning; Peer Review; Validation Studies.

PubMed Disclaimer

Conflict of interest statement

Disclosure: The authors have no potential conflicts of interest to disclose.

Figures

Fig. 1
Fig. 1. Typical processes for development and clinical validation of an artificial intelligence model such as a deep learning algorithm for medical diagnosis and prediction.
The dataset used to develop a deep learning algorithm is typically convenience case-control data, which is prone to spectrum bias. The algorithm development goes through training, validation, and test steps, for which the entire dataset is then split, for example, 50% for the training step and 25% each for the validation and test steps. The term validation here is a technical jargon that means tuning of the algorithm under development, unlike the commonly accepted definition in medicine/health literature as in clinical validation. The test step, if performed using the typical split-sample “internal” validation method, should be distinguished from the true external validation as the former falls short of validating the clinical performance or generalizability of the developed algorithm. The use of a dataset that is collected in a manner that minimizes spectrum bias in newly recruited patients or at different sites than the dataset used for algorithm development, which effectively represents the target patients in a real-world clinical practice, is essential for external validation of the clinical performance of an AI algorithm. AI = artificial intelligence.

References

    1. Chartrand G, Cheng PM, Vorontsov E, Drozdzal M, Turcotte S, Pal CJ, et al. Deep learning: a primer for radiologists. Radiographics. 2017;37(7):2113–2131. - PubMed
    1. Lee JG, Jun S, Cho YW, Lee H, Kim GB, Seo JB, et al. Deep learning in medical imaging: general overview. Korean J Radiol. 2017;18(4):570–584. - PMC - PubMed
    1. An intuitive explanation of convolutional neural networks. [Updated August 11, 2016]. [Accessed March 22, 2018]. https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets.
    1. Gulshan V, Peng L, Coram M, Stumpe MC, Wu D, Narayanaswamy A, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. 2016;316(22):2402–2410. - PubMed
    1. Ting DS, Cheung CY, Lim G, Tan GS, Quang ND, Gan A, et al. Development and validation of a deep learning system for diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes. JAMA. 2017;318(22):2211–2223. - PMC - PubMed