Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
Review
. 2021 Mar;22(3):442-453.
doi: 10.3348/kjr.2021.0048.

Key Principles of Clinical Validation, Device Approval, and Insurance Coverage Decisions of Artificial Intelligence

Affiliations
Review

Key Principles of Clinical Validation, Device Approval, and Insurance Coverage Decisions of Artificial Intelligence

Seong Ho Park et al. Korean J Radiol. 2021 Mar.

Abstract

Artificial intelligence (AI) will likely affect various fields of medicine. This article aims to explain the fundamental principles of clinical validation, device approval, and insurance coverage decisions of AI algorithms for medical diagnosis and prediction. Discrimination accuracy of AI algorithms is often evaluated with the Dice similarity coefficient, sensitivity, specificity, and traditional or free-response receiver operating characteristic curves. Calibration accuracy should also be assessed, especially for algorithms that provide probabilities to users. As current AI algorithms have limited generalizability to real-world practice, clinical validation of AI should put it to proper external testing and assisting roles. External testing could adopt diagnostic case-control or diagnostic cohort designs. A diagnostic case-control study evaluates the technical validity/accuracy of AI while the latter tests the clinical validity/accuracy of AI in samples representing target patients in real-world clinical scenarios. Ultimate clinical validation of AI requires evaluations of its impact on patient outcomes, referred to as clinical utility, and for which randomized clinical trials are ideal. Device approval of AI is typically granted with proof of technical validity/accuracy and thus does not intend to directly indicate if AI is beneficial for patient care or if it improves patient outcomes. Neither can it categorically address the issue of limited generalizability of AI. After achieving device approval, it is up to medical professionals to determine if the approved AI algorithms are beneficial for real-world patient care. Insurance coverage decisions generally require a demonstration of clinical utility that the use of AI has improved patient outcomes.

Keywords: Artificial intelligence; Device approval; Insurance coverage; Software validation.

PubMed Disclaimer

Conflict of interest statement

The authors have no potential conflicts of interest to disclose.

Figures

Fig. 1
Fig. 1. Dice similarity coefficient.
Fig. 2
Fig. 2. Diagnostic cross-table (also referred to as confusion matrix).
AI = artificial intelligence, FN = false negative, FP = false positive, TN = true negative, TP = true positive
Fig. 3
Fig. 3. Exemplary receiver operating characteristic curves that show the performance of four readers in interpreting breast ultrasonography assisted by a deep-learning algorithm.
Adapted from Choi et al. Korean J Radiol 2019;20:749-758, with permission from the Korean Society of Radiology [4]. AUC = area under the curve
Fig. 4
Fig. 4. Exemplary free-response receiver operating characteristic curves that show the performance of six methods of detecting polyps in colonoscopy videos.
The x-axis is the mean number of false positives per image frame. A curve closer to the left upper corner indicates a higher performance, for example, a higher performance of the red curve than the blue curve. Adapted from Tajbakhsh et al. Proceedings of IEEE 12th International Symposium on Biomedical Imaging. New York: IEEE; 2015, with permission from IEEE [7]. CNN = convolutional neural network
Fig. 5
Fig. 5. Typical data sets used for development and testing of an AI algorithm.
AI = artificial intelligence

Similar articles

Cited by

References

    1. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25:44–56. - PubMed
    1. Park SH, Lim TH. Artificial intelligence: guide for healthcare personnel. Seoul: Koonja; 2020.
    1. Do S, Song KD, Chung JW. Basics of deep learning: a radiologist's guide to understanding published radiology articles on deep learning. Korean J Radiol. 2020;21:33–41. - PMC - PubMed
    1. Choi JS, Han BK, Ko ES, Bae JM, Ko EY, Song SH, et al. Effect of a deep learning framework-based computer-aided diagnosis system on the diagnostic performance of radiologists in differentiating between malignant and benign masses on breast ultrasonography. Korean J Radiol. 2019;20:749–758. - PMC - PubMed
    1. Park SH, Goo JM, Jo CH. Receiver operating characteristic (ROC) curve: practical review for radiologists. Korean J Radiol. 2004;5:11–18. - PMC - PubMed

Publication types