Review

The Low Rate of Adherence to Checklist for Artificial Intelligence in Medical Imaging Criteria Among Published Prostate MRI Artificial Intelligence Algorithms

Mason J Belue et al. J Am Coll Radiol. 2023 Feb;20(2):134-145. doi: 10.1016/j.jacr.2022.05.022. Epub 2022 Jul 31.

Abstract

Objective: To determine the rigor, generalizability, and reproducibility of published classification and detection artificial intelligence (AI) models for prostate cancer (PCa) on MRI using the Checklist for Artificial Intelligence in Medical Imaging (CLAIM) guidelines, a 42-item checklist that is considered a measure of best practice for presenting and reviewing medical imaging AI research.

Materials and methods: This review searched the English-language literature for studies proposing PCa AI detection and classification models on MRI. Each study was evaluated with the CLAIM checklist. Additional outcomes for which data were sought included measures of AI model performance (eg, area under the curve [AUC], sensitivity, specificity, free-response operating characteristic curves); training, validation, and testing group sample sizes; AI approach; detection versus classification AI; public data set utilization; MRI sequences used; and definition of the gold standard for ground truth. The percentage of CLAIM checklist fulfillment was used to stratify studies into quartiles. Wilcoxon's rank-sum test was used for pair-wise comparisons.
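The quartile stratification and rank-sum comparison described above can be sketched in Python. This is an illustrative reconstruction, not the authors' actual analysis code: the function names (`quartiles`, `rank_sum_u`) and all data values are assumptions for demonstration, and the rank-sum statistic is computed directly rather than via a statistics library.

```python
# Hypothetical sketch of the analysis: stratify studies into quartiles by
# percentage of CLAIM items fulfilled, then compare AUC scores between
# quartiles with a rank-sum (Mann-Whitney U) statistic.
# All names and data here are illustrative, not the paper's actual values.

def quartiles(studies):
    """Split (claim_pct, auc) pairs into four quartiles by CLAIM fulfillment."""
    ordered = sorted(studies, key=lambda s: s[0])
    n = len(ordered)
    return [ordered[i * n // 4:(i + 1) * n // 4] for i in range(4)]

def rank_sum_u(xs, ys):
    """Mann-Whitney U statistic for sample xs vs sample ys (ties get midranks)."""
    combined = sorted((v, grp) for grp, vals in ((0, xs), (1, ys)) for v in vals)
    rank_total = 0.0  # sum of ranks assigned to members of xs
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1  # extend over a run of tied values
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        for k in range(i, j):
            if combined[k][1] == 0:
                rank_total += midrank
        i = j
    n1 = len(xs)
    return rank_total - n1 * (n1 + 1) / 2.0

# Example: AUCs from a low-fulfillment quartile vs a high-fulfillment quartile.
low_q_aucs = [0.74, 0.76, 0.78]
high_q_aucs = [0.86, 0.88, 0.90]
u = rank_sum_u(low_q_aucs, high_q_aucs)  # U = 0: all low-quartile AUCs rank below
```

In practice one would use `scipy.stats.mannwhitneyu` for the P value; the manual version above just makes the ranking logic explicit.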

Results: In all, 75 studies were identified, and 53 studies qualified for analysis. The original CLAIM items that most studies did not fulfill include item 12 (77% no): de-identification methods; item 13 (68% no): handling missing data; item 15 (47% no): rationale for choosing the ground truth reference standard; item 18 (55% no): measurements of inter- and intrareader variability; item 31 (60% no): inclusion of validated interpretability maps; and item 37 (92% no): inclusion of failure analysis to elucidate AI model weaknesses. Comparison of AUC scores across percentage CLAIM fulfillment quartiles revealed a significant difference in mean AUC between quartile 1 and quartile 2 (0.78 versus 0.86, P = .034) and between quartile 1 and quartile 4 (0.78 versus 0.89, P = .003). Based on additional information and outcome metrics gathered in this study, additional measures of best practice are defined. These new items include disclosure of public data set usage, ground truth definition in comparison with other referenced works in the defined task, and sample size power calculation.

Conclusion: A large proportion of AI studies do not fulfill key items in CLAIM guidelines within their methods and results sections. The percentage of CLAIM checklist fulfillment is weakly associated with improved AI model performance. Additions or supplementations to CLAIM are recommended to improve publishing standards and aid reviewers in determining study rigor.

Keywords: AI; CLAIM; classification; detection; prostate cancer; study rigor.


Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1: Flow diagram of paper/study selection.

Figure 2: Total paper performance per 42 CLAIM items. "Yes" was assigned if the specific CLAIM item was included in the text or supplementary material of the study; "No" was assigned if the CLAIM item could not be found; and "N/A" was assigned if the CLAIM item did not apply to a particular study.

Figure 3: Overall results by CLAIM section (represented as a percentage out of 100) across all 53 studies.

Figure 4: Overall results by CLAIM section stratified by method (machine learning [A] vs deep learning [B]).

Figure 5: Impact of CLAIM fulfillment on AUC score, stratified by CLAIM binary quartiles (excluding N/A).
