Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
Meta-Analysis
. 2023 Dec;33(4):943-956.
doi: 10.1007/s00062-023-01291-1. Epub 2023 Jun 1.

Systematic Review of Artificial Intelligence for Abnormality Detection in High-volume Neuroimaging and Subgroup Meta-analysis for Intracranial Hemorrhage Detection

Affiliations
Meta-Analysis

Systematic Review of Artificial Intelligence for Abnormality Detection in High-volume Neuroimaging and Subgroup Meta-analysis for Intracranial Hemorrhage Detection

Siddharth Agarwal et al. Clin Neuroradiol. 2023 Dec.

Abstract

Purpose: Most studies evaluating artificial intelligence (AI) models that detect abnormalities in neuroimaging are either tested on unrepresentative patient cohorts or are insufficiently well-validated, leading to poor generalisability to real-world tasks. The aim was to determine the diagnostic test accuracy and summarise the evidence supporting the use of AI models performing first-line, high-volume neuroimaging tasks.

Methods: Medline, Embase, Cochrane library and Web of Science were searched until September 2021 for studies that temporally or externally validated AI capable of detecting abnormalities in first-line computed tomography (CT) or magnetic resonance (MR) neuroimaging. A bivariate random effects model was used for meta-analysis where appropriate. This study was registered on PROSPERO as CRD42021269563.

Results: Out of 42,870 records screened, and 5734 potentially eligible full texts, only 16 studies were eligible for inclusion. Included studies were not compromised by unrepresentative datasets or inadequate validation methodology. Direct comparison with radiologists was available in 4/16 studies and 15/16 had a high risk of bias. Meta-analysis was only suitable for intracranial hemorrhage detection in CT imaging (10/16 studies), where AI systems had a pooled sensitivity and specificity 0.90 (95% confidence interval [CI] 0.85-0.94) and 0.90 (95% CI 0.83-0.95), respectively. Other AI studies using CT and MRI detected target conditions other than hemorrhage (2/16), or multiple target conditions (4/16). Only 3/16 studies implemented AI in clinical pathways, either for pre-read triage or as post-read discrepancy identifiers.

Conclusion: The paucity of eligible studies reflects that most abnormality detection AI studies were not adequately validated in representative clinical cohorts. The few studies describing how abnormality detection AI could impact patients and clinicians did not explore the full ramifications of clinical implementation.

Keywords: Anomaly detection; Brain MRI; Clinical validation; Deep learning; Machine learning.

PubMed Disclaimer

Conflict of interest statement

S. Agarwal, D. Wood, M. Grzeda, C. Suresh, M. Din, J. Cole, M. Modat and T.C. Booth declare that they have no competing interests.

Figures

Fig. 1
Fig. 1
Summary of the QUADAS‑2 risk of bias assessment
Fig. 2
Fig. 2
Diagnostic test accuracy of algorithms and comparators (single radiologists) in precision-recall space, for intracranial hemorrhage detection in CT imaging, when adjusted for an intracranial hemorrhage prevalence of 10%. The prevalence-adjusted positive predictive value (PPV) can be interpreted as the PPV you would expect for each model if the prevalence of ICH within the test dataset was 10%. An ideal classifier would be at the top-right corner at (1, 1). The size of each marker is proportional to the size of the test dataset. In studies where there was more than one operating point, we chose the operating point with the highest specificity. CQ500 = CQ500 external test dataset. Qure.ai, Aidoc and Avicenna.ai are commercial vendors for AI products
Fig. 3
Fig. 3
Forest plots demonstrating individual studies’ sensitivities and specificities
Fig. 4
Fig. 4
Summary receiver operating characteristic (SROC) curve for intracranial hemorrhage detection in CT imaging. A bivariate random effects model was used for meta-analysis, which allowed the estimation of the summary ROC (SROC) curve

References

    1. Booth TC, Luis A, Brazil L, Thompson G, Daniel RA, Shuaib H, et al. Glioblastoma post-operative imaging in neuro-oncology: current UK practice (GIN CUP study) Eur Radiol. 2021;31:2933–2943. doi: 10.1007/s00330-020-07387-3. - DOI - PMC - PubMed
    1. Dixon S. Diagnostic imaging dataset annual statistical release 2020/21. 2021. https://www.england.nhs.uk/statistics/statistical-work-areas/diagnostic-.... Accessed 20 Mar 2023.
    1. The Royal College of Radiologists London. Clinical radiology UK workforce census 2020 report. 2020. https://www.rcr.ac.uk/system/files/publication/field_publication_files/c.... Accessed 20 Mar 2023.
    1. World Health Organization. Cancer control: early detection. WHO Guide for effective programmes. 2007. http://apps.who.int/iris/bitstream/10665/43743/1/9241547338_eng. Accessed 15 Feb 2022.
    1. Lee JY, Kim JS, Kim TY, Kim YS. Detection and classification of intracranial haemorrhage on CT images using a novel deep-learning algorithm. Sci Rep. 2020;10:1–7. - PMC - PubMed