Review
NPJ Digit Med. 2024 May 4;7(1):114.
doi: 10.1038/s41746-024-01106-8.

Artificial intelligence in digital pathology: a systematic review and meta-analysis of diagnostic test accuracy


Clare McGenity et al. NPJ Digit Med. 2024.

Abstract

Ensuring the diagnostic performance of artificial intelligence (AI) before its introduction into clinical practice is essential. Growing numbers of studies using AI for digital pathology have been reported over recent years. The aim of this work is to examine the diagnostic accuracy of AI in digital pathology images for any disease. This systematic review and meta-analysis included diagnostic accuracy studies using any type of AI applied to whole slide images (WSIs) for any disease. The reference standard was diagnosis by histopathological assessment and/or immunohistochemistry. Searches were conducted in PubMed, EMBASE and CENTRAL in June 2022. Risk of bias and concerns of applicability were assessed using the QUADAS-2 tool. Data extraction was conducted by two investigators and meta-analysis was performed using a bivariate random effects model, with additional subgroup analyses. Of 2976 identified studies, 100 were included in the review and 48 in the meta-analysis. The studies came from a range of countries and included over 152,000 WSIs, representing many diseases. These studies reported a mean sensitivity of 96.3% (CI 94.1–97.7) and a mean specificity of 93.3% (CI 90.5–95.4). There was heterogeneity in study design, and 99% of studies identified for inclusion had at least one area at high or unclear risk of bias or applicability concerns. Details on the selection of cases, the division of model development and validation data, and raw performance data were frequently ambiguous or missing. AI is reported as having high diagnostic accuracy in the reported areas but requires more rigorous evaluation of its performance.
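As an illustration of the quantities reported above, the following minimal Python sketch computes sensitivity and specificity from a single study's 2x2 table, with a Wilson score interval for each. The function names and the counts in the usage example are hypothetical and are not taken from the review; the Wilson interval is one common choice for a single-study interval and is not necessarily the method used by the authors.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def diagnostic_accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity of an index test against a reference standard.

    tp, fp, fn, tn are the cells of the 2x2 table (hypothetical inputs).
    Returns a dict mapping each measure to (estimate, lower, upper).
    """
    sensitivity = tp / (tp + fn)  # proportion of diseased cases detected
    specificity = tn / (tn + fp)  # proportion of non-diseased cases cleared
    return {
        "sensitivity": (sensitivity, *wilson_interval(tp, tp + fn)),
        "specificity": (specificity, *wilson_interval(tn, tn + fp)),
    }


if __name__ == "__main__":
    # Hypothetical counts for one study, for illustration only.
    for name, (est, lo, hi) in diagnostic_accuracy(tp=90, fp=20, fn=10, tn=80).items():
        print(f"{name}: {est:.3f} (CI {lo:.3f}-{hi:.3f})")
```

Raw 2x2 counts of this kind are what a diagnostic accuracy meta-analysis needs from each study, which is why missing raw performance data limits which studies can be pooled.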

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Example whole slide image (WSI) of a liver biopsy specimen at low magnification.
These are high resolution digital pathology images viewed by a pathologist on a computer to make a diagnostic assessment. Image from www.virtualpathology.leeds.ac.uk.
Fig. 2. Study selection flow diagram.
Generated using PRISMA2020 at https://estech.shinyapps.io/prisma_flowdiagram/.
Fig. 3. Risk of bias and concerns of applicability in summary percentages for studies included in the review.
a Summaries for risk of bias for all 100 papers included in the review. b Summaries for applicability concerns for all 100 papers included in the review. c Summaries for risk of bias for the 48 papers included in the meta-analysis. d Summaries for applicability concerns for the 48 papers included in the meta-analysis.
Fig. 4. Forest plots of performance across studies included in the meta-analysis.
These show sensitivity (a) and specificity (b) in studies of all pathologies with 95% confidence intervals. These plots were generated by the MetaDTA: Diagnostic Test Accuracy Meta-Analysis v2.01 Shiny App (https://crsu.shinyapps.io/MetaDTA/), and the raw data can be found in Supplementary Table 4.
Fig. 5. Summary receiver operating characteristic plot of AI applied to whole slide images for all disease types, generated with the MetaDTA: Diagnostic Test Accuracy Meta-Analysis v2.01 Shiny App (https://crsu.shinyapps.io/dta_ma/).
95% confidence intervals are shown around the summary estimate. The predictive region shows the area of 95% confidence in which the true sensitivity and specificity of future studies lie, taking into account the statistical heterogeneity of the studies in this review.
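To make the idea of a pooled summary estimate with a heterogeneity term concrete, here is a rough Python sketch of DerSimonian-Laird random-effects pooling of per-study sensitivities on the logit scale. This is a deliberately simplified univariate stand-in: the review itself used a bivariate random effects model (via MetaDTA), which models sensitivity and specificity jointly, and this sketch does not reproduce that. The function name, the 0.5 continuity correction and the counts in the usage example are assumptions made for illustration.

```python
import math


def pool_logit_proportions(events, totals, z=1.96):
    """Univariate DerSimonian-Laird pooling of proportions on the logit scale.

    events[i] / totals[i] is one study's proportion (e.g. TP / (TP + FN)).
    Returns (pooled, lower, upper, tau2), with tau2 the between-study variance.
    """
    y, v = [], []
    for e, n in zip(events, totals):
        e_c, ne_c = e + 0.5, n - e + 0.5  # continuity correction (assumed)
        y.append(math.log(e_c / ne_c))    # logit of the corrected proportion
        v.append(1 / e_c + 1 / ne_c)      # approximate within-study variance

    w = [1 / vi for vi in v]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0

    w_re = [1 / (vi + tau2) for vi in v]  # random-effects weights
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return inv_logit(mu), inv_logit(mu - z * se), inv_logit(mu + z * se), tau2


if __name__ == "__main__":
    # Hypothetical true-positive counts and diseased totals for four studies.
    pooled, lo, hi, tau2 = pool_logit_proportions([95, 180, 47, 88], [100, 200, 50, 90])
    print(f"pooled sensitivity {pooled:.3f} (CI {lo:.3f}-{hi:.3f}), tau^2 = {tau2:.3f}")
```

A larger between-study variance widens the interval around the pooled value, which is the same intuition behind the wider predictive region in a summary receiver operating characteristic plot.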
