Phys Eng Sci Med. 2022 Mar;45(1):13-29. doi: 10.1007/s13246-021-01093-0. Epub 2021 Dec 17.

Automated COVID-19 diagnosis and prognosis with medical imaging and who is publishing: a systematic review


Ashley G Gillman et al. Phys Eng Sci Med. 2022 Mar.

Abstract

Objectives: To conduct a systematic survey of published techniques for automated diagnosis and prognosis of COVID-19 disease using medical imaging, assessing the validity of reported performance and investigating the proposed clinical use-case. To conduct a scoping review of the authors publishing such work.

Methods: The Scopus database was queried and studies were screened for article type and for minimum source-normalized impact per paper and citations, before manual relevance assessment and a bias assessment derived from a subset of the Checklist for Artificial Intelligence in Medical Imaging (CLAIM). The number of failures of the full CLAIM was adopted as a surrogate for risk of bias. Methodological and performance measurements were collected from each technique. Each study was assessed by one author. Comparisons were evaluated for significance with a two-sided independent t-test.
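As a minimal sketch of the comparison procedure described above, the following computes the pooled-variance (Student) statistic for a two-sided independent t-test between two groups of techniques, e.g. AUC values for deep learning versus classical machine learning. The AUC values shown are hypothetical placeholders, not data from the review; obtaining the p-value additionally requires the t-distribution CDF (e.g. via `scipy.stats.ttest_ind`).

```python
import math

def t_statistic(a, b):
    """Pooled-variance (Student) t statistic for two independent samples."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    # Unbiased sample variances
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    # Pooled variance across both samples
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    return (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))

deep = [0.93, 0.95, 0.91, 0.97, 0.94]       # hypothetical deep-learning AUCs
classical = [0.86, 0.88, 0.84, 0.90, 0.87]  # hypothetical classical-ML AUCs
t = t_statistic(deep, classical)
df = len(deep) + len(classical) - 2
print(f"t = {t:.3f} on {df} degrees of freedom")
```

The statistic is then compared against the t distribution with n1 + n2 - 2 degrees of freedom to decide significance at the chosen threshold (p < 0.05 in the review).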

Findings: Of 1002 studies identified, 390 remained after screening and 81 after relevance and bias exclusion. The exclusion rate for bias was 71%, indicative of a high level of bias in the field. The mean number of CLAIM failures per study was 8.3 ± 3.9 [1,17] (mean ± standard deviation [min,max]). 58% of methods performed diagnosis versus 31% prognosis. Of the diagnostic methods, 38% differentiated COVID-19 from healthy controls. For diagnostic techniques, area under the receiver operating curve (AUC) = 0.924 ± 0.074 [0.810,0.991] and accuracy = 91.7% ± 6.4 [79.0,99.0]. For prognostic techniques, AUC = 0.836 ± 0.126 [0.605,0.980] and accuracy = 78.4% ± 9.4 [62.5,98.0]. CLAIM failures did not correlate with performance, providing confidence that the highest results were not driven by biased papers. Deep learning techniques reported higher AUC (p < 0.05) and accuracy (p < 0.05), but no difference in CLAIM failures was identified.
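The findings above are reported as "mean ± standard deviation [min, max]" summaries. A minimal sketch of that summary, using Python's standard `statistics` module and hypothetical per-study CLAIM-failure counts (not the review's data):

```python
import statistics

def summarize(values):
    """Return (mean, sample standard deviation, min, max) for a sample."""
    return (statistics.mean(values),
            statistics.stdev(values),  # sample (n-1) standard deviation
            min(values), max(values))

claim_failures = [1, 5, 8, 8, 10, 12, 17]  # hypothetical per-study counts
m, s, lo, hi = summarize(claim_failures)
print(f"{m:.1f} \u00b1 {s:.1f} [{lo},{hi}]")
```

Note the sample (n-1) standard deviation is used, as is conventional when summarizing a set of studies drawn from a larger field.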

Interpretation: A majority of papers focus on diagnosis, which is less clinically impactful than prognosis, and a significant portion perform the clinically unnecessary task of differentiating COVID-19 from healthy controls. Authors should consider the clinical scenario in which their work would be deployed when developing techniques. Nevertheless, studies report superb performance in a potentially impactful application. Future work is warranted in translating techniques into clinical tools.

Keywords: Chest X-ray; Computed tomography; Coronavirus; Diagnosis; Prognosis; Staging.


Conflict of interest statement

The authors have no conflicts of interest to declare.

Figures

Fig. 1 PRISMA flow diagram of the search.

Fig. 2 Studies excluded for bias. The percentage of total studies that failed each item of the required subset of the CLAIM checklist for inclusion (left), and a histogram of the number of failures (right); only studies with 0 failures met the inclusion criteria.

Fig. 3 CLAIM results of included studies: the number of included studies that failed each CLAIM item (left), and a histogram of the number of failures (right).

Fig. 4 (Left) Machine learning tasks the techniques attempt to solve. (Top right) A breakdown of Diagnosis and Diagnosis & Prognosis approaches by diagnostic outcome variable classes. (Bottom right) A breakdown of Prognosis and Diagnosis & Prognosis approaches by prognostic outcome variable. The inner ring represents the number of classes (or continuous, for regression tasks) and the outer ring the derivation of the outcome variable. See Table 1 for definitions of derivations.

Fig. 5 (Left) The distribution of modalities used as input to techniques. (Middle) The reported AUC and (right) accuracy of techniques by modality; only techniques reporting AUC or accuracy, respectively, are included. Results of a two-sided independent t-test are given as ‘*’ for significance or ‘ns’ for no significance.

Fig. 6 (Left) The distribution of techniques using traditional machine learning and radiomics approaches versus deep learning, and (right) the distribution of the most popular deep learning networks.

Fig. 7 Performance of techniques, as measured by AUC (left) and accuracy (right), plotted against CLAIM failures. Hue represents task, as indicated in the legend. Dashed lines indicate the mean regression for each task, and shading indicates the 95% confidence interval. Each regression line was tested with a two-sided independent t-test against the null hypothesis that the gradient = 0; none reached significance.

Fig. 8 Comparison of (left) AUC, (middle) accuracy and (right) number of CLAIM failures between techniques leveraging deep learning and those leveraging classical machine learning and radiomics approaches. Results of a two-sided independent t-test are given as ‘*’ for significance or ‘ns’ for no significance.

Fig. 9 Number of articles published by author country. Articles with authors from multiple countries (indicated by hue) are counted once for each such country.

Fig. 10 Authorship graph, where nodes represent authors and edges represent co-authorship. The 5 largest clusters are depicted.

