Effect of a comprehensive deep-learning model on the accuracy of chest x-ray interpretation by radiologists: a retrospective, multireader multicase study
- PMID: 34219054
- DOI: 10.1016/S2589-7500(21)00106-0
Abstract
Background: Chest x-rays are widely used in clinical practice; however, interpretation can be hindered by human error and a lack of experienced thoracic radiologists. Deep learning has the potential to improve the accuracy of chest x-ray interpretation. We therefore aimed to assess the accuracy of radiologists with and without the assistance of a deep-learning model.
Methods: In this retrospective study, a deep-learning model was trained on 821 681 images (284 649 patients) from five data sets from Australia, Europe, and the USA. 2568 enriched chest x-ray cases from adult patients (≥16 years) who had at least one frontal chest x-ray were included in the test dataset; cases were representative of inpatient, outpatient, and emergency settings. 20 radiologists reviewed cases with and without the assistance of the deep-learning model with a 3-month washout period. We assessed the change in accuracy of chest x-ray interpretation across 127 clinical findings when the deep-learning model was used as a decision support by calculating area under the receiver operating characteristic curve (AUC) for each radiologist with and without the deep-learning model. We also compared AUCs for the model alone with those of unassisted radiologists. If the lower bound of the adjusted 95% CI of the difference in AUC between the model and the unassisted radiologists was more than -0·05, the model was considered to be non-inferior for that finding. If the lower bound exceeded 0, the model was considered to be superior.
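The decision rule in the Methods can be written as a short sketch. It takes the lower bound of the adjusted 95% CI for the difference in AUC (model minus unassisted radiologists) and applies the two thresholds stated above. The function name and the "inconclusive" label for findings that meet neither threshold are illustrative assumptions, not terms from the study, and the sketch does not reproduce how the adjusted CI itself was computed.

```python
# Non-inferiority margin for the AUC difference, as stated in the Methods.
NON_INFERIORITY_MARGIN = -0.05


def classify_finding(ci_lower: float) -> str:
    """Classify one clinical finding from the lower bound of the adjusted
    95% CI of (model AUC - unassisted radiologist AUC).

    The label "inconclusive" is a hypothetical name for findings that meet
    neither threshold; the abstract does not name this case.
    """
    if ci_lower > 0:
        # Lower bound exceeds 0: model considered superior.
        return "superior"
    if ci_lower > NON_INFERIORITY_MARGIN:
        # Lower bound is more than -0.05: model considered non-inferior.
        return "non-inferior"
    return "inconclusive"
```

For example, a lower bound of -0.03 falls above the margin but not above 0, so the finding would be labelled non-inferior rather than superior.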
Findings: Unassisted radiologists had a macroaveraged AUC of 0·713 (95% CI 0·645-0·785) across the 127 clinical findings, compared with 0·808 (0·763-0·839) when assisted by the model. The deep-learning model statistically significantly improved the classification accuracy of radiologists for 102 (80%) of 127 clinical findings, was statistically non-inferior for 19 (15%) findings, and no findings showed a decrease in accuracy when radiologists used the deep-learning model. Unassisted radiologists had a macroaveraged mean AUC of 0·713 (0·645-0·785) across all findings, compared with 0·957 (0·954-0·959) for the model alone. Model classification alone was significantly more accurate than unassisted radiologists for 117 (94%) of 124 clinical findings predicted by the model and was non-inferior to unassisted radiologists for all other clinical findings.
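A minimal sketch of a macroaveraged AUC, assuming that macroaveraging here means the unweighted mean of per-finding AUCs. Each per-finding AUC is computed with the pairwise (Mann-Whitney) definition: the probability that a positive case is scored higher than a negative case, with ties counted as one half. The function names and toy data are illustrative assumptions; the study's multireader analysis and confidence-interval adjustment are not reproduced.

```python
from typing import Dict, List, Tuple


def auc(labels: List[int], scores: List[float]) -> float:
    """AUC for one finding via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count positive-negative pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def macro_auc(per_finding: Dict[str, Tuple[List[int], List[float]]]) -> float:
    """Unweighted mean of per-finding AUCs (each finding counts equally)."""
    aucs = [auc(labels, scores) for labels, scores in per_finding.values()]
    return sum(aucs) / len(aucs)


# Toy data for two hypothetical findings (not taken from the study).
toy = {
    "finding_a": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    "finding_b": ([0, 1], [0.2, 0.9]),
}
```

Because every finding contributes equally, rare findings weigh as much as common ones in the macroaverage, which is one reason a single summary AUC can hide variation across findings.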
Interpretation: This study shows the potential of a comprehensive deep-learning model to improve chest x-ray interpretation across a large breadth of clinical practice.
Funding: Annalise.ai.
Copyright © 2021 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license. All rights reserved.
Conflict of interest statement
Declaration of interests: This study was funded by Annalise.ai. JCYS, CHMT, QDB, XGH, JBW, AA, HA, HP, JFL, BH, SJFH, BPJ, LO-R, PB, and CMJ were employed by or seconded to Annalise.ai and report personal fees from Annalise.ai during the study and outside the submitted work. CMJ was employed by I-MED. All other authors declare no competing interests.
Comment in
- A comprehensive deep-learning model for interpreting chest x-rays. Lancet Digit Health. 2022 Jan;4(1):e6. doi: 10.1016/S2589-7500(21)00255-7. PMID: 34952677.
- Beyond the AJR: Potential of Deep Learning Image Classification for Chest Radiography. AJR Am J Roentgenol. 2022 Apr;218(4):762. doi: 10.2214/AJR.21.26796. Epub 2021 Sep 15. PMID: 35603514.
