Multi-center validation of an artificial intelligence system for detection of COVID-19 on chest radiographs in symptomatic patients
- PMID: 35779089
- DOI: 10.1007/s00330-022-08969-z
Abstract
Objectives: While chest radiograph (CXR) is the first-line imaging investigation in patients with respiratory symptoms, differentiating COVID-19 from other respiratory infections on CXR remains challenging. We developed and validated an AI system for COVID-19 detection on presenting CXR.
Methods: A deep learning model (RadGenX), trained on 168,850 CXRs, was validated on a large international test set of presenting CXRs of symptomatic patients from 9 study sites (US, Italy, and Hong Kong SAR) and 2 public datasets from the US and Europe. Performance was measured by area under the receiver operating characteristic curve (AUC). Bootstrapped simulations were performed to assess performance across a range of potential COVID-19 disease prevalence values (3.33% to 33.3%). Comparison against international radiologists was performed on an independent test set of 852 cases.
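The two evaluation steps described in the Methods, AUC measurement and bootstrapped prevalence simulation, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the decision threshold, the sample sizes, and the synthetic score distributions are all assumptions made for illustration. The sketch resamples positive and negative cases with replacement so that each bootstrap sample has roughly the target prevalence, then summarizes the negative predictive value (NPV) over the replicates.

```python
import random

def auc(scores_pos, scores_neg):
    """Rank-based AUC: chance a random positive outscores a random negative (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def npv_at_threshold(pos, neg, threshold):
    """NPV = TN / (TN + FN), calling a case negative when its score is below the threshold."""
    tn = sum(1 for s in neg if s < threshold)
    fn = sum(1 for s in pos if s < threshold)
    return tn / (tn + fn) if (tn + fn) else float("nan")

def simulate_npv(scores_pos, scores_neg, prevalence, threshold,
                 n_total=300, n_boot=200, seed=0):
    """Bootstrap NPV at a target prevalence; returns (mean, 2.5th pct, 97.5th pct)."""
    rng = random.Random(seed)
    n_pos = max(1, round(n_total * prevalence))  # positives per replicate
    n_neg = n_total - n_pos                      # negatives per replicate
    values = []
    for _ in range(n_boot):
        pos = rng.choices(scores_pos, k=n_pos)   # sample with replacement
        neg = rng.choices(scores_neg, k=n_neg)
        v = npv_at_threshold(pos, neg, threshold)
        if v == v:                               # skip NaN replicates
            values.append(v)
    values.sort()
    mean = sum(values) / len(values)
    lo = values[int(0.025 * len(values))]
    hi = values[min(len(values) - 1, int(0.975 * len(values)))]
    return mean, lo, hi

if __name__ == "__main__":
    # Synthetic model scores (assumed distributions, not study data).
    rng = random.Random(42)
    pos_scores = [rng.gauss(0.65, 0.15) for _ in range(400)]
    neg_scores = [rng.gauss(0.40, 0.15) for _ in range(400)]
    print("AUC:", round(auc(pos_scores, neg_scores), 3))
    for prev in (0.0333, 0.10, 0.333):
        print(prev, simulate_npv(pos_scores, neg_scores, prev, threshold=0.5))
```

Fixing the class counts per replicate, rather than sampling the pooled cohort, is one simple way to hold prevalence constant across replicates; the study may have used a different resampling scheme.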
Results: RadGenX achieved an AUC of 0.89 on 4-fold cross-validation and an AUC of 0.79 (95% CI 0.78-0.80) on an independent test cohort of 5,894 patients. DeLong's test showed statistically significant differences in model performance across region (p < 0.01), disease severity (p < 0.001), gender (p < 0.001), and age (p = 0.03). Prevalence simulations showed that the negative predictive value increased from 86.1% at 33.3% prevalence to greater than 98.5% at any prevalence below 4.5%. Compared with radiologists, McNemar's test showed that the model had higher sensitivity (p < 0.001) but lower specificity (p < 0.001).
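The rise in negative predictive value (NPV) as prevalence falls follows directly from Bayes' rule. The sketch below computes NPV analytically from sensitivity, specificity, and prevalence. The abstract does not report the model's operating point, so the sensitivity and specificity used here are hypothetical placeholders and the printed values are not expected to match the reported 86.1% and 98.5%.

```python
def npv(sensitivity, specificity, prevalence):
    """Negative predictive value via Bayes' rule.

    NPV = P(no disease | negative test)
        = spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)
    """
    true_neg = specificity * (1.0 - prevalence)    # P(negative test and no disease)
    false_neg = (1.0 - sensitivity) * prevalence   # P(negative test and disease)
    return true_neg / (true_neg + false_neg)

if __name__ == "__main__":
    # Hypothetical operating point (assumption, not taken from the study).
    sens, spec = 0.85, 0.60
    for prev in (0.333, 0.10, 0.045, 0.0333):
        print(f"prevalence={prev:.4f}  NPV={npv(sens, spec, prev):.4f}")
```

For any fixed sensitivity and specificity with true negatives present, the false-negative term shrinks in proportion to prevalence, so NPV approaches 1 as prevalence approaches 0.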
Conclusion: An AI model that predicts COVID-19 infection on CXR in symptomatic patients was validated on a large international cohort, providing valuable context on testing and performance expectations for AI systems that perform COVID-19 prediction on CXR.
Key points:
• An AI model developed using CXRs to detect COVID-19 was validated in a large multi-center cohort of 5,894 patients from 9 prospectively recruited sites and 2 public datasets.
• Differences in AI model performance were seen across region, disease severity, gender, and age.
• Prevalence simulations on the international test set demonstrate the model's NPV is greater than 98.5% at any prevalence below 4.5%.
Keywords: Artificial intelligence; COVID-19; Public health; Radiology; Thoracic.
© 2022. The Author(s), under exclusive licence to European Society of Radiology.
