Lancet Digit Health. 2021 Jun;3(6):e340-e348.
doi: 10.1016/S2589-7500(21)00056-X. Epub 2021 Apr 20.

Deep learning to detect acute respiratory distress syndrome on chest radiographs: a retrospective study with external validation


Michael W Sjoding et al. Lancet Digit Health. 2021 Jun.

Abstract

Background: Acute respiratory distress syndrome (ARDS) is a common, but under-recognised, critical illness syndrome associated with high mortality. An important factor in its under-recognition is the variability in chest radiograph interpretation for ARDS. We sought to train a deep convolutional neural network (CNN) to detect ARDS findings on chest radiographs.

Methods: CNNs were pretrained on 595 506 radiographs from two centres to identify common chest findings (eg, opacity and effusion), and then trained on 8072 radiographs annotated for ARDS by multiple physicians using various transfer learning approaches. The best performing CNN was tested on chest radiographs in an internal and external cohort, including a subset reviewed by six physicians, including a chest radiologist and physicians trained in intensive care medicine. Chest radiograph data were acquired from four US hospitals.
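
The two-stage approach (pretrain on a large corpus of general chest findings, then retrain on the smaller ARDS-annotated set) can be sketched in miniature. Everything below is an illustrative stand-in, not the study's pipeline: the "backbone" is a fixed random projection rather than a CNN, and the labels are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the pretrained backbone: in the study this is a CNN trained
# on 595 506 radiographs; here it is a frozen random projection plus tanh.
W_pre = rng.normal(size=(64, 16))

def extract_features(x):
    return np.tanh(x @ W_pre)  # frozen: these weights are never updated

# Synthetic stand-in for the 8072 ARDS-annotated radiographs.
x = rng.normal(size=(200, 64))
feats = extract_features(x)
w_true = rng.normal(size=16)
y = (feats @ w_true > 0).astype(float)  # toy labels, linearly decodable

# Transfer learning in its simplest form: train only a new linear head
# on the frozen features, by plain logistic-regression gradient descent.
w, b = np.zeros(16), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(feats @ w + b)))
    g = p - y
    w -= 0.5 * feats.T @ g / len(y)
    b -= 0.5 * g.mean()

acc = ((feats @ w + b > 0) == (y == 1)).mean()
```

The study compared "various transfer learning approaches"; this sketch shows only the simplest one (freeze the backbone, retrain the head).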

Findings: In an internal test set of 1560 chest radiographs from 455 patients with acute hypoxaemic respiratory failure, a CNN could detect ARDS with an area under the receiver operator characteristics curve (AUROC) of 0·92 (95% CI 0·89-0·94). In the subgroup of 413 images reviewed by at least six physicians, its AUROC was 0·93 (95% CI 0·88-0·96), sensitivity 83·0% (95% CI 74·0-91·1), and specificity 88·3% (95% CI 83·1-92·8). Among images with zero of six ARDS annotations (n=155), the median CNN probability was 11%, with six (4%) assigned a probability above 50%. Among images with six of six ARDS annotations (n=27), the median CNN probability was 91%, with two (7%) assigned a probability below 50%. In an external cohort of 958 chest radiographs from 431 patients with sepsis, the AUROC was 0·88 (95% CI 0·85-0·91). When radiographs annotated as equivocal were excluded, the AUROC was 0·93 (0·92-0·95).
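
The reported metrics have simple definitions: AUROC is the probability that a randomly chosen ARDS radiograph receives a higher CNN probability than a randomly chosen non-ARDS one, and sensitivity and specificity follow from thresholding the probability (at 50%, as in the annotation counts above). A minimal sketch with made-up scores, not the study's data:

```python
import numpy as np

def auroc(y_true, scores):
    """AUROC via the Mann-Whitney statistic: probability that a random
    positive scores above a random negative (ties count half)."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def sens_spec(y_true, scores, threshold=0.5):
    pred = scores >= threshold
    sens = (pred & (y_true == 1)).sum() / (y_true == 1).sum()
    spec = (~pred & (y_true == 0)).sum() / (y_true == 0).sum()
    return sens, spec

# Illustrative labels and CNN probabilities
y = np.array([0, 0, 0, 1, 1, 1])
p = np.array([0.1, 0.3, 0.6, 0.4, 0.8, 0.9])
print(auroc(y, p))       # 8 of 9 positive/negative pairs ordered correctly
print(sens_spec(y, p))
```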

Interpretation: A CNN can be trained to achieve expert physician-level performance in ARDS detection on chest radiographs. Further research is needed to evaluate the use of these algorithms to support real-time identification of ARDS patients to ensure fidelity with evidence-based care or to support ongoing ARDS research.

Funding: National Institutes of Health, Department of Defense, and Department of Veterans Affairs.


Conflict of interest statement

Declaration of interests The University of Michigan has filed a US Utility Patent application (number 17/082,145; University of Michigan IR number 2020-026, "Computer vision technologies for rapid disease detection") for software technology that processes chest radiographs to detect acute diseases. MWS, DT, CEG, and KRW report being co-inventors on this application, which is related to the work reported in this Article. NJM reports fees paid to her institution by Quantum Leap Healthcare Consortium, Biomark, Athersys, and The Marcus Foundation for work unrelated to the current Article. All other authors declare no competing interests.

Figures

Figure 1: CNN performance for identifying ARDS on chest radiographs compared with individual physician performance in the internal holdout test set
The deep CNN was compared with individual physicians in the subgroup of 413 chest radiographs that were each reviewed by at least six physicians, including a chest radiologist and physicians trained in intensive care medicine. Individual physician performance was determined against a reference standard derived from the ARDS annotations of the five other physicians reviewing the same radiograph. (A) CNN receiver operator characteristic curve plotted against individual physician TPR and FPR, with AUROC. (B) CNN precision-recall curve plotted against individual physician precision (PPV) and recall (sensitivity), with AUPRC. (C) CNN probability outputs for chest radiographs grouped by the number of physicians annotating each as ARDS. Boxplots show median, 25th and 75th percentiles, and 1·5 × IQR; dots represent points outside this range. CNN=convolutional neural network. ARDS=acute respiratory distress syndrome. AUROC=area under the receiver operator characteristic curve. AUPRC=area under the precision-recall curve. TPR=true positive rate. FPR=false positive rate. PPV=positive predictive value.
Figure 2: Visualising CNN activations in chest radiographs for error analysis in ARDS detection
Chest radiographs were grouped based on CNN probabilities of ARDS and physician ARDS annotations and then Grad-CAM was used to localise areas used by the CNN to identify ARDS within the radiographs. The heat map illustrates the importance of local areas within the image for classification. The importance value is scaled between 0 and 1 where a higher number indicates that the area is of higher importance for classifying the image as consistent with ARDS. (A) Chest radiographs annotated as ARDS by six of six physicians and assigned a high CNN probability. (B) Chest radiographs scored as consistent by six of six physicians but assigned a lower probability by the CNN. (C) Chest radiographs annotated as ARDS by zero of six physicians but assigned a high probability by the CNN. (D) Chest radiographs with disagreement among physicians (three of six physicians annotating ARDS) and assigned a high probability by the CNN. CNN=convolutional neural network. ARDS=acute respiratory distress syndrome. Grad-CAM=gradient-weighted class activation mapping. P(ARDS)=probability that the chest radiograph is consistent with ARDS.
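
The Grad-CAM maps described above follow a simple recipe: average the gradients of the ARDS score over each feature map to obtain one weight per channel, form the weighted sum of the feature maps, clip at zero, and rescale to [0, 1]. A toy sketch, with random arrays standing in for the activations and gradients that a real pipeline would capture from the CNN via a backward hook:

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-ins shaped (channels, height, width): A_k and dP(ARDS)/dA_k
activations = rng.random((8, 7, 7))
gradients = rng.normal(size=(8, 7, 7))

# alpha_k: global-average-pooled gradients, one weight per channel
alpha = gradients.mean(axis=(1, 2))

# Weighted sum of feature maps, ReLU, then scale to [0, 1] for display
cam = np.maximum((alpha[:, None, None] * activations).sum(axis=0), 0)
cam = cam / cam.max() if cam.max() > 0 else cam
```

Upsampling `cam` to the radiograph's resolution then yields the heat map overlaid in the figure.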
Figure 3: CNN performance for identifying ARDS on chest radiographs by patient subgroups
Race categories were self-reported; the category other includes patients who are Asian, American Indian, Native Alaskan, Native Hawaiian, other Pacific Islander, or of unknown race. Error bars represent 95% CI estimates of the AUROC. CNN=convolutional neural network. ARDS=acute respiratory distress syndrome. AUROC=area under the receiver operator characteristic curve. BMI=body-mass index.
Figure 4: CNN performance for identifying ARDS on chest radiographs in an external test set
(A) Receiver operator characteristic curve, plotting the TPR against the FPR. (B) Precision-recall curve, plotting the PPV against the model sensitivity. (C) Probability outputs from the CNN across chest radiograph annotation categories; boxplots show median, 25th and 75th percentiles, and 1·5 × IQR, and dots represent points outside this range. CNN=convolutional neural network. ARDS=acute respiratory distress syndrome. AUROC=area under the receiver operator characteristic curve. AUPRC=area under the precision-recall curve. TPR=true positive rate. FPR=false positive rate. PPV=positive predictive value.

