BMJ Open. 2021 Dec 7;11(12):e053024. doi: 10.1136/bmjopen-2021-053024.

Do comprehensive deep learning algorithms suffer from hidden stratification? A retrospective study on pneumothorax detection in chest radiography

Jarrel Seah¹,², Cyril Tang², Quinlan D Buchlak²,³, Michael Robert Milne², Xavier Holt², Hassan Ahmad², John Lambert², Nazanin Esmaili³,⁴, Luke Oakden-Rayner⁵, Peter Brotchie²,⁶, Catherine M Jones²,⁷

Affiliations

¹ Radiology, Alfred Health, Melbourne, Victoria, Australia (jarrel.seah@annalise.ai).
² annalise.ai, Sydney, New South Wales, Australia.
³ University of Notre Dame Australia, Sydney, New South Wales, Australia.
⁴ University of Technology Sydney, Sydney, New South Wales, Australia.
⁵ Australian Institute for Machine Learning, The University of Adelaide, Adelaide, South Australia, Australia.
⁶ Radiology, St Vincent's Hospital Melbourne Pty Ltd, Fitzroy, Victoria, Australia.
⁷ I-MED Radiology, Brisbane, Queensland, Australia.

PMID: 34876430
PMCID: PMC8655590
DOI: 10.1136/bmjopen-2021-053024

Jarrel Seah et al. BMJ Open. 2021.

Abstract

Objectives: To evaluate the ability of a commercially available comprehensive chest radiography deep convolutional neural network (DCNN) to detect simple and tension pneumothorax, as stratified by the following subgroups: the presence of an intercostal drain; rib, clavicular, scapular or humeral fractures or rib resections; subcutaneous emphysema; and erect versus non-erect positioning. The hypothesis was that performance in each of these subgroups would not differ significantly from performance on the overall test dataset.

Design: A retrospective case-control study was undertaken.

Setting: Community radiology clinics and hospitals in Australia and the USA.

Participants: A test dataset of 2557 chest radiography studies was ground-truthed by three subspecialty thoracic radiologists for the presence of simple or tension pneumothorax as well as each subgroup other than positioning. Radiograph positioning was derived from radiographer annotations on the images.

Outcome measures: DCNN performance for detecting simple and tension pneumothorax was evaluated over the entire test set, as well as within each subgroup, using the area under the receiver operating characteristic curve (AUC). A difference in AUC of more than 0.05 was considered clinically significant.
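The outcome measure above can be illustrated with a minimal sketch. It assumes the AUC is computed as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, and that the 0.05 threshold is applied to the drop from the overall AUC to the subgroup AUC. The function names, the subgroup label and the toy data are hypothetical and are not taken from the study.

```python
from typing import Dict, Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random positive case outscores a random negative one (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def subgroup_auc_gaps(
    scores: Sequence[float],
    labels: Sequence[int],
    subgroups: Dict[str, Sequence[bool]],
    margin: float = 0.05,  # clinically significant difference stated in the abstract
) -> Tuple[float, Dict[str, dict]]:
    """Compare the AUC inside each subgroup with the AUC over the whole test set."""
    overall = auc(scores, labels)
    report = {}
    for name, mask in subgroups.items():
        sub_scores = [s for s, m in zip(scores, mask) if m]
        sub_labels = [y for y, m in zip(labels, mask) if m]
        sub_auc = auc(sub_scores, sub_labels)
        gap = overall - sub_auc  # positive gap: the subgroup performs worse
        report[name] = {"auc": sub_auc, "gap": gap, "exceeds_margin": gap > margin}
    return overall, report


# Toy data for illustration only (not study data).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
subgroups = {"intercostal_drain": [True, False, True, False, True, True, False, False]}

overall_auc, report = subgroup_auc_gaps(scores, labels, subgroups)
print(overall_auc, report)
```

The pairwise AUC is quadratic in the number of cases, which is acceptable for a sketch; a rank-based formulation or an established library routine would be preferable at the scale of the 2557-study test set.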

Results: When compared with the overall test set, performance of the DCNN for detecting simple and tension pneumothorax was statistically non-inferior in all subgroups. The DCNN had an AUC of 0.981 (0.976-0.986) for detecting simple pneumothorax and 0.997 (0.995-0.999) for detecting tension pneumothorax.
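One way to read a non-inferiority result of this kind is sketched below. The abstract does not state how the adjusted 95% CIs were obtained, so this sketch substitutes a plain percentile bootstrap, which is an assumption and not the authors' method. It treats a subgroup as non-inferior when the upper bound of the CI for (overall AUC minus subgroup AUC) lies below the 0.05 margin. All names and the toy data are hypothetical; an adjustment for multiple subgroups could be approximated by passing a smaller `alpha`.

```python
import random
from typing import Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pairwise AUC: share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_gap_ci(
    scores: Sequence[float],
    labels: Sequence[int],
    mask: Sequence[bool],
    n_boot: int = 500,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap CI for (overall AUC - subgroup AUC)."""
    rng = random.Random(seed)
    n = len(scores)
    gaps = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        all_s = [scores[i] for i in idx]
        all_y = [labels[i] for i in idx]
        sub_s = [scores[i] for i in idx if mask[i]]
        sub_y = [labels[i] for i in idx if mask[i]]
        # AUC is undefined without both classes, so such resamples are skipped.
        if len(set(all_y)) < 2 or len(set(sub_y)) < 2:
            continue
        gaps.append(auc(all_s, all_y) - auc(sub_s, sub_y))
    if not gaps:
        raise ValueError("no usable bootstrap resamples")
    gaps.sort()
    lower = gaps[int((alpha / 2) * (len(gaps) - 1))]
    upper = gaps[int((1 - alpha / 2) * (len(gaps) - 1))]
    return lower, upper


def is_non_inferior(ci: Tuple[float, float], margin: float = 0.05) -> bool:
    """Non-inferior when even the upper CI bound of the AUC drop stays below the margin."""
    return ci[1] < margin


# Toy data for illustration only: scores separate the classes perfectly.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
mask = [True, False, True, False, True, False, True, False]

ci = bootstrap_gap_ci(scores, labels, mask)
print(ci, is_non_inferior(ci))
```

Resampling cases and recomputing both AUCs on the same resample keeps the overlap between the subgroup and the overall test set intact, which matters because the subgroup is a subset of the set it is compared against.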

Conclusions: Hidden stratification has significant implications for potential failures of deep learning when applied in clinical practice. This study demonstrated that a comprehensively trained DCNN can be resilient to hidden stratification in several clinically meaningful subgroups in detecting pneumothorax.

Keywords: accident & emergency medicine; chest imaging; health informatics.

Conflict of interest statement

Competing interests: All authors have reviewed and approved this manuscript. Authors JS, CT, QDB, MRM, XH, HA, JL, PB and CMJ are employees of, or are seconded to, Annalise.ai. NE and LO-R have no interests to declare.

Figures

**Figure 1**
Difference in AUC for detecting simple pneumothorax in the test dataset versus each specific subgroup with adjusted 95% CI. AUC, area under the receiver operating characteristic curve.

**Figure 2**
Difference in AUC for detecting tension pneumothorax in the test dataset versus each specific subgroup with adjusted 95% CI. AUC, area under the receiver operating characteristic curve.
