Semi-supervised Learning for Generalizable Intracranial Hemorrhage Detection and Segmentation

Emily Lin¹, Esther L Yuh¹

PMID: 38446043
PMCID: PMC11140498
DOI: 10.1148/ryai.230077

Radiol Artif Intell. 2024 May;6(3):e230077.

Affiliation

¹ From the Department of Radiology & Biomedical Imaging, University of California San Francisco, 185 Berry St, San Francisco CA 94107.

Abstract

Purpose: To develop and evaluate a semi-supervised learning model for intracranial hemorrhage detection and segmentation on an out-of-distribution head CT evaluation set.

Materials and Methods: This retrospective study used semi-supervised learning to bootstrap performance. An initial "teacher" deep learning model was trained on 457 pixel-labeled head CT scans collected from one U.S. institution from 2010 to 2017 and used to generate pseudo labels on a separate unlabeled corpus of 25 000 examinations from the Radiological Society of North America and American Society of Neuroradiology. A second "student" model was trained on this combined pixel- and pseudo-labeled dataset. Hyperparameter tuning was performed on a validation set of 93 scans. Testing for both classification (n = 481 examinations) and segmentation (n = 23 examinations, or 529 images) was performed on CQ500, a dataset of 481 scans performed in India, to evaluate out-of-distribution generalizability. The semi-supervised model was compared with a baseline model trained on only labeled data using area under the receiver operating characteristic curve, Dice similarity coefficient, and average precision metrics.

Results: The semi-supervised model achieved a statistically significantly higher examination area under the receiver operating characteristic curve on CQ500 compared with the baseline (0.939 [95% CI: 0.938, 0.940] vs 0.907 [95% CI: 0.906, 0.908]; P = .009). It also achieved a higher Dice similarity coefficient (0.829 [95% CI: 0.825, 0.833] vs 0.809 [95% CI: 0.803, 0.812]; P = .012) and pixel average precision (0.848 [95% CI: 0.843, 0.853] vs 0.828 [95% CI: 0.817, 0.828]) compared with the baseline.

Conclusion: The addition of unlabeled data in a semi-supervised learning framework demonstrates stronger generalizability potential for intracranial hemorrhage detection and segmentation compared with a supervised baseline.
Supplemental material is available for this article. Published under a CC BY 4.0 license. See also the commentary by Swinburne in this issue.

Keywords: CT; Machine Learning; Semi-supervised Learning; Traumatic Brain Injury.
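
For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how a Dice similarity coefficient and an area under the receiver operating characteristic curve can be computed. It is illustrative only and is not the authors' evaluation code. The function names `dice_coefficient` and `roc_auc`, the flat 0/1 mask representation, and the convention that two empty masks score 1.0 are all assumptions made for this example.

```python
def dice_coefficient(pred, ref):
    """Dice similarity coefficient between two binary masks.

    Both masks are flat sequences of 0/1 values of equal length
    (an assumed representation; real masks are 2D or 3D arrays).
    """
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive examination
    receives a higher score than a randomly chosen negative one,
    with ties counted as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

The pairwise form of `roc_auc` is quadratic in the number of examinations, which is acceptable at the scale of a test set such as the 481 examinations described above; a rank-based implementation would be preferable for much larger sets.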

Conflict of interest statement

Disclosures of conflicts of interest: E.L. No relevant relationships. E.L.Y. Salary from the National Institutes of Health and the Department of Defense paid to institution; royalties for a deep learning CT intracranial hemorrhage segmentation application; U.S. patent on deep learning for head computed tomography issued to institution.

Figures

**Figure 1:** Schematic of the semi-supervised noisy student approach. Each color signifies data from a different institution. **(A)** Schematic shows the workflow at training time, which is explained in further detail in the Semi-supervised Algorithm Development section. **(B)** Schematic shows the workflow at test time as the student model is evaluated on both the CQ500 overall dataset and pixel-label subset to evaluate examination-level and pixel-level performances.
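
The training-time workflow in panel A can be sketched with a toy example. This is a minimal illustration of teacher-student pseudo-labeling under strong simplifying assumptions, not the authors' implementation: the "model" here is a single intensity threshold rather than a deep network, the noise that a noisy student approach adds during student training is omitted, and the names `fit_threshold`, `predict`, and `train_student` are hypothetical.

```python
def fit_threshold(values, labels):
    """Fit a toy classifier: the midpoint between the two class means."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to fit a threshold")
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0


def predict(threshold, values):
    """Label a value positive when it exceeds the threshold."""
    return [1 if v > threshold else 0 for v in values]


def train_student(labeled_x, labeled_y, unlabeled_x):
    """Teacher-student pseudo-labeling in three steps."""
    # 1. Train the teacher on the labeled data only.
    teacher = fit_threshold(labeled_x, labeled_y)
    # 2. Use the teacher to generate pseudo labels for the unlabeled data.
    pseudo_y = predict(teacher, unlabeled_x)
    # 3. Train the student on the combined labeled and pseudo-labeled data.
    student = fit_threshold(labeled_x + unlabeled_x, labeled_y + pseudo_y)
    return teacher, student


# Toy data: two labeled values and four unlabeled ones.
teacher, student = train_student([1.0, 3.0], [0, 1], [0.0, 0.5, 4.0, 5.0])
```

In this toy run the student's decision threshold differs from the teacher's because it is fitted on the larger combined set, which is the basic mechanism by which unlabeled data can shift a model toward the distribution it was drawn from.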

**Figure 2:** **(A–D)** Images show visualization of false-positive predictions, which are frequently made without the ranker. Green indicates the model predictions. These axial CT images obtained without contrast material administration are from the validation set.

**Figure 3:** **(A–D)** Images show visualization of model predictions on the validation set using the baseline and semi-supervised (SS) models. Red is the reference standard label, green is the model’s positive prediction, and yellow is the overlap. These are axial CT images obtained without contrast material administration.

Comment in

When the Student Becomes the Master: Boosting Intracranial Hemorrhage Detection Generalizability with Teacher-Student Learning.
Swinburne N. Radiol Artif Intell. 2024 May;6(3):e240126. doi: 10.1148/ryai.240126. PMID: 38597790.
