Deep Learning Pitfall: Impact of Novel Ultrasound Equipment Introduction on Algorithm Performance and the Realities of Domain Adaptation
- PMID: 34133034
- DOI: 10.1002/jum.15765
Abstract
Objectives: To test how introducing novel ultrasound equipment into a clinical setting affects deep learning (DL) algorithm performance.
Methods: Researchers introduced prospectively obtained inferior vena cava (IVC) videos from a similar patient population using novel ultrasound equipment to challenge a previously validated DL algorithm (trained on a common point of care ultrasound [POCUS] machine) to assess IVC collapse. Twenty-one new videos were obtained for each novel ultrasound machine. The videos were analyzed for complete collapse by the algorithm and by 2 blinded POCUS experts. Cohen's kappa was calculated for agreement between the 2 POCUS experts and the DL algorithm. Previous testing showed substantial agreement between the algorithm and the experts, with Cohen's kappa of 0.78 (95% CI 0.49-1.0) and 0.66 (95% CI 0.31-1.0) on new patient data using the same ultrasound equipment.
Results: Challenged with higher image quality (IQ) POCUS cart ultrasound videos, algorithm performance declined with kappa values of 0.31 (95% CI 0.19-0.81) and 0.39 (95% CI 0.11-0.89), showing fair agreement. Algorithm performance plummeted on a lower IQ smartphone device, with kappa values of -0.09 (95% CI -0.95 to 0.76) and 0.09 (95% CI -0.65 to 0.82), showing agreement near or below the level expected by chance. The 2 POCUS experts had near-perfect agreement with each other regarding IVC collapse, with a kappa value of 0.88 (95% CI 0.64-1.0).
Conclusions: Performance of this previously validated DL algorithm worsened when faced with ultrasound studies from 2 novel ultrasound machines. Performance was much worse on images from a lower IQ hand-held device than from a superior cart-based device.
Keywords: artificial intelligence; deep learning; domain shift; inferior vena cava; pediatrics; point of care ultrasound.
© 2021 American Institute of Ultrasound in Medicine.
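The agreement statistic reported above can be illustrated with a minimal sketch of Cohen's kappa for two raters (for example, the algorithm and one expert) labeling the same videos as collapsed or not. The function names and the example labels are hypothetical and are not taken from the study. The descriptive bands assume the commonly used Landis and Koch scale, which matches the terms used in the abstract ("fair", "substantial", "near perfect") but is not named in it.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: derived from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[lab] * count_b[lab] for lab in labels) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


def describe_agreement(kappa):
    """Map kappa to a descriptive band (assumed Landis and Koch scale)."""
    if kappa < 0:
        return "less than chance"
    for upper, name in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                        (0.80, "substantial"), (1.00, "almost perfect")]:
        if kappa <= upper:
            return name
    raise ValueError("kappa cannot exceed 1")


# Hypothetical labels: 1 = complete IVC collapse, 0 = no complete collapse.
algorithm = [1, 1, 0, 0]
expert = [1, 0, 0, 0]
print(cohen_kappa(algorithm, expert))  # 0.5
```

With only 21 videos per device, as in this study, a point estimate like this carries wide uncertainty, which is consistent with the broad confidence intervals reported in the Results.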