J Ultrasound Med. 2022 Apr;41(4):855-863.
doi: 10.1002/jum.15765. Epub 2021 Jun 16.

Deep Learning Pitfall: Impact of Novel Ultrasound Equipment Introduction on Algorithm Performance and the Realities of Domain Adaptation

Michael Blaivas et al. J Ultrasound Med. 2022 Apr.

Abstract

Objectives: To test how introducing novel ultrasound equipment into a clinical setting affects deep learning (DL) algorithm performance.

Methods: Researchers challenged a previously validated DL algorithm for assessing inferior vena cava (IVC) collapse, trained on a common point of care ultrasound (POCUS) machine, with prospectively obtained IVC videos from a similar patient population acquired on novel ultrasound equipment. Twenty-one new videos were obtained for each novel ultrasound machine. The videos were analyzed for complete collapse by the algorithm and by 2 blinded POCUS experts. Cohen's kappa was calculated for agreement between the 2 POCUS experts and the DL algorithm. Previous testing on new patient data using the same ultrasound equipment showed substantial agreement between algorithm and experts, with Cohen's kappa of 0.78 (95% CI 0.49-1.0) and 0.66 (95% CI 0.31-1.0).
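As an illustration of the agreement statistic named in the Methods, the following is a minimal sketch of Cohen's kappa for two raters, computed as (p_o - p_e) / (1 - p_e) from the observed agreement p_o and the chance-expected agreement p_e. The function name `cohen_kappa` and the 21 binary labels are hypothetical; they are not the study's data or code, and confidence intervals are not computed here.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies,
    # summed over labels.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined,
        # so this sketch reports perfect agreement by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for 21 videos (1 = complete IVC collapse, 0 = no collapse).
expert_labels = [1] * 10 + [0] * 11
algorithm_labels = [1] * 8 + [0] * 2 + [1] * 1 + [0] * 10

kappa = cohen_kappa(expert_labels, algorithm_labels)
print(round(kappa, 3))  # 0.712
```

With these made-up labels the raters agree on 18 of 21 videos, yet kappa is lower than the raw agreement of 0.857 because roughly half of that agreement would be expected by chance.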

Results: Challenged with higher image quality (IQ) POCUS cart ultrasound videos, algorithm performance declined, with kappa values of 0.31 (95% CI 0.19-0.81) and 0.39 (95% CI 0.11-0.89), showing fair agreement. Algorithm performance plummeted on a lower-IQ smartphone device, with kappa values of -0.09 (95% CI -0.95 to 0.76) and 0.09 (95% CI -0.65 to 0.82), respectively, showing less agreement than would be expected by chance. The two POCUS experts had near-perfect agreement regarding IVC collapse, with a kappa value of 0.88 (95% CI 0.64-1.0).
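The descriptive terms attached to these kappa values (substantial, fair, near perfect) resemble the widely cited Landis and Koch bands. Whether the authors applied exactly this scale is an assumption. The sketch below, using a hypothetical `agreement_label` helper, simply maps a kappa point estimate to those bands and ignores the confidence intervals.

```python
# Upper bounds of the Landis and Koch bands for non-negative kappa values.
# Applying this scale to the abstract is an assumption, not the authors' stated method.
BANDS = [
    (0.20, "slight"),
    (0.40, "fair"),
    (0.60, "moderate"),
    (0.80, "substantial"),
    (1.00, "almost perfect"),
]


def agreement_label(kappa):
    """Return the descriptive band for a kappa point estimate."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie in [-1, 1]")
    if kappa < 0:
        return "poor"  # agreement below chance level
    for upper, label in BANDS:
        if kappa <= upper:
            return label
    return "almost perfect"  # unreachable for valid input; kept for safety


# Point estimates reported in the abstract.
for value in (0.78, 0.66, 0.31, 0.39, -0.09, 0.09, 0.88):
    print(f"{value:+.2f} -> {agreement_label(value)}")
```

Because each comparison rests on only 21 videos, the wide confidence intervals reported above matter as much as the band a point estimate falls into.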

Conclusions: Performance of this previously validated DL algorithm worsened when faced with ultrasound studies from 2 novel ultrasound machines. Performance was much worse on images from a lower IQ hand-held device than from a superior cart-based device.

Keywords: artificial intelligence; deep learning; domain shift; inferior vena cava; pediatrics; point of care ultrasound.
