SonoNet: Real-Time Detection and Localisation of Fetal Standard Scan Planes in Freehand Ultrasound

Christian F Baumgartner et al. IEEE Trans Med Imaging. 2017 Nov;36(11):2204-2215.
doi: 10.1109/TMI.2017.2712367. Epub 2017 Jul 11.

Abstract

Identifying and interpreting fetal standard scan planes during 2-D ultrasound mid-pregnancy examinations are highly complex tasks, which require years of training. Apart from guiding the probe to the correct location, it can be equally difficult for a non-expert to identify relevant structures within the image. Automatic image processing can provide tools to help experienced as well as inexperienced operators with these tasks. In this paper, we propose a novel method based on convolutional neural networks, which can automatically detect 13 fetal standard views in freehand 2-D ultrasound data as well as provide a localization of the fetal structures via a bounding box. An important contribution is that the network learns to localize the target anatomy using weak supervision based on image-level labels only. The network architecture is designed to operate in real-time while providing optimal output for the localization task. We present results for real-time annotation, retrospective frame retrieval from saved videos, and localization on a very large and challenging dataset consisting of images and video recordings of full clinical anomaly screenings. We found that the proposed method achieved an average F1-score of 0.798 in a realistic classification experiment modeling real-time detection, and obtained a 90.09% accuracy for retrospective frame retrieval. Moreover, an accuracy of 77.8% was achieved on the localization task.


Figures

Fig. 1.
Overview of proposed SonoNet: (a) 2D fetal ultrasound data can be processed in real-time through our proposed convolutional neural network to determine if the current frame contains one of 13 fetal standard views (here the 4 chamber view (4CH) is shown); (b) if a standard view was detected, its location can be determined through a backward pass through the network.
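The caption describes a two-stage loop: a forward pass classifies each incoming frame, and a backward pass is run only when a standard view is detected. A minimal Python sketch of that control flow follows; `model`, `compute_saliency`, and `saliency_to_bbox` are hypothetical stand-ins (the latter two are sketched under Figs. 3 and 4 below), and treating class 0 as background is an assumption.

```python
# Hedged sketch of the Fig. 1 inference loop, not the authors' released code.
import torch

def process_frame(model, frame):
    # Forward pass: classify the current frame (no gradients needed here).
    with torch.no_grad():
        probs = torch.softmax(model(frame.unsqueeze(0)), dim=1)
        conf, view = probs.max(dim=1)
    if view.item() == 0:  # assumed background class index: nothing to localise
        return None
    # Running the backward pass only for detected views keeps the loop real-time.
    saliency = compute_saliency(model, frame, view.item())  # hypothetical helper
    return view.item(), conf.item(), saliency_to_bbox(saliency)  # hypothetical helper
```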
Fig. 7.
Results of retrospective retrieval for two example subjects. The respective top rows show the ground truth (GT) saved by the sonographer. The bottom rows show the retrieved (RET) frames. For subject (a) all frames have been correctly retrieved. For subject (b) the frames marked with red have been incorrectly retrieved.
Fig. 2.
Overview of proposed network architectures. Each network consists of a feature extractor, an adaptation layer, and the final classification layer. All convolutional operations are denoted by square brackets, using the notation [kernel size × number of kernels / stride]. The factor in front of the square brackets indicates how many times this operation is repeated. Max-pooling (MP) is always performed with a kernel size of 2×2 and a stride of 2. All convolutions are followed by a batch normalisation layer before the ReLU activation, except in the SmallNet network, for which no batch normalisation was used.
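As a reading aid, here is a PyTorch sketch of the block structure the caption specifies (convolutions each followed by batch normalisation then ReLU, 2×2/2 max-pooling between blocks, and a 1×1 adaptation stage before global pooling). The kernel size, channel widths, block counts, and class count are illustrative assumptions, not the paper's exact SonoNet configurations.

```python
# Illustrative sketch only; widths and repeats are assumptions, not the paper's.
import torch.nn as nn

def conv_block(in_ch, out_ch, repeats):
    """`repeats` x [3x3 conv / stride 1], each followed by BN then ReLU."""
    layers = []
    for i in range(repeats):
        layers += [
            nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, stride=1, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
        ]
    return nn.Sequential(*layers)

class SonoNetSketch(nn.Module):
    def __init__(self, num_classes=14, width=32):  # width=32 loosely echoes "SonoNet-32"
        super().__init__()
        self.features = nn.Sequential(
            conv_block(1, width, 2), nn.MaxPool2d(2, 2),           # MP: 2x2, stride 2
            conv_block(width, 2 * width, 2), nn.MaxPool2d(2, 2),
            conv_block(2 * width, 4 * width, 3), nn.MaxPool2d(2, 2),
            conv_block(4 * width, 8 * width, 3), nn.MaxPool2d(2, 2),
            conv_block(8 * width, 8 * width, 3),
        )
        # Adaptation + classification via 1x1 convs keeps the output spatial,
        # so class score maps survive until the final global pooling.
        self.adaptation = nn.Sequential(
            nn.Conv2d(8 * width, 2 * width, 1),
            nn.BatchNorm2d(2 * width),
            nn.ReLU(),
            # Final 1x1 classification conv; left linear here so the global
            # average pool below yields class logits (an assumption).
            nn.Conv2d(2 * width, num_classes, 1),
        )

    def forward(self, x):
        maps = self.adaptation(self.features(x))  # spatial class score maps
        return maps.mean(dim=(2, 3))              # global average pooling -> logits
```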
Fig. 3.
Examples of saliency maps. Column (a) shows three different input frames, (b) shows the corresponding class score maps obtained in the forward pass of the network, (c) shows saliency maps obtained using the method by Springenberg et al., and (d) shows the saliency maps resulting from our proposed method. Some of the unwanted saliency artefacts are highlighted with arrows in (c).
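For context on column (c): guided backpropagation in the sense of Springenberg et al. modifies the ReLU backward pass so that only positive gradients propagate. Below is a generic PyTorch sketch of that baseline; it does not reproduce the paper's proposed modification shown in column (d), and it assumes the model uses non-inplace ReLUs.

```python
# Generic guided-backprop saliency sketch, not the paper's modified variant.
import torch
import torch.nn as nn

def guided_backprop_saliency(model, frame, target_class):
    model.eval()
    hooks = []

    def relu_hook(module, grad_input, grad_output):
        # Guided rule: in addition to the usual ReLU mask, pass back
        # only non-negative gradients.
        return (torch.clamp(grad_input[0], min=0.0),)

    for m in model.modules():
        if isinstance(m, nn.ReLU):
            hooks.append(m.register_full_backward_hook(relu_hook))

    x = frame.unsqueeze(0).clone().requires_grad_(True)
    model(x)[0, target_class].backward()  # backward pass from the class score
    saliency = x.grad.abs().squeeze(0)

    for h in hooks:
        h.remove()
    return saliency
```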
Fig. 4.
Examples of saliency map post-processing for two challenging views: (a) shows two input images, (b) shows the resulting confidence maps for those images, and (c) shows the resulting bounding boxes.
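One plausible way to implement the post-processing from (b) to (c) is to smooth the confidence map, threshold it relative to its maximum, and take the bounding box of the largest connected component. The Gaussian sigma and threshold fraction below are illustrative assumptions, not the paper's settings.

```python
# Hedged post-processing sketch: confidence map -> bounding box.
import numpy as np
from scipy import ndimage

def saliency_to_bbox(confidence, sigma=5.0, frac=0.5):
    """Smooth, threshold, and box the largest blob; returns (x0, y0, x1, y1)."""
    smooth = ndimage.gaussian_filter(confidence, sigma=sigma)
    mask = smooth >= frac * smooth.max()
    labels, n = ndimage.label(mask)
    if n == 0:
        return None
    # Pick the connected component with the largest area.
    sizes = ndimage.sum(mask, labels, index=range(1, n + 1))
    largest = 1 + int(np.argmax(sizes))
    ys, xs = np.where(labels == largest)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```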
Fig. 5.
Class confusion matrix for SonoNet-32.
Fig. 6.
Examples of video frames labelled as background but classified as one of three standard views. The first three columns were randomly sampled from the set of false positives and are in fact correct detections. The last column shows manually selected true failure cases.
Fig. 8.
Examples of weakly supervised localisation using the SonoNet-32. The first three columns for each view show correct bounding boxes, marked in green; the respective last column shows an example of an incorrect localisation, marked in red. The ground truth bounding boxes are shown in white.
