SonoNet: Real-Time Detection and Localisation of Fetal Standard Scan Planes in Freehand Ultrasound

Christian F Baumgartner et al. IEEE Trans Med Imaging. 2017 Nov;36(11):2204-2215.
doi: 10.1109/TMI.2017.2712367. Epub 2017 Jul 11.

Abstract

Identifying and interpreting fetal standard scan planes during 2-D ultrasound mid-pregnancy examinations are highly complex tasks, which require years of training. Apart from guiding the probe to the correct location, it can be equally difficult for a non-expert to identify relevant structures within the image. Automatic image processing can provide tools to help experienced as well as inexperienced operators with these tasks. In this paper, we propose a novel method based on convolutional neural networks, which can automatically detect 13 fetal standard views in freehand 2-D ultrasound data as well as provide a localization of the fetal structures via a bounding box. An important contribution is that the network learns to localize the target anatomy using weak supervision based on image-level labels only. The network architecture is designed to operate in real-time while providing optimal output for the localization task. We present results for real-time annotation, retrospective frame retrieval from saved videos, and localization on a very large and challenging dataset consisting of images and video recordings of full clinical anomaly screenings. We found that the proposed method achieved an average F1-score of 0.798 in a realistic classification experiment modeling real-time detection, and obtained a 90.09% accuracy for retrospective frame retrieval. Moreover, an accuracy of 77.8% was achieved on the localization task.


Figures

Fig. 1.
Overview of proposed SonoNet: (a) 2D fetal ultrasound data can be processed in real-time through our proposed convolutional neural network to determine if the current frame contains one of 13 fetal standard views (here the 4 chamber view (4CH) is shown); (b) if a standard view was detected, its location can be determined through a backward pass through the network.
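The caption describes a two-stage loop: a forward pass classifies each incoming frame, and a backward pass is run only when a standard view is detected. A minimal Python sketch of that control flow follows; `model`, `compute_saliency`, and `saliency_to_bbox` are hypothetical stand-ins (the latter two are sketched under Figs. 3 and 4 below), and treating class 0 as background is an assumption.

```python
# Hedged sketch of the Fig. 1 inference loop, not the authors' released code.
import torch

def process_frame(model, frame):
    # Forward pass: classify the current frame (no gradients needed here).
    with torch.no_grad():
        probs = torch.softmax(model(frame.unsqueeze(0)), dim=1)
        conf, view = probs.max(dim=1)
    if view.item() == 0:  # assumed background class index: nothing to localise
        return None
    # Running the backward pass only for detected views keeps the loop real-time.
    saliency = compute_saliency(model, frame, view.item())  # hypothetical helper
    return view.item(), conf.item(), saliency_to_bbox(saliency)  # hypothetical helper
```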
Fig. 7.
Results of retrospective retrieval for two example subjects. The respective top rows show the ground truth (GT) saved by the sonographer. The bottom rows show the retrieved (RET) frames. For subject (a) all frames have been correctly retrieved. For subject (b) the frames marked with red have been incorrectly retrieved.
Fig. 2.
Overview of proposed network architectures. Each network consists of a feature extractor, an adaptation layer, and the final classification layer. All convolutional operations are denoted by square brackets, using the notation [kernel size × number of kernels / stride]. The factor in front of the square brackets indicates how many times this operation is repeated. Max-pooling (MP) is always performed with a kernel size of 2×2 and a stride of 2. All convolutions are followed by a batch normalisation layer before the ReLU activation, except in the SmallNet network, for which no batch normalisation was used.
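As a reading aid, here is a PyTorch sketch of the block structure the caption specifies (convolutions each followed by batch normalisation then ReLU, 2×2/2 max-pooling between blocks, and a 1×1 adaptation stage before global pooling). The kernel size, channel widths, block counts, and class count are illustrative assumptions, not the paper's exact SonoNet configurations.

```python
# Illustrative sketch only; widths and repeats are assumptions, not the paper's.
import torch.nn as nn

def conv_block(in_ch, out_ch, repeats):
    """`repeats` x [3x3 conv / stride 1], each followed by BN then ReLU."""
    layers = []
    for i in range(repeats):
        layers += [
            nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, stride=1, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
        ]
    return nn.Sequential(*layers)

class SonoNetSketch(nn.Module):
    def __init__(self, num_classes=14, width=32):  # width=32 loosely echoes "SonoNet-32"
        super().__init__()
        self.features = nn.Sequential(
            conv_block(1, width, 2), nn.MaxPool2d(2, 2),           # MP: 2x2, stride 2
            conv_block(width, 2 * width, 2), nn.MaxPool2d(2, 2),
            conv_block(2 * width, 4 * width, 3), nn.MaxPool2d(2, 2),
            conv_block(4 * width, 8 * width, 3), nn.MaxPool2d(2, 2),
            conv_block(8 * width, 8 * width, 3),
        )
        # Adaptation + classification via 1x1 convs keeps the output spatial,
        # so class score maps survive until the final global pooling.
        self.adaptation = nn.Sequential(
            nn.Conv2d(8 * width, 2 * width, 1),
            nn.BatchNorm2d(2 * width),
            nn.ReLU(),
            # Final 1x1 classification conv; left linear here so the global
            # average pool below yields class logits (an assumption).
            nn.Conv2d(2 * width, num_classes, 1),
        )

    def forward(self, x):
        maps = self.adaptation(self.features(x))  # spatial class score maps
        return maps.mean(dim=(2, 3))              # global average pooling -> logits
```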
Fig. 3.
Examples of saliency maps. Column (a) shows three different input frames, (b) shows the corresponding class score maps obtained in the forward pass of the network, (c) shows saliency maps obtained using the method by Springenberg et al., and (d) shows the saliency maps resulting from our proposed method. Some of the unwanted saliency artefacts are highlighted with arrows in (c).
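For context on column (c): guided backpropagation in the sense of Springenberg et al. modifies the ReLU backward pass so that only positive gradients propagate. Below is a generic PyTorch sketch of that baseline; it does not reproduce the paper's proposed modification shown in column (d), and it assumes the model uses non-inplace ReLUs.

```python
# Generic guided-backprop saliency sketch, not the paper's modified variant.
import torch
import torch.nn as nn

def guided_backprop_saliency(model, frame, target_class):
    model.eval()
    hooks = []

    def relu_hook(module, grad_input, grad_output):
        # Guided rule: in addition to the usual ReLU mask, pass back
        # only non-negative gradients.
        return (torch.clamp(grad_input[0], min=0.0),)

    for m in model.modules():
        if isinstance(m, nn.ReLU):
            hooks.append(m.register_full_backward_hook(relu_hook))

    x = frame.unsqueeze(0).clone().requires_grad_(True)
    model(x)[0, target_class].backward()  # backward pass from the class score
    saliency = x.grad.abs().squeeze(0)

    for h in hooks:
        h.remove()
    return saliency
```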
Fig. 4.
Examples of saliency map post-processing for two challenging views: (a) shows two input images, (b) shows the resulting confidence maps for those images, and (c) shows the resulting bounding boxes.
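One plausible way to implement the post-processing from (b) to (c) is to smooth the confidence map, threshold it relative to its maximum, and take the bounding box of the largest connected component. The Gaussian sigma and threshold fraction below are illustrative assumptions, not the paper's settings.

```python
# Hedged post-processing sketch: confidence map -> bounding box.
import numpy as np
from scipy import ndimage

def saliency_to_bbox(confidence, sigma=5.0, frac=0.5):
    """Smooth, threshold, and box the largest blob; returns (x0, y0, x1, y1)."""
    smooth = ndimage.gaussian_filter(confidence, sigma=sigma)
    mask = smooth >= frac * smooth.max()
    labels, n = ndimage.label(mask)
    if n == 0:
        return None
    # Pick the connected component with the largest area.
    sizes = ndimage.sum(mask, labels, index=range(1, n + 1))
    largest = 1 + int(np.argmax(sizes))
    ys, xs = np.where(labels == largest)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```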
Fig. 5.
Class confusion matrix for SonoNet-32.
Fig. 6.
Examples of video frames labelled as background but classified as one of three standard views. The first three columns were randomly sampled from the set of false positives and are in fact correct detections. The last column shows manually selected true failure cases.
Fig. 8.
Examples of weakly supervised localisation using the SonoNet-32. The first three columns for each view show correct bounding boxes, marked in green; the respective last column shows an example of an incorrect localisation, marked in red. The ground truth bounding boxes are shown in white.
