Comparative Study
Med Image Anal. 2018 Jul;47:127-139. doi: 10.1016/j.media.2018.04.004. Epub 2018 Apr 23.

VP-Nets: Efficient automatic localization of key brain structures in 3D fetal neurosonography

Ruobing Huang et al. Med Image Anal. 2018 Jul.

Abstract

Three-dimensional (3D) fetal neurosonography is used clinically to detect cerebral abnormalities and to assess growth in the developing brain. However, manual identification of key brain structures in 3D ultrasound images requires expertise and, even then, is tedious. Inspired by how sonographers view and interact with volumes during real-time clinical scanning, we propose an efficient automatic method to simultaneously localize multiple brain structures in 3D fetal neurosonography. The proposed View-based Projection Networks (VP-Nets) use three view-based Convolutional Neural Networks (CNNs) to simplify 3D localization by directly predicting 2D projections of the key structures onto three anatomical views. While designed for efficient use of data and GPU memory, the proposed VP-Nets allow full-resolution 3D prediction. We investigated parameters that influence the performance of VP-Nets, e.g. depth and number of feature channels. Moreover, by visualizing the trained VP-Nets we demonstrate that the model can pinpoint structures in 3D space, despite only 2D supervision being provided for a single stream during training. For comparison, we implemented two baseline solutions based on Random Forests and 3D U-Nets. In the reported experiments, VP-Nets consistently outperformed the other methods on localization. To test the importance of the loss function, two identical models were trained with binary cross-entropy and Dice coefficient loss, respectively. Our best VP-Net model achieved a prediction center deviation of 1.8 ± 1.4 mm, a size difference of 1.9 ± 1.5 mm, and a 3D Intersection over Union (IOU) of 63.2 ± 14.7% when compared to the ground truth. To make the whole pipeline intervention-free, we also implemented a skull-stripping tool using a 3D CNN, which achieves high segmentation accuracy. As a result, the proposed processing pipeline takes a raw ultrasound brain image as input and outputs a skull-stripped image with five detected key brain structures.
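The core projection idea in the abstract can be illustrated with a toy NumPy sketch (our own illustration, not the authors' code): a 3D binary mask is collapsed onto the three anatomical views by taking the maximum along each axis, yielding the 2D silhouettes that each VP-Net stream is trained to predict.

```python
import numpy as np

# Toy 3D binary mask: a hypothetical "structure" inside a 32^3 volume, axes (z, y, x).
volume = np.zeros((32, 32, 32), dtype=np.uint8)
volume[10:18, 12:20, 8:14] = 1  # hypothetical structure extent

# Orthogonal max-projections onto the three anatomical views:
axial    = volume.max(axis=0)  # collapse z -> (y, x) plane
coronal  = volume.max(axis=1)  # collapse y -> (z, x) plane
sagittal = volume.max(axis=2)  # collapse x -> (z, y) plane

# Each 2D silhouette is the supervision target for one VP-Net stream.
print(axial.shape, coronal.shape, sagittal.shape)  # (32, 32) (32, 32) (32, 32)
```

Predicting these 2D silhouettes instead of a dense 3D label map is what keeps the memory footprint low while still pinning down the structure in all three dimensions.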

Keywords: 3D Structure detection; Convolutional neural networks; Fetal brain volume; Ultrasound.


Figures

Figure 1
(a) Schematic of the detected structures in the right hemisphere of the brain in 3D. Each structure is plotted as follows: cavum septi pellucidi (CSP) in blue, thalami (Tha) in yellow, lateral ventricles (LV) in green, cerebellum (CE) in red and cisterna magna (CM) in purple. (b) 3D spatial configuration of the bounding boxes of the target brain structures. Each box is shown as a 3D cube whose color accords with that in sub-figure (a). Notice that there is geometric overlap between CE and CM, as well as between CSP and Tha. (c) Targeted structures in sagittal planes of a US volume. The structures are bounded by white dashed boxes.
Figure 2
Network design details: (a) An overview of the CNN-based image analysis pipeline. A raw ultrasound volume is preprocessed using our skull-stripping tool. The masked brain image and two of its transformed (90° rotated) volumes are passed, in parallel, to three independent VP-Nets. Each VP-Net outputs the projection masks of the five structures (15 projected masks in total for 5 structures). The minimal rectangle that encloses the prediction silhouette is detected for each structure in each view to generate the bounding-box parameters. The 3D bounding box for a structure can be reconstructed from two of the obtained rectangle masks via backward projection. A graphical demo of the VP-Net model is available at: https://youtu.be/KVxkbqWYWxc. (b) Each individual VP-Net shares the same spirit as the 2D U-Net (Ronneberger et al., 2015) for full-resolution prediction. In a VP-Net, the convolution filters in all layers only scan along the 2D X-Y plane, while always penetrating along the Z-axis to capture contextual information along that dimension.
Figure 3. View-based Projections
(a) Cerebellum (CE) in different views. Yellow boxes refer to the cross sections of the 3D human expert annotation. (b) Schematic of orthogonal projection. The cerebellum is highlighted in purple and bounded by a gray cube. Its orthogonal projections onto the axial plane (horizontal) and the sagittal plane (vertical) are shown respectively. The combination of the two rectangles defines the 3D bounding box of the object.
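The combination step in this figure can be sketched as follows (toy extents, our assumption; the paper detects oriented minimal rectangles, while this sketch uses axis-aligned ones): the axial rectangle fixes the x and y extents, the sagittal rectangle fixes z, and the shared axis can be read from either view.

```python
# Axis-aligned extents recovered from two views (hypothetical numbers):
axial_rect    = {"y": (12, 20), "x": (8, 14)}   # from the axial silhouette
sagittal_rect = {"z": (10, 18), "y": (12, 20)}  # from the sagittal silhouette

# The 3D box takes x from the axial view, z from the sagittal view,
# and y (the axis shared by both views) from either one.
box3d = {"x": axial_rect["x"],
         "y": axial_rect["y"],
         "z": sagittal_rect["z"]}
print(box3d)
```

Two views suffice because each view constrains two of the three axes and the views overlap on one shared axis.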
Figure 4
Rectangle detection notation. The prediction mask T is plotted in solid blue, and its contour is plotted as a dashed blue line. The centroid p(xc, yc), the orientation ϕ, the width 2dam and the height 2dbm are labelled respectively. The black box denotes the detected minimal rectangle.
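The quantities in this caption (centroid, orientation, half-widths) can be estimated from a binary prediction mask via image moments; the sketch below is our own axis-aligned toy, not the authors' implementation.

```python
import numpy as np

# Toy prediction mask T (axis-aligned for brevity; the paper's rectangles may be oriented).
T = np.zeros((64, 64))
T[20:40, 10:50] = 1.0

ys, xs = np.nonzero(T)
xc, yc = xs.mean(), ys.mean()            # centroid p(xc, yc)

# Orientation phi from second central image moments:
mu20 = ((xs - xc) ** 2).mean()
mu02 = ((ys - yc) ** 2).mean()
mu11 = ((xs - xc) * (ys - yc)).mean()
phi = 0.5 * np.arctan2(2 * mu11, mu20 - mu02)

# Axis-aligned half-widths of the enclosing rectangle (analogous to the labelled d's):
half_w = (xs.max() - xs.min() + 1) / 2
half_h = (ys.max() - ys.min() + 1) / 2
```

For this symmetric horizontal rectangle the mixed moment mu11 vanishes, so the recovered orientation is zero, as expected.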
Figure 5
Backward projection. The red and blue rectangle masks are detected from two different views. The 3D bounding box can be obtained as the intersection of their back-projections (dark purple).
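The back-project-and-intersect step can be mimicked in NumPy (toy extents, our assumption): each 2D rectangle mask is extruded along its own projection axis, and the two extrusions are intersected.

```python
import numpy as np

Z = Y = X = 32
# Rectangle masks detected in two views (hypothetical extents):
axial_mask = np.zeros((Y, X), dtype=bool)
axial_mask[12:20, 8:14] = True          # (y, x) rectangle from the axial view
sagittal_mask = np.zeros((Z, Y), dtype=bool)
sagittal_mask[10:18, 12:20] = True      # (z, y) rectangle from the sagittal view

# Extrude each 2D mask back along its projection axis, then intersect:
axial_extruded    = np.broadcast_to(axial_mask[None, :, :], (Z, Y, X))
sagittal_extruded = np.broadcast_to(sagittal_mask[:, :, None], (Z, Y, X))
box3d = axial_extruded & sagittal_extruded

zs, ys, xs = np.nonzero(box3d)
print(zs.min(), zs.max(), ys.min(), ys.max(), xs.min(), xs.max())
# -> 10 17 12 19 8 13: the cuboid z in [10,18), y in [12,20), x in [8,14)
```

The intersection of the two extruded prisms is exactly the dark-purple cuboid in the figure.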
Figure 6
Examples of predicted bounding boxes of the targeted structures in different anatomical views. Yellow boxes show the ground truth, and red boxes are predictions. First to last row: results from Random Forest, 3D U-Net, and VP-Nets. First to last column: results from the axial, coronal, and sagittal views. Note that there is overlap between CE and CM in all the views (as shown in (a) and (b)). Tha coincides with CE and CM in the coronal view as well. The results show that the predictions of VP-Nets coincide better with the ground truth than those of 3D U-Nets and Random Forest.
Figure 7
Mean 2D & 3D IOU curves of 3D U-Nets (yellow) and VP-Nets (other colors). (b) shows that 3D U-Nets perform similarly to the VP-Nets on the coronal view in 2D IOU. All VP-Nets achieve higher accuracy when the three views are combined (e). Comparing the different variants of VP-Nets, Model F (black) achieves the highest accuracy, showing that more feature channels and deeper layers improve network performance.
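As a reference for the reported IOU metric, a 3D Intersection over Union for axis-aligned boxes can be computed as below (a generic sketch with hypothetical boxes, not the paper's evaluation code).

```python
def iou_3d(a, b):
    """3D IoU of two axis-aligned boxes given as (zmin, ymin, xmin, zmax, ymax, xmax)."""
    # Overlap length per axis, clamped at zero when the boxes are disjoint:
    inter_dims = [max(0, min(a[i + 3], b[i + 3]) - max(a[i], b[i])) for i in range(3)]
    inter = inter_dims[0] * inter_dims[1] * inter_dims[2]
    vol = lambda box: (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])
    union = vol(a) + vol(b) - inter
    return inter / union if union else 0.0

gt   = (10, 12, 8, 18, 20, 14)   # hypothetical ground-truth box
pred = (11, 12, 8, 18, 21, 14)   # hypothetical prediction
print(round(iou_3d(gt, pred), 3))  # -> 0.789
```

Even a one-voxel shift plus a one-voxel size error drops the IoU noticeably, which is why mean IoU is a strict summary of localization quality.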
Figure 8
Visualization of a 3D saliency volume derived from CE detection in the axial view. The right-hand figure shows the points with the top 2% of gradient values in 3D. Red represents higher gradient values, indicating a higher influence on the prediction outcome. Sub-figures (a) and (b) show an axial and a sagittal US slice, respectively, with the 3D saliency points projected onto each view. The images and the original US are zoomed in for detailed visualization.
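The top-2% selection described in this caption can be reproduced with a simple percentile threshold (random stand-in data, our assumption; a real saliency volume would come from backpropagated gradients).

```python
import numpy as np

rng = np.random.default_rng(0)
saliency = rng.random((16, 16, 16))   # stand-in for a |gradient| volume

# Keep only voxels in the top 2% of gradient magnitude, as in the figure:
thresh = np.percentile(saliency, 98)
top_points = np.argwhere(saliency >= thresh)  # (z, y, x) coordinates to render

print(len(top_points), saliency.size)
```

These sparse coordinates are what get scatter-plotted in 3D and projected onto the 2D slices.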
Figure 9
Challenging case of CE in the test set. (a) shows a schematic of the anatomical configuration of the CE in the axial plane. The white eyeglass-shaped structure is the CE. (b) is the US slice of the discussed case. Half of the CE is occluded by acoustic shadows. (c) shows the ground-truth bounding box of the CE (overlaid in yellow). The confidence map predicted by VP-Nets F is shown in red in (d). In this case, the IoU of the raw prediction map and the ground truth is 68%.
Figure 10
Challenging case of CC in the test set. (a) is a schematic of the CC in the mid-sagittal plane. The CC is shown as the blue comma-shaped structure. The corresponding mid-sagittal slice of the US volume is given in (b). The ground truth is plotted as a transparent yellow rectangle in (c). Similarly, the predicted confidence map is shown as a red rectangle in (d).
Figure 11
Challenging cases of LV in the test set. (a) is a schematic of the LV in the axial plane. (b) and (d) show the corresponding axial images of the third and the fourth case, respectively. No ground truth is displayed for (b), as the human expert judged that the LV is not visible in this case. (e) shows the ground-truth label of the LV for the fourth case. The predicted confidence map is superimposed on the corresponding ultrasound image in (c) and (f).
