Review
Front Cardiovasc Med. 2020 Mar 5;7:25.
doi: 10.3389/fcvm.2020.00025. eCollection 2020.

Deep Learning for Cardiac Image Segmentation: A Review


Chen Chen et al. Front Cardiovasc Med. 2020.

Abstract

Deep learning has become the most widely used approach for cardiac image segmentation in recent years. In this paper, we provide a review of over 100 cardiac image segmentation papers using deep learning, which covers common imaging modalities including magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound, as well as major anatomical structures of interest (ventricles, atria, and vessels). In addition, a summary of publicly available cardiac image datasets and code repositories is included to provide a base for encouraging reproducible research. Finally, we discuss the challenges and limitations of current deep learning-based approaches (scarcity of labels, model generalizability across different domains, interpretability) and suggest potential directions for future research.

Keywords: CT; MRI; artificial intelligence; cardiac image analysis; cardiac image segmentation; deep learning; neural networks; ultrasound.

Figures

Figure 1
Overview of cardiac image segmentation tasks for different imaging modalities. For better understanding, we provide the anatomy of the heart on the left (image source: Wikimedia Commons, license: CC BY-SA 3.0). Of note, for simplicity, we list the tasks for which deep learning techniques have been applied, which will be discussed in section 3.
Figure 2
(A) Overview of numbers of papers published from 1st January 2016 to 1st August 2019 regarding deep learning-based methods for cardiac image segmentation reviewed in this work. (B) The increase of public data for cardiac image segmentation in the past 10 years. A list of publicly available datasets with detailed information is provided in Table 6. CT, computed tomography; MR, magnetic resonance.
Figure 3
(A) Generic architecture of convolutional neural networks (CNN). A CNN takes a cardiac MR image as input, learning hierarchical features through a stack of convolutions and pooling operations. These spatial feature maps are then flattened and reduced into a vector through fully connected layers. This vector can take many forms, depending on the specific task. It can be probabilities for a set of classes (image classification), coordinates of a bounding box (object localization), a predicted label for the center pixel of the input (patch-based segmentation), or a real value for regression tasks (e.g., left ventricular volume estimation). (B) Patch-based segmentation method based on a CNN classifier. The CNN takes a patch as input and outputs the probabilities for four classes, where the class with the highest score is the prediction for the center pixel (see the yellow cross) in this patch. By repeatedly forwarding patches taken at different locations into the CNN for classification, one can finally get a pixel-wise segmentation map for the whole image. LV, left ventricle cavity; RV, right ventricle cavity; BG, background; Myo, left ventricular myocardium. The blue number at the top indicates the number of channels of the feature maps. Here, each convolution kernel is a 3 × 3 kernel (stride = 1, padding = 1), which produces an output feature map with the same height and width as the input.
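The statement that a 3 × 3 kernel with stride 1 and padding 1 preserves height and width can be checked with a small sketch. The code below is an illustration only, not the networks reviewed in the paper: the function names `conv_output_size` and `conv2d_same` and the toy image are assumptions introduced here, and the loop computes a single-channel cross-correlation with zero padding, which is what deep learning frameworks usually call a convolution.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Output length along one spatial dimension of a convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def conv2d_same(image, kernel):
    """Naive single-channel convolution with stride 1 and zero padding k // 2.

    Hypothetical helper for illustration; `image` and `kernel` are lists of rows.
    """
    k = len(kernel)
    pad = k // 2
    height, width = len(image), len(image[0])
    # Zero-pad the image on every side so the output keeps the input size.
    padded = [[0.0] * (width + 2 * pad) for _ in range(height + 2 * pad)]
    for r in range(height):
        for c in range(width):
            padded[r + pad][c + pad] = image[r][c]
    out = []
    for r in range(height):
        row = []
        for c in range(width):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    acc += padded[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


# Toy 4 x 5 "image" and an identity kernel (both made up for this example).
image = [[float(r * 5 + c) for c in range(5)] for r in range(4)]
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
result = conv2d_same(image, identity)
```

With kernel 3, stride 1, and padding 1, `conv_output_size` returns the input size unchanged, while stride 2 with the same kernel and padding roughly halves it.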
Figure 4
(A) Architecture of a fully convolutional neural network (FCN). The FCN first takes the whole image as input, learns image features through the encoder, gradually recovers the spatial dimension by a series of upscaling layers (e.g., transposed convolution layers, unpooling layers) in the decoder, and then produces 4-class pixel-wise probabilistic maps to predict regions of the left ventricle cavity (blue region), the left ventricular myocardium (green region), the right ventricle cavity (red region), and background. The final segmentation map is obtained by assigning each pixel the class with the highest probability. One use case of this FCN-based cardiac segmentation can be found in Tran (24). (B) Architecture of a U-net. On the basis of FCN, U-net adds “skip connections” (gray arrows) to aggregate feature maps from coarse to fine through concatenation and convolution operations. For simplicity, we reduce the number of downsampling and upsampling blocks in the diagram. For detailed information, we refer readers to the original paper (49).
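The last step described in (A), turning per-class probability maps into a segmentation map, amounts to a per-pixel argmax. The following is a minimal sketch of that step only; the function name `segmentation_from_probabilities`, the class ordering in `CLASSES`, and the probability values are assumptions made for illustration and do not come from the paper.

```python
# Assumed label order for this example; the paper does not fix one.
CLASSES = ["BG", "LV", "Myo", "RV"]


def segmentation_from_probabilities(prob_maps):
    """Assign each pixel the index of its most probable class.

    prob_maps[k][r][c] is the predicted probability of class k at pixel (r, c).
    """
    n_classes = len(prob_maps)
    height, width = len(prob_maps[0]), len(prob_maps[0][0])
    seg = []
    for r in range(height):
        row = []
        for c in range(width):
            scores = [prob_maps[k][r][c] for k in range(n_classes)]
            row.append(scores.index(max(scores)))  # per-pixel argmax
        seg.append(row)
    return seg


# Made-up 1 x 2 probability maps, one map per class.
prob_maps = [
    [[0.7, 0.1]],  # BG
    [[0.1, 0.2]],  # LV
    [[0.1, 0.6]],  # Myo
    [[0.1, 0.1]],  # RV
]
seg = segmentation_from_probabilities(prob_maps)
labels = [[CLASSES[k] for k in row] for row in seg]
```

In this toy input the first pixel is labeled background and the second myocardium, because those classes hold the largest probability at each position.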
Figure 5
(A) Example of FCN with an RNN for cardiac image segmentation. The yellow block with a curved arrow represents an RNN module, which utilizes the knowledge learned from the past to make the current decision. In this example, the network is used to segment cardiac ventricles from a stack of 2D cardiac MR slices, which allows propagation of contextual information from adjacent slices for better inter-slice coherence (55). This type of RNN is also suitable for sequential data, such as cine MR images and ultrasound movies, to learn temporal coherence. (B) Unfolded schema of the RNN module for visualizing the inner process when the input is a sequence of three images. At each time step t, the RNN module receives an input i[t] and produces an output o[t], considering not only the input information but also the hidden state (“memory”) h[t − 1] from the previous time step t − 1.
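The unfolded recurrence in (B) can be sketched with a scalar toy model. This is a plain tanh recurrence chosen for brevity, not the specific RNN module used in (55); the names `rnn_step` and `run_rnn` and the weight values are hypothetical.

```python
import math


def rnn_step(i_t, h_prev, w_in=0.5, w_rec=0.8, w_out=1.0):
    """One time step: combine input i[t] with the previous hidden state h[t - 1].

    The weights are arbitrary illustrative values, not learned parameters.
    """
    h_t = math.tanh(w_in * i_t + w_rec * h_prev)  # new hidden state ("memory")
    o_t = w_out * h_t                             # output o[t]
    return o_t, h_t


def run_rnn(inputs, h0=0.0):
    """Unfold the recurrence over a sequence, carrying the hidden state forward."""
    h = h0
    outputs = []
    for i_t in inputs:
        o_t, h = rnn_step(i_t, h)
        outputs.append(o_t)
    return outputs


# Sequence of three inputs, mirroring the three-image example in the caption.
outputs = run_rnn([1.0, 0.0, 0.0])
```

Only the first input is non-zero here, yet the second and third outputs are also non-zero: the hidden state carries information from earlier steps forward, which is the behavior the caption describes as memory.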
Figure 6
A generic architecture of an autoencoder. An autoencoder employs an encoder-decoder structure, where the encoder maps the input data to a low-dimensional latent representation and the decoder interprets the code and reconstructs the input. The learned latent representation has been found effective for cardiac image segmentation (58, 59), cardiac shape modeling (60) and cardiac segmentation correction (61).
Figure 7
(A) Overview of GAN for image synthesis. (B) Overview of adversarial training for image segmentation.
Figure 8
(A) Naive version of the inception module (44). In this module, convolutional kernels with varying sizes are applied to the same input for multi-scale feature fusion. On the basis of the naive structure, a family of advanced inception modules with more complex structures has been developed (67, 68). (B) Schematic diagram of the attention module (69, 70). The attention module teaches the network to pay attention to important features (e.g., features relevant to anatomy) and ignore redundant features. (C) Schematic diagram of a residual unit (71). The yellow arrow represents a residual connection, which reuses the features from a previous layer. The numbers in the green and orange blocks denote the sizes of the corresponding convolutional or pooling kernels. Here, for simplicity, all diagrams have been reproduced based on the illustrations in the original papers.
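The residual connection in (C) can be summarized as adding the unit's input back onto the output of its layers, y = x + F(x). The sketch below shows only that addition on a flat feature vector; `residual_unit`, `relu`, and the stand-in transforms are assumptions for illustration, whereas the unit in (71) uses convolutional layers for F.

```python
def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


def residual_unit(x, transform):
    """Return x + F(x), where `transform` plays the role of the stacked layers F.

    Hypothetical helper: `transform` must return a list of the same length as x.
    """
    fx = transform(x)
    return [a + b for a, b in zip(x, fx)]


features = [1.0, -2.0]
# If F outputs zeros, the unit reduces to the identity mapping.
unchanged = residual_unit(features, lambda x: [0.0] * len(x))
# With a ReLU as a stand-in for F, the input is still reused in the sum.
combined = residual_unit(features, relu)
```

Because the input is always added back, the unit passes features from the previous layer through unchanged whenever F contributes nothing.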

References

    1. Petitjean C, Zuluaga MA, Bai W, Dacher JN, Grosgeorge D, Caudron J, et al. Right ventricle segmentation from cardiac MRI: a collation study. Med Image Anal. (2015) 19:187–202. doi: 10.1016/j.media.2014.10.004
    2. Peng P, Lekadir K, Gooya A, Shao L, Petersen SE, Frangi AF. A review of heart chamber segmentation for structural and functional analysis using cardiac magnetic resonance imaging. Magn Reson Mater Phys Biol Med. (2016) 29:155–95. doi: 10.1007/s10334-015-0521-4
    3. Tavakoli V, Amini AA. A survey of shaped-based registration and segmentation techniques for cardiac images. Comput Vis Image Understand. (2013) 117:966–89. doi: 10.1016/j.cviu.2012.11.017
    4. Lesage D, Angelini ED, Bloch I, Funka-Lea G. A review of 3D vessel lumen segmentation techniques: models, features and extraction schemes. Med Image Anal. (2009) 13:819–45. doi: 10.1016/j.media.2009.07.011
    5. Greenspan H, Van Ginneken B, Summers RM. Guest editorial deep learning in medical imaging: overview and future promise of an exciting new technique. IEEE Trans Med Imaging. (2016) 35:1153–9. doi: 10.1109/TMI.2016.2553401