Sensors (Basel). 2021 Feb 24;21(5):1570.
doi: 10.3390/s21051570.

Image Segmentation Using Encoder-Decoder with Deformable Convolutions

Andreea Gurita et al. Sensors (Basel).

Abstract

Image segmentation is an essential step in image analysis that assigns meaning to the pixels in an image. It is also a difficult task, because no single approach suits every instance of the problem and because real-life pictures can suffer from noise or object occlusion. This paper proposes an architecture for semantic segmentation using a convolutional neural network based on the Xception model, which was previously used for classification. Several experiments were carried out to find the best-performing configuration of the model (e.g., different input resolutions and network depths, and data augmentation techniques). Additionally, the network was improved by adding a deformable convolution module. The proposed architecture obtained a mean IoU of 76.8 on the Pascal VOC 2012 dataset and 58.1 on the Cityscapes dataset. It outperforms the SegNet and U-Net networks, both of which have considerably more parameters and a higher inference time.
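To make the idea of a deformable convolution concrete, the following is a minimal pure-Python sketch of how a single output value could be computed. It is not the authors' module: the function names `bilinear_sample` and `deformable_conv_at` are hypothetical, the kernel is fixed at 3 × 3, and the offsets are passed in by hand, whereas in a trained network they would normally be predicted by an additional convolutional layer.

```python
import math

def bilinear_sample(img, y, x):
    """Sample a 2D image (list of rows) at a fractional position (y, x).

    Positions outside the image contribute 0 (zero padding, an assumption).
    """
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    # Blend the four integer neighbours, weighted by their distance to (y, x).
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * img[yy][xx]
    return value

def deformable_conv_at(img, kernel, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution centred at (cy, cx).

    offsets[i][j] is a (dy, dx) displacement added to the regular grid
    position of kernel tap (i, j). Here the offsets are supplied directly
    for illustration only.
    """
    out = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            out += kernel[i][j] * bilinear_sample(
                img, cy + (i - 1) + dy, cx + (j - 1) + dx
            )
    return out

img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ones = [[1.0] * 3 for _ in range(3)]
zero_offsets = [[(0.0, 0.0)] * 3 for _ in range(3)]
shift_right = [[(0.0, 1.0)] * 3 for _ in range(3)]

regular = deformable_conv_at(img, ones, zero_offsets, 1, 1)
shifted = deformable_conv_at(img, ones, shift_right, 1, 1)
```

With all offsets set to zero, the operation reduces to an ordinary 3 × 3 convolution; non-zero offsets move each sampling point, which is what lets the receptive field adapt to object shape.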

Keywords: Xception model; convolutional neural network; deformable convolutions; image segmentation; mean intersection over union.
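The mean intersection over union (mean IoU) figures quoted in the abstract can be understood through a short sketch. The code below is an illustration under stated assumptions, not the evaluation script used in the paper: the helper names are hypothetical, images are given as flattened lists of class indices, label 255 is treated as a "void" label to skip, and classes that appear in neither the ground truth nor the prediction are left out of the average.

```python
def confusion_matrix(y_true, y_pred, num_classes, ignore_label=255):
    """Count pixels by (true class, predicted class).

    ignore_label marks pixels excluded from scoring (an assumption here).
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_label:
            continue
        cm[t][p] += 1
    return cm

def mean_iou(cm):
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                         # true c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp    # predicted c, true other
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0

# Toy example: six pixels, three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(confusion_matrix(y_true, y_pred, num_classes=3))
print(score)  # per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5
```

Benchmark results are usually reported as this value multiplied by 100 and accumulated over the whole evaluation set rather than per image.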

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1. The proposed architecture. The decoding branch is almost symmetric to the encoding branch, except that it reconstructs the values to the input image size using deconvolutions and upsampling layers.
Figure 2. The proposed architecture, with a resolution of 299 × 299 and a different number of modules for the middle flow (between 2 and 8).
Figure 3. Proposed module based on ASPP and Adaptive Context Encoding (ACE).
Figure 4. Architecture of the best proposed model.
Figure 5. Segmentation results per class on the Pascal VOC 2012 dataset [6].
Figure 6. Segmentation results per class on the Cityscapes dataset [7].
Figure 7. Segmentation results per class on the ADE20K dataset [8].

References

    1. Ozden M., Polat E. A color image segmentation approach for content based image retrieval. Pattern Recognit. 2007;40:1318–1325. doi: 10.1016/j.patcog.2006.08.013.
    2. Moeskops P., Viergever M.A., Mendrik A.M., Vries L.S., Benders M.J., Isgum I. Automatic Segmentation of MR Brain Images With a Convolutional Neural Network. IEEE Trans. Med. Imaging. 2016;35:1252–1261. doi: 10.1109/TMI.2016.2548501.
    3. Image Segmentation. Available online: https://www.bioss.ac.uk/people/chris/ch4.pdf (accessed on 1 February 2021).
    4. Song Y., Yan H. Image Segmentation Techniques Overview; Proceedings of the Asia Modelling Symposium (AMS); Kota Kinabalu, Malaysia. 4–6 December 2017; pp. 103–107.
    5. Guo Y., Liu Y., Georgiou T., Lew M.S. A review of semantic segmentation using deep neural networks. Int. J. Multimed. Inf. Retr. 2018;7:87–93. doi: 10.1007/s13735-017-0141-z.
