U-NetPlus: A Modified Encoder-Decoder U-Net Architecture for Semantic and Instance Segmentation of Surgical Instruments from Laparoscopic Images
- PMID: 31947497
- PMCID: PMC7372295
- DOI: 10.1109/EMBC.2019.8856791
Abstract
With the advent of robot-assisted surgery, there has been a paradigm shift in medical technology for minimally invasive surgery. However, tracking the position of surgical instruments in a surgical scene is very challenging, and accurate detection and identification of surgical tools is paramount. Deep learning-based semantic segmentation of frames from surgery videos has the potential to facilitate this task. In this work, we modify the U-Net architecture by introducing a pre-trained encoder and redesigning the decoder, replacing the transposed convolution operation with an upsampling operation based on nearest-neighbor (NN) interpolation. To further improve performance, we also employ a very fast and flexible data augmentation technique. We trained the framework on 8 × 225 frame sequences of robotic surgical videos available through the MICCAI 2017 EndoVis Challenge dataset and tested it on 8 × 75 frame and 2 × 300 frame videos. Using our U-NetPlus architecture, we report a 90.20% DICE for binary segmentation, 76.26% DICE for instrument part segmentation, and 46.07% for instrument type (i.e., all instruments) segmentation, outperforming the results of previous techniques implemented and tested on these data.
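To make two ideas from the abstract concrete, the following is a minimal, framework-free sketch of nearest-neighbor upsampling (the operation used in place of transposed convolution in the decoder) and of the DICE overlap score used for evaluation. It is not the authors' implementation: the function names `upsample_nearest` and `dice_score`, the scale factor of 2, the smoothing term `eps`, and the toy masks are all assumptions made for illustration. A real decoder would typically follow the upsampling step with learned convolutions in a deep learning framework.

```python
def upsample_nearest(image, scale=2):
    """Upsample a 2D grid (list of rows) by an integer factor.

    Nearest-neighbor interpolation copies each input pixel into a
    scale x scale block of the output; no weights are learned.
    The default scale of 2 is an assumption, not taken from the abstract.
    """
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    out = []
    for row in image:
        # Repeat every value horizontally...
        wide = [value for value in row for _ in range(scale)]
        # ...then repeat the widened row vertically.
        out.extend(list(wide) for _ in range(scale))
    return out


def dice_score(pred, target, eps=1e-7):
    """DICE coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks.

    `eps` is a hypothetical smoothing term that avoids division by zero
    when both masks are empty.
    """
    inter = pred_sum = target_sum = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            inter += p * t
            pred_sum += p
            target_sum += t
    return (2 * inter + eps) / (pred_sum + target_sum + eps)


# Toy example: a 2x2 coarse mask upsampled to 4x4, then compared with
# a made-up ground-truth mask that differs in a single pixel.
low_res = [[0, 1],
           [1, 0]]
up = upsample_nearest(low_res, scale=2)

truth = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [1, 1, 0, 0],
         [1, 1, 1, 0]]
score = dice_score(up, truth)
print(up)
print(round(score, 4))
```

In this toy case the upsampled mask and the ground truth share 8 foreground pixels out of 8 and 9 respectively, so the score is 16/17. Because nearest-neighbor upsampling has no learnable parameters, any refinement of the enlarged feature map has to come from the convolutions that follow it.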