Self-supervised Monocular Depth Estimation with 3D Displacement Module for Laparoscopic Images

Chi Xu et al. IEEE Trans Med Robot Bionics. 2022 May;4(2):331-334. doi: 10.1109/TMRB.2022.3170206.

Abstract

We present a novel self-supervised training framework with a 3D displacement (3DD) module for accurately estimating per-pixel depth maps from single laparoscopic images. Several self-supervised monocular depth estimation models have recently achieved good results on the KITTI dataset under the assumption that the camera moves while the objects are stationary; however, this assumption is often reversed in the surgical setting, where the laparoscope is stationary and the surgical instruments and tissues move. Therefore, a 3DD module is proposed to establish the relation between frames instead of estimating ego-motion. In the 3DD module, a convolutional neural network (CNN) analyses the source and target frames to predict the 3D displacement of the 3D point cloud from the target frame to the source frame in camera coordinates. Since it is difficult to constrain the depth displacement from two 2D images, a novel depth consistency module is proposed that maintains consistency between the displacement-updated depth and the model-estimated depth, constraining the 3D displacement effectively. Our proposed method achieves remarkable performance for monocular depth estimation on the Hamlyn surgical dataset and acquired ground truth depth maps, outperforming the monodepth, monodepth2 and packnet models.

Keywords: 3D displacement; CNN; deep learning; monocular depth estimation; self-supervised learning.
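The abstract describes the key mechanism: a CNN in the 3DD module predicts a per-pixel 3D displacement that moves the target frame's back-projected point cloud toward the source frame, and a depth consistency term ties the z-component of the displaced cloud to the depth the model itself estimates. The following Python/PyTorch sketch only illustrates that idea and is not the authors' implementation; the names DisplacementNet, backproject and depth_consistency_loss, the layer sizes, and the tensor shapes are assumptions.

import torch
import torch.nn as nn

class DisplacementNet(nn.Module):
    # Hypothetical CNN: takes the target and source RGB frames (3 channels each)
    # and predicts a per-pixel 3D displacement (dx, dy, dz) in camera coordinates.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, target_rgb, source_rgb):
        return self.net(torch.cat([target_rgb, source_rgb], dim=1))  # (B, 3, H, W)

def backproject(depth, inv_K, pix_homo):
    # Lift per-pixel depth to a 3D point cloud in camera coordinates.
    # depth: (B, 1, H, W); inv_K: (B, 3, 3); pix_homo: (B, 3, H*W) homogeneous pixel grid.
    b = depth.shape[0]
    rays = inv_K @ pix_homo             # per-pixel viewing rays
    return depth.view(b, 1, -1) * rays  # (B, 3, H*W) point cloud

def depth_consistency_loss(displaced_points, est_source_depth):
    # Penalise disagreement between the z-component of the displacement-updated
    # point cloud and the depth the model itself estimates for the source frame.
    z_displaced = displaced_points[:, 2].view_as(est_source_depth)
    return torch.abs(z_displaced - est_source_depth).mean()

# Usage sketch (shapes only, no training loop):
#   disp3d   = DisplacementNet()(target_rgb, source_rgb)    # (B, 3, H, W)
#   points_t = backproject(depth_target, inv_K, pix_homo)   # (B, 3, H*W)
#   points_s = points_t + disp3d.flatten(2)                 # displaced toward the source frame
#   loss_d   = depth_consistency_loss(points_s, depth_source)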


Figures

Fig. 1. Framework architecture. The ResNet-18 [15] is pre-trained. The dark blue arrow indicates bilinear interpolation from the multi-scale outputs to the original-scale outputs. The colored lines indicate the correspondence between output data and loss functions (red for l_ap, blue for l_d, green for l_s).
Fig. 2. The 3DD module architecture. The orange and purple lines represent the inputs and outputs, respectively.
Fig. 3. Qualitative comparison between our method, packnet [12], monodepth2 [11], and monodepth [13]. The first column contains example test images; the other columns show the corresponding disparity maps.
Fig. 4. The acquired ground truth depth maps, obtained via a da Vinci (Intuitive Inc.) stereo laparoscope and a projected Gray-code structured light pattern [20].
Fig. 5. The effect of view-field masking is shown in red boxes.

References

    1. Zhang K. Minimally invasive surgery. Endoscopy. 2002.
    2. Zhang V, Melis M, Amato B, Bianco T, Rocca A, Amato M, Quarto G, Benassai G. Minimally invasive radioguided parathyroid surgery: A literature review. IJS. 2016.
    3. Westebring-van der Putten EP, Goossens RH, Jakimowicz JJ, Dankelman J. Haptics in minimally invasive surgery–a review. Minimally Invasive Therapy & Allied Technologies. 2008.
    4. Zhang L, Li X, Yang S, Ding S, Jolfaei A, Zheng X. Unsupervised learning-based continuous depth and motion estimation with monocular endoscopy for virtual reality minimally invasive surgery. TII. 2020.
    5. Zhang S, Sinha A, Reiter A, Ishii M, Gallia GL, Taylor RH, Hager GD. Evaluation and stability analysis of video-based navigation system for functional endoscopic sinus surgery on in vivo clinical data. TMI. 2018.
