Med Image Anal. 2023 Feb;84:102709.
doi: 10.1016/j.media.2022.102709. Epub 2022 Dec 14.

Robust endoscopic image mosaicking via fusion of multimodal estimation

Liang Li et al. Med Image Anal. 2023 Feb.

Abstract

We propose an endoscopic image mosaicking algorithm that is robust to changes in lighting conditions, specular reflections, and featureless scenes. These conditions are especially common in minimally invasive surgery, where the light source moves with the camera to dynamically illuminate close-range scenes. This makes it difficult for a single image registration method to robustly track camera motion and generate consistent mosaics of the expanded surgical scene across heterogeneous environments. Instead of relying on one specialised feature extractor or image registration method, we propose to fuse different image registration algorithms according to their uncertainties, formulating the problem as affine pose graph optimisation. This allows landmarks, dense intensity registration, and learning-based approaches to be combined in a single framework. To demonstrate our approach, we consider deep learning-based optical flow, hand-crafted features, and intensity-based registration; however, the framework is general and could take as input other sources of motion estimation, including other sensor modalities. We validate our approach on three datasets with very different characteristics to highlight its generalisability and to demonstrate the advantages of the proposed fusion framework. While each individual registration algorithm eventually fails drastically on certain surgical scenes, the fusion approach flexibly determines which algorithms to use, and in which proportion, to obtain consistent mosaics more robustly.
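The idea of weighting each registration source by its uncertainty can be illustrated with a much simpler stand-in than the affine pose graph optimisation described above: an inverse-variance weighted mean. The sketch below is not the authors' method. It assumes, purely for illustration, that each source reports a 6-parameter affine vector `[a11, a12, tx, a21, a22, ty]` and a single scalar variance, and the function name `fuse_estimates` is hypothetical.

```python
from typing import List, Sequence, Tuple

Estimate = Tuple[Sequence[float], float]  # (parameter vector, scalar variance)


def fuse_estimates(estimates: Sequence[Estimate]) -> Tuple[List[float], float]:
    """Inverse-variance weighted mean of several parameter vectors.

    Assumption: every source gives the same number of parameters and a
    strictly positive scalar variance. Linear averaging of affine
    parameters is a simplification, not optimisation on a Lie group.
    """
    if not estimates:
        raise ValueError("at least one estimate is required")
    dim = len(estimates[0][0])
    for params, variance in estimates:
        if len(params) != dim:
            raise ValueError("all parameter vectors must have the same length")
        if variance <= 0.0:
            raise ValueError("variances must be strictly positive")

    # A source with low variance (high confidence) receives a large weight.
    weights = [1.0 / variance for _, variance in estimates]
    total = sum(weights)
    fused = [
        sum(w * params[i] for w, (params, _) in zip(weights, estimates)) / total
        for i in range(dim)
    ]
    # Variance of the fused estimate under independent Gaussian errors.
    return fused, 1.0 / total


# Hypothetical numbers: a feature-based source that is unsure about the
# x-translation, and an optical-flow source that is more confident.
feature_based = ([1.0, 0.0, 10.0, 0.0, 1.0, 5.0], 4.0)
flow_based = ([1.0, 0.0, 12.0, 0.0, 1.0, 5.0], 1.0)
fused_params, fused_variance = fuse_estimates([feature_based, flow_based])
```

With these numbers the fused x-translation lies closer to the more confident source. A source that fails on a scene would, under this scheme, need to report a large variance so that its weight becomes negligible.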

Keywords: Endoscopic image mosaicking; Image mosaicking; Medical image processing; Optical flow; Pose graph optimisation.

Conflict of interest statement

Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Graphical abstract
Fig. 1
Diagram of the proposed method. Three component homography estimation algorithms are used: SIFT-based, direct registration-based, and optical flow-based. The pose graph is constructed from the three estimation sources, each with its own uncertainty. The optimal state is obtained by optimising the cost function on the affine Lie group. Finally, the panorama is generated from the optimal homography matrices.
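The last step of the caption, generating a panorama from homography matrices, relies on expressing every frame in a common reference frame. A minimal sketch of that composition step follows. It assumes a convention not stated on this page, namely that `pairwise[k]` maps coordinates of frame k+1 into frame k, and the helper names are hypothetical. The optimisation itself is not shown.

```python
from typing import List

Matrix = List[List[float]]  # 3x3 homography stored row by row


def matmul3(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 3x3 matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def normalise(h: Matrix) -> Matrix:
    """Scale a homography so its bottom-right entry is 1 (when non-zero)."""
    scale = h[2][2]
    if scale == 0.0:
        return h
    return [[value / scale for value in row] for row in h]


def chain_to_reference(pairwise: List[Matrix]) -> List[Matrix]:
    """Compose frame-to-frame homographies into frame-to-reference ones.

    Assumed convention: pairwise[k] maps frame k+1 into frame k, so the
    returned list entry k maps frame k into frame 0.
    """
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    to_reference = [identity]
    for h in pairwise:
        to_reference.append(normalise(matmul3(to_reference[-1], h)))
    return to_reference


# Two pure translations: frame 1 is shifted by (2, 0) relative to frame 0,
# and frame 2 by (3, 1) relative to frame 1.
shift_a = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shift_b = [[1.0, 0.0, 3.0], [0.0, 1.0, 1.0], [0.0, 0.0, 1.0]]
global_h = chain_to_reference([shift_a, shift_b])
```

Because errors accumulate along such a chain, any drift in the pairwise estimates propagates to all later frames, which is one motivation for optimising the transforms jointly.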
Fig. 2
An example of optical flow prediction and correspondence establishment. (a) and (b) show the two input images, and (c) shows the flow field predicted by FlowNet2.0, with the colour coding scheme shown in (d). The correspondences can be established using Eq. (3). In theory, the correspondences are very dense, since they can be computed for most pixels except those close to the image border. Only a small portion of the correspondences is shown in (e) for clearer visualisation.
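Eq. (3) is not reproduced on this page, so the sketch below assumes the standard relation between a dense flow field and point correspondences: a pixel (x, y) in the first image matches (x + u, y + v) in the second, where (u, v) is the flow at that pixel. The function name and the nested-list flow layout are hypothetical choices for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Flow = List[List[Tuple[float, float]]]  # flow[y][x] = (u, v) in pixels


def correspondences_from_flow(flow: Flow, stride: int = 1) -> List[Tuple[Point, Point]]:
    """Turn a dense flow field into point correspondences.

    Assumption: the match of pixel (x, y) is (x + u, y + v). Matches that
    land outside the image are dropped, which mostly affects pixels near
    the border. A stride larger than 1 subsamples the grid.
    """
    if stride < 1:
        raise ValueError("stride must be at least 1")
    height = len(flow)
    width = len(flow[0]) if height else 0
    matches: List[Tuple[Point, Point]] = []
    for y in range(0, height, stride):
        for x in range(0, width, stride):
            u, v = flow[y][x]
            x2, y2 = x + u, y + v
            inside = 0.0 <= x2 <= width - 1 and 0.0 <= y2 <= height - 1
            if inside:
                matches.append(((float(x), float(y)), (x2, y2)))
    return matches


# A 3x3 field where every pixel moves one pixel to the right: the right-most
# column maps outside the image and is discarded.
uniform_flow = [[(1.0, 0.0) for _ in range(3)] for _ in range(3)]
dense_matches = correspondences_from_flow(uniform_flow)
sparse_matches = correspondences_from_flow(uniform_flow, stride=2)
```

Subsampling with `stride` mirrors the caption's remark that only part of the dense correspondence set needs to be displayed, and it also keeps any later model fitting cheap.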
Fig. 3
An illustration of the pose graph constructed using optical flow, SIFT, direct registration, and loop closure detection. Nodes are shown as blue triangles, and the different types of edges are shown as lines of different colours.
Fig. 4
Examples of mosaics obtained directly from the robot kinematics, extracted from seq. 1 (a) and seq. 5 (b) of the SCARED dataset. The kinematics are not accurate enough to generate mosaics.
Fig. 5
Results on the SCARED dataset. Each row shows the mosaicking results for one of the five sequences. The columns show, from first to fourth, SIFT, direct registration, optical flow, and fusion-based mosaicking. The problematic parts of the panoramas are marked with blue, orange, and green rectangles in the first, second, and third columns respectively. The fusion-based mosaicking corrects them and combines the advantages of the component methods to give high-quality panoramas.
Fig. 6
Results on the fetoscopy dataset. Each row shows the mosaicking results for one of the six sequences. The columns show, from first to fourth, SIFT, direct registration, optical flow, and fusion-based mosaicking. The SIFT-based method fails on this dataset because of the texture-less background and the difficulty of extracting enough features. The fusion-based method fuses the results of direct registration-based and optical flow-based homography estimation, combining the advantages of both methods to generate better panoramas.
Fig. 7
Results on the human cadaver dataset. Each row shows the mosaicking results for one of the five sequences. The columns show, from first to fourth, SIFT, direct registration, optical flow, and fusion-based mosaicking. For the first to the fourth sequence, only optical flow works among the three component methods, and the fusion result is the same as that of optical flow mosaicking. For the fifth sequence, the fusion-based method fuses the results of SIFT-based and optical flow-based homography estimation using the affine pose graph to yield a more consistent panorama.
Fig. 8
Mosaics generated by simple mean fusion of the SIFT-based, direct registration-based, and the optical flow-based estimation.
Fig. 9
SSIM between overlapping registered frames at frame distances from 1 (consecutive) to 5. Each boxplot shows the SSIM of all frame pairs in a video at the specified distance. Lower values indicate poorer performance.
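For readers unfamiliar with the metric, the sketch below computes a simplified, single-window SSIM over the pixels two registered frames have in common. It uses the standard SSIM formula with the usual constants (K1 = 0.01, K2 = 0.03), but it is an assumption that this matches the evaluation shown in the figure: SSIM is commonly computed over local windows and averaged, and the page does not say which variant was used. The function name is hypothetical.

```python
from typing import Sequence


def global_ssim(a: Sequence[float], b: Sequence[float], dynamic_range: float = 255.0) -> float:
    """Single-window SSIM of two equally sized grayscale pixel lists.

    Assumption: `a` and `b` hold only the pixels of the overlapping
    region, in the same order, after one frame has been warped onto the
    other. Windowed SSIM would instead average this score over patches.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    var_a = sum((p - mean_a) ** 2 for p in a) / n
    var_b = sum((q - mean_b) ** 2 for q in b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / n

    # Small constants that stabilise the ratio when means or variances are tiny.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    luminance = (2.0 * mean_a * mean_b + c1) / (mean_a ** 2 + mean_b ** 2 + c1)
    structure = (2.0 * cov + c2) / (var_a + var_b + c2)
    return luminance * structure


# Identical overlaps score 1; an inverted pattern scores below zero.
patch = [0.0, 255.0, 0.0, 255.0]
same_score = global_ssim(patch, patch)
inverted_score = global_ssim(patch, [255.0, 0.0, 255.0, 0.0])
```

Scoring frame pairs at increasing frame distance, as in the figure, probes accumulated drift: a registration that is accurate only between consecutive frames tends to lose similarity as the distance grows.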
Fig. 10
A comparison of mosaics generated by fusion with and without loop closure on sequence 2 of the fetoscopy dataset.

